Record Reformatter
Do you need to restructure records?
Record Reformatter is ideal for moving
data between different systems, or to produce simple reports
based on complex file structures. The restructuring is often
used to expand packed numeric fields, such as COMP-3, or binary and
floating point numbers. It can also handle floating point numbers from
a VAX system, and it will read IBM type records on your PC.
Other applications include extracting
fields from a data base for indexing, or summary applications.
It can also be used to do an intelligent ASCII/EBCDIC conversion
on data where there is a mixture of text and binary fields.
The Record Reformatter has many built-in tools
to assist the user in analysing unknown records and
field structures. These include calculating record lengths by
looking for patterns in the data, and showing data in several
formats, allowing the user to select a sensible conversion.
Where Cobol data descriptions are available
in file form, they can be read and the record structure determined
automatically. In a similar way, AS400 savelib tapes may be
read with MMPC to produce a Record Reformatter file description.
The date conversion routines, and the
general restructuring of records may assist in converting data
files in applications requiring updates due to the year 2000
(Y2K) problem.
The Record Reformatter may be used as
a stand alone tool, or in conjunction with InterMedia for Windows,
or MMPC, and as such provides a very powerful method of
handling many IBM type tapes and records.
About Records
Before details of how to use the Record
Reformatter are explained, a few notes on record structure and
definitions are required. As with all applications there will
be exceptions that cannot be handled, but in practice these
will not be very common.
Records can be fixed length or variable length.
Fixed length records
A fixed length record always has the
same length, but a file may be made up of many different types
of record, which may each have different lengths.
The Record Reformatter will try to determine
the record length automatically, even for records without carriage
returns. The user can override any automatic record length
determination.
A record type is identified by a unique code in
a fixed location, irrespective of the record length. For instance,
the first 2 characters may be a two digit number which determines
the type of record. Up to 10 different types of record may be
defined, each with different record lengths and field structures.
Fields have fixed lengths and locations
within a record. Different record types will have different
fields and field positions.
Before any work may be performed on records,
it is therefore essential to define the following points
Types of input records
- Code identifier position and length
- Each record length
- Field structure for each record type
- Length of output record it translates to
Output records are defined by
- Length
- Fields
- Types of conversion between input and output
- Value of filler for unmapped fields
All these values are entered via the
Record Reformatter Editor. In analysis mode, field types can
be determined automatically. Due to the nature of text based
records, this initial analysis will not always be correct, so
the first approximation can be fully edited later.
Alternatively, the structure may be entered manually, or from
a Cobol data definition file.
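The definition points listed above can be pictured as a small lookup table. The following is a minimal Python sketch for illustration only, not the Record Reformatter's own file format. The record codes `01` and `02`, the lengths, the field names, and the names `RECORD_TYPES` and `split_fixed` are all hypothetical.

```python
# Hypothetical layout: the first 2 bytes of every record hold its type code.
CODE_POS, CODE_LEN = 0, 2

# type code -> (record length, [(field name, offset, length), ...])
RECORD_TYPES = {
    b"01": (12, [("code", 0, 2), ("name", 2, 10)]),
    b"02": (8, [("code", 0, 2), ("amount", 2, 6)]),
}


def split_fixed(data):
    """Walk a byte string of back-to-back fixed length records."""
    records = []
    pos = 0
    while pos < len(data):
        code = data[pos + CODE_POS:pos + CODE_POS + CODE_LEN]
        length, fields = RECORD_TYPES[code]  # KeyError on an unknown code
        record = data[pos:pos + length]
        records.append({name: record[off:off + n] for name, off, n in fields})
        pos += length
    return records
```

Because the type code sits at the same position in every record, the length of the next record is known as soon as its code has been read, even though there are no carriage returns.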
Variable Length Records
With fixed length records, each field
and record has a pre-determined length. With variable length
records, fields and records are marked by end characters.
Typically, fields are separated by a comma, and
records by a CR. Thus two records could look as below
field1,field2,A longer field,last field(CR)
fa,fb,fc,final field in this record(CR)
Thus to define such a file structure
it is necessary to assign values for field and record delimiters.
With the Record Reformatter, both of these delimiters may be
one or two characters long.
With variable length fields, only a single
record structure can be handled.
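As an illustration of delimiter based splitting, here is a minimal Python sketch. The function name `split_delimited` is hypothetical, and the choice to skip empty records is an assumption, not documented behaviour of the tool. Both delimiters are passed as byte strings, so one or two character delimiters work the same way.

```python
def split_delimited(data, field_delim=b",", record_delim=b"\r"):
    """Split delimited data into records, each a list of fields."""
    records = []
    for raw in data.split(record_delim):
        if raw:  # skip the empty piece after the final record delimiter
            records.append(raw.split(field_delim))
    return records


sample = (b"field1,field2,A longer field,last field\r"
          b"fa,fb,fc,final field in this record\r")
for fields in split_delimited(sample):
    print(fields)
```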
Automatic Analysis Mode
A very powerful feature of RR32 is the
ability to analyse records and create an outline input field
definition automatically. Although this is not a complete substitute
for documentation describing a record, it can be extremely useful
in analysing an otherwise unknown record.
Analyse will try to determine field
breaks, as well as field types. This includes text (ASCII or
EBCDIC) and packed fields. The automatic field breaks may be
moved, added or deleted. The generated routine may then be edited
in any way required. This can include adding, deleting,
concatenating and sorting field definitions.
Output creation and testing
Once an input record has been defined,
a typical application is to convert it into, say, a quote-comma-quote
delimited record. The output can be automatically generated
based on the input field locations and descriptions. Several
typical output types may be created.
The output may then be edited as required
so that lengths can be controlled or modified. In the same way
fixed data fields may be added.
Specification of Features
| Maximum number of record types | 20 |
| Maximum number of field definitions | 3000 |
| Maximum file length | Only limited by disk space |
| Record Analysis | First 50 records |
| Setup wizard | Yes |
| Win95/98 | Yes |
| Win NT 4.0 | Yes |
| Win 3.x | No |
Field Conversions
Each field may be converted by the following
commands. The position of each output field is entirely dependent
on the user, and fields may be omitted or included multiple
times as required.
Copy
This is probably the most commonly used conversion
rule. It simply copies the input field to the
output field. If the input and output fields are different lengths,
the field will either be truncated at the end or padded with
spaces.
Copy Reverse
This will copy the field, as described
above, and then reverse it. Thus a field such as
"Hello 1234"
will become
"4321 olleH"
The reversing works on the final length
of the output field, so if padding was required, the padding
would end up at the start of the field.
For fields of 1, 2 or 4 bytes in length,
this operation is identical to Swap 8, Swap 16 and Swap 32.
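The interaction between padding and reversal can be shown with a short Python sketch. The function names are hypothetical. The field is copied to its final output length first and only then reversed, so any padding moves to the start.

```python
def copy_field(data, out_len, pad=b" "):
    """Copy: truncate the end of the field, or pad it with spaces."""
    return data[:out_len].ljust(out_len, pad)


def copy_reverse(data, out_len):
    """Copy Reverse: copy to the output length first, then reverse."""
    return copy_field(data, out_len)[::-1]


print(copy_reverse(b"Hello 1234", 10))  # b'4321 olleH'
print(copy_reverse(b"abc", 5))          # b'  cba' (padding ends up first)
```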
Copy Pascal
This will copy a Pascal text string.
A Pascal string starts with a byte giving the length, followed
by the string. The length byte is stripped when copying.
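A minimal Python sketch of reading a Pascal string follows. The name `copy_pascal` is hypothetical. Any bytes in the field beyond the stated length are ignored.

```python
def copy_pascal(field):
    """Return the text of a Pascal string, dropping the length byte."""
    if not field:
        return b""
    length = field[0]  # first byte gives the string length
    return field[1:1 + length]


print(copy_pascal(b"\x05Hello?????"))  # b'Hello'
```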
ASCII-EBCDIC
As in Copy, but converts an ASCII input
field to EBCDIC.
EBCDIC-ASCII
As in Copy, but converts an EBCDIC field
to ASCII.
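The two text conversions can be sketched with Python's built-in codecs. EBCDIC exists in many code pages. This sketch assumes code page 037 (`cp037`), which may not be the table the Record Reformatter uses, and the function names are hypothetical.

```python
def ascii_to_ebcdic(data, codepage="cp037"):
    """Convert ASCII bytes to EBCDIC bytes (code page is an assumption)."""
    return data.decode("ascii").encode(codepage)


def ebcdic_to_ascii(data, codepage="cp037"):
    """Convert EBCDIC bytes to ASCII; raises on characters outside ASCII."""
    return data.decode(codepage).encode("ascii")


print(ascii_to_ebcdic(b"Hello").hex())  # c885939396
```

Conversions like these are only safe on text fields. Applying them to a packed or binary field would corrupt it, which is why a field by field conversion is needed for mixed records.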
FILL ASCII
This fills the output field with the
ASCII string defined in the parameters. Any codes in <>
are treated as hex values. An example string is 12<09>Hello.
It may typically be used for inserting tags between fields,
or even a ',' to make a record ',' delimited.
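The `<>` notation can be expanded as in the Python sketch below. It assumes each `<>` holds exactly one two digit hex byte, which the text does not state explicitly, and the name `expand_fill` is hypothetical.

```python
import re


def expand_fill(spec):
    """Expand a FILL ASCII style string, treating <hh> as one hex byte."""
    return re.sub(
        rb"<([0-9A-Fa-f]{2})>",
        lambda m: bytes([int(m.group(1), 16)]),
        spec.encode("ascii"),
    )


print(expand_fill("12<09>Hello"))  # b'12\tHello'
```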
FILL HEX
This is the same as Fill ASCII, but allows
the user to insert non-printing characters.
For example, to insert a CRLF, the output
string
0D0A
would be used.
It may also be used to insert EBCDIC
characters.
ASCII-PACKED
This converts an ASCII number to a packed
field. A packed field uses one nibble (4 bits) to represent
each digit. Thus the number '1234', which in ASCII is hex 31 32 33
34, would be converted to 01 23 4C, where the final C represents
+. A final D would represent -, and F unsigned. This method of storing
numbers effectively compresses the space required by a factor
of 2, and is common within many IBM based record structures.
It is also known as IBM COMP-3.
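The packing rule can be sketched in Python as follows. The function name `ascii_to_packed` and its `signed` parameter are hypothetical. A leading zero nibble is added where needed so the result is a whole number of bytes.

```python
def ascii_to_packed(text, signed=True):
    """Pack an ASCII number such as '1234' or '-1234' into COMP-3 bytes."""
    negative = text.startswith("-")
    digits = text.lstrip("+-")
    if not digits.isdigit():
        raise ValueError("not a number: %r" % text)
    # Sign nibble: C for +, D for -, F for unsigned
    sign = ("D" if negative else "C") if signed else "F"
    nibbles = digits + sign
    if len(nibbles) % 2:  # pad to a whole number of bytes
        nibbles = "0" + nibbles
    return bytes.fromhex(nibbles)


print(ascii_to_packed("1234").hex())  # 01234c
```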
EBCDIC-PACKED
This converts an EBCDIC number to a packed
field. (See ASCII-PACKED)
Packed to ASCII
Packed to EBCDIC
Packed fields are a very common occurrence
in many (IBM) records. The numbers may be signed, as above,
or unsigned, in which case a series of hex characters 12 34
56 would represent the decimal number 123456.
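The reverse direction can be sketched in Python as below. Only the C, D and F sign nibbles described in this document are recognised. Stripping leading zeros from the output is a choice made for this sketch, and the name `packed_to_ascii` is hypothetical.

```python
def packed_to_ascii(field):
    """Unpack COMP-3 bytes into an ASCII number string."""
    nibbles = field.hex().upper()
    sign = ""
    if nibbles and nibbles[-1] in "CDF":  # trailing sign nibble
        if nibbles[-1] == "D":
            sign = "-"
        nibbles = nibbles[:-1]
    if not nibbles.isdigit():
        raise ValueError("not a packed decimal field")
    return sign + (nibbles.lstrip("0") or "0")


print(packed_to_ascii(bytes.fromhex("01234D")))  # -1234
print(packed_to_ascii(bytes.fromhex("123456")))  # 123456
```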
Convert Date
The convert date operator will convert
a date field to a DDMMYYYY date format.
The actual output date format is selected
on the configuration screen of the routine.
The type of date conversion is dependent
on the combo box at the right of the line.
Conversion options are as below
- Date 7-4-5. This relates to the bits
of a 16 bit number, where the 7 most significant bits represent
the year, from 1900 to 2027. The next 4 bits are the month,
1-12, and the final 5 bits the day, 1-31.
- DATE YYMMDD This inserts system
date as YYYY/MM/DD
- DATE MMDDYY This inserts system
date as MM/DD/YYYY
- DATE DDMMYY This inserts system date
as DD/MM/YYYY
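The Date 7-4-5 bit layout can be unpacked with shifts and masks. This Python sketch assumes the 16 bit value has already been read into an integer (the byte order of the stored field is not stated here), and the name `decode_date_745` is hypothetical.

```python
import datetime


def decode_date_745(value):
    """Decode a 16 bit date: 7 bits year, 4 bits month, 5 bits day."""
    year = 1900 + ((value >> 9) & 0x7F)  # 0..127 -> 1900..2027
    month = (value >> 5) & 0x0F          # 1..12
    day = value & 0x1F                   # 1..31
    return datetime.date(year, month, day)


packed = (99 << 9) | (12 << 5) | 31
print(decode_date_745(packed).strftime("%d%m%Y"))  # 31121999
```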
TIME
Inserts system time in output string,
as HH:MM:SS
SWAP 32
Swaps 4 byte arrays. This can be useful
to convert numbers from little-endian to big-endian.
SWAP 16
Swaps each pair of characters in the input string,
for example
Input = InterMedia
Output = nIetMrdeai
SWAP 8
Swaps two nibbles from an input byte.
For example, 0D(hex) would be converted to D0(hex)
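The three swap operations can be sketched in Python as below. The function names are hypothetical. Treating SWAP 32 as a byte order reversal of each 4 byte group is an assumption based on its stated use for endian conversion, and leaving a trailing odd byte untouched in SWAP 16 is a choice made for this sketch.

```python
def swap32(data):
    """Reverse the byte order of each 4 byte group (assumed behaviour)."""
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


def swap16(data):
    """Swap each pair of bytes; a trailing odd byte is left in place."""
    out = bytearray(data)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return bytes(out)


def swap8(data):
    """Swap the two nibbles of every byte."""
    return bytes(((b << 4) & 0xF0) | (b >> 4) for b in data)


print(swap16(b"InterMedia"))   # b'nIetMrdeai'
print(swap8(b"\x0d").hex())    # d0
```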
Record Count
This inserts the current record count
+/- Number HiLo
+/- Number LoHi
Vax Float
This converts the input binary number
to an ASCII string. The output buffer is right justified, and
if not large enough the most significant digits will be truncated.
If the value is negative, a '-' sign will be added. This conversion
feature can be extremely important when trying to import binary
files into a text file format
The range of numbers is as below
| Input length (bytes) | Output buffer size |
| 1 | 4 |
| 2 | 6 |
| 3 | 8 |
| 4 | 11 |
| 8 | 38 max |
| 10 | 200 (not yet implemented) |
For Vax floating point numbers there
are 4 defined lengths:
- F-Float 4 bytes
- D-Float 8 bytes
- G-Float 8 bytes (not implemented yet)
- H-Float 16 bytes (not implemented yet)
All Vax numbers are signed, and the ordering
is fixed.
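For readers unfamiliar with the format, the following Python sketch decodes a 4 byte F-Float. It follows the published VAX F_floating layout (sign bit, 8 bit excess-128 exponent, 23 bit fraction with a hidden leading bit, stored as two little-endian 16 bit words) and assumes the field is in native VAX byte order. It is not taken from the Record Reformatter itself, and the function name is hypothetical.

```python
import struct


def vax_f_to_float(field):
    """Decode a 4 byte VAX F-Float into a Python float."""
    word1, word2 = struct.unpack("<HH", field)
    sign = word1 >> 15
    exponent = (word1 >> 7) & 0xFF  # excess-128 exponent
    fraction = ((word1 & 0x7F) << 16) | word2
    if exponent == 0:
        if sign:
            raise ValueError("reserved operand")
        return 0.0
    # Restore the hidden bit; the mantissa lies in the range 0.5 <= m < 1
    mantissa = (fraction | 0x800000) / float(1 << 24)
    value = mantissa * 2.0 ** (exponent - 128)
    return -value if sign else value


print(vax_f_to_float(bytes.fromhex("80400000")))  # 1.0
```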
The size of the input number will be
taken from the input field definition, and may be 1-4 bytes
in length. The ordering of the number will be high byte first
for HiLo, and low byte first for LoHi.
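The effect of the byte ordering can be illustrated with a short Python sketch. The function name and parameters are hypothetical. The output is right justified, and when the buffer is too small the leading characters are dropped, which in this sketch can also drop the sign.

```python
def number_to_ascii(field, order="HiLo", signed=True, width=11):
    """Convert a 1-4 byte binary number to a right justified ASCII string."""
    byteorder = "big" if order == "HiLo" else "little"
    value = int.from_bytes(field, byteorder, signed=signed)
    text = str(value)
    # Right justify; if the buffer is too small, keep the rightmost characters
    return text.rjust(width)[-width:]


print(repr(number_to_ascii(b"\xff\xfe", "HiLo", width=6)))  # '    -2'
print(repr(number_to_ascii(b"\xfe\xff", "LoHi", width=6)))  # '    -2'
```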
For floating point numbers (8 characters
in length) the output is almost unlimited. If the output is
longer than the field allowed for, the number will be displayed
in scientific notation, e.g. 2.63E5. If the output from a floating
point number contains invalid characters, this is most likely
for one of two reasons:
a) It is not a floating point number
b) The order should be swapped, i.e. HiLo or LoHi
The 6 byte number is a special floating
point implementation. It is not known how
standard this is. The 6 byte array is
as below
Byte 1 mantissa
Byte 2-6 Exponent in Lo-Hi ordering
The exponent is in the range of 0.5 -
1.5
The mantissa is a multiple of 2
Number HiLo
Number LoHi
This is as above, but the number is not
signed, and so the output buffer can be one character shorter
Cobol Num
This rule will convert signed strings
from Cobol systems. The string includes its sign as part of
the last digit. The output may also be formatted with the same
commands as described below in Formatting Numeric fields.
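As an illustration of a sign carried in the last digit, the Python sketch below decodes the common ASCII "overpunch" convention, where `{` and `A`-`I` stand for a positive final digit 0-9, and `}` and `J`-`R` for a negative one. Whether the Record Reformatter expects exactly this table is an assumption, and the function name is hypothetical.

```python
POSITIVE = "{ABCDEFGHI"  # final digit 0-9, positive
NEGATIVE = "}JKLMNOPQR"  # final digit 0-9, negative


def cobol_num_to_int(text):
    """Decode a Cobol signed string whose last character carries the sign."""
    last = text[-1]
    if last in POSITIVE:
        sign, digit = 1, POSITIVE.index(last)
    elif last in NEGATIVE:
        sign, digit = -1, NEGATIVE.index(last)
    else:
        sign, digit = 1, int(last)  # plain digit: treated as unsigned
    return sign * int(text[:-1] + str(digit))


print(cobol_num_to_int("1234N"))  # -12345
```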
Formatting Numeric fields
Numeric fields have an extra edit field
at the right of the screen. This is to allow for formatting
of the output. By using this the number of decimal places may
be determined and leading zeros displayed or suppressed.
The options are very flexible, and
the format string has the structure below
£#,4.2
where the symbols are as below
If a £ or $ sign is shown, that money sign is added in front of the
number.
If # is set, leading zeros are displayed.
If a , is in the line, the digits before the decimal point are
broken into groups of 3, separated by commas, such as
1,234,567.12
The final number gives the number of leading digits and decimal
places. Thus 3.2 would be 3 leading digits and 2 decimal places.
If the field is left blank, then no numeric
formatting will take place.
If the output field is too short, then
the most significant digits are truncated.
Some examples :
£3.4 £100.1234
#6.2 0000012.99
#,6.2 123,456.55
,8.2 1,435.55
5.0 43241
TransTab (InterMedia for Windows only)
The TransTab option is the same as 'Copy
Field', but an IMW translation table may be applied. The translation
table is any IMW table and it performs a complete string and
byte translation. Thus the output string may be longer than
the input string. If too long for the output field, it will
be truncated (from the right).
Typical applications could be case mapping
(make all lower case, or all upper case) or handling accented
characters, or different EBCDIC conversions.
A maximum of 8 different translation tables
may be defined within a single
record reformatter table. There is no limit on the number of
fields that may be converted.
IB Field
This will insert an IntelliBase / IntelliBase
95 field marker. The parameter should be a number between 1
and 9999. The output data is always a 4 digit number, preceded
by a 0EH, and followed by a 0FH. The length should always be
6.
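The marker layout described above can be sketched in Python as follows. The function name is hypothetical, and zero padding the number to 4 digits is an assumption drawn from the statement that the output is always a 4 digit number.

```python
def ib_field_marker(number):
    """Build a 6 byte field marker: 0EH, 4 digit number, 0FH."""
    if not 1 <= number <= 9999:
        raise ValueError("field number must be between 1 and 9999")
    return b"\x0e" + b"%04d" % number + b"\x0f"


print(ib_field_marker(12))
```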