Describing the File Formats


With EDI formats we reach the peak of complexity in describing file formats. In addition to describing the grammar of segments and transaction sets, we also need information about various delimiters and values for things like sender and receiver IDs.

It should come as no surprise that our X12 file description document has three major sections, each represented by an Element that is an immediate child of the root Element.

  • PhysicalCharacteristics : the X12 interchange characteristics, required only when converting to X12

  • XMLOutputCharacteristics : the XML output characteristics, required only when converting to XML

  • Grammar : the transaction set grammar

X12 File Physical Characteristics

An interchange's physical characteristics are described in the PhysicalCharacteristics Element. This Element is required only for the target conversion utility, XMLToX12. The source conversion utility doesn't need it because it extracts the relevant information directly from the input interchange.

Table 9.3 shows the child Elements of the PhysicalCharacteristics Element. All are required unless otherwise noted.

Note that a few things are missing from the child Elements and Attributes defined for the PhysicalCharacteristics Element.

  • Interchange control standards identifier : Prior to version 4020 of X12 this was the ID value sent in ISA11. The only value ever approved was a U, so I hard-code it in the utility.

  • Interchange and group control numbers : These come from a separate source as will be discussed in a later section.

  • Interchange and group date and time stamps : These are generated by the XMLToX12 utility at runtime.
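To make this division of labor concrete, here is a minimal Python sketch of how an ISA segment might be assembled from the values in Table 9.3 plus the pieces that come from elsewhere. It is not the XMLToX12 utility's code: the function name build_isa, the dictionary keys, and the choice of 0 for ISA14 are assumptions made for illustration. The fixed field widths are those of the ISA segment in the X12 standard.

```python
from datetime import datetime

# Fixed ISA field widths from the X12 standard (ISA01 through ISA16).
ISA_WIDTHS = [2, 10, 2, 10, 2, 15, 2, 15, 6, 4, 1, 5, 9, 1, 1, 1]


def build_isa(info, delims, control_number, now):
    """Assemble an ISA segment.

    `info` is a hypothetical dict keyed by the ISAInformation child names;
    `delims` is a hypothetical dict of already-decoded delimiter characters.
    """
    fields = [
        info["AuthorizationQualifier"],  # ISA01
        info["AuthorizationInfo"],       # ISA02
        info["SecurityQualifier"],       # ISA03
        info["SecurityInfo"],            # ISA04
        info["SenderIDQualifier"],       # ISA05
        info["SenderID"],                # ISA06, trailing spaces added below
        info["ReceiverIDQualifier"],     # ISA07
        info["ReceiverID"],              # ISA08, trailing spaces added below
        now.strftime("%y%m%d"),          # ISA09, generated at runtime
        now.strftime("%H%M"),            # ISA10, generated at runtime
        "U",                             # ISA11, hard-coded
        info["VersionNumber"],           # ISA12
        f"{control_number:09d}",         # ISA13, supplied by a separate source
        "0",                             # ISA14, assumed value for this sketch
        info["TestIndicator"],           # ISA15
        delims["component"],             # ISA16
    ]
    padded = []
    for value, width in zip(fields, ISA_WIDTHS):
        if len(value) > width:
            raise ValueError(f"ISA value {value!r} exceeds width {width}")
        padded.append(value.ljust(width))  # pad with trailing spaces
    sep = delims["element"]
    return "ISA" + sep + sep.join(padded) + delims["segment"]
```

Because every ISA data element has a fixed width, the padding step is what lets the file description document carry a SenderID or ReceiverID without trailing spaces.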

Although the Repetition Separator and repeating data elements were not introduced until version 4020, I include the delimiter here for future compatibility.
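Each delimiter in Table 9.3 is given either as a single literal character or as a two-character hexadecimal number. A rough Python sketch of decoding such a value follows. The function name parse_delimiter is hypothetical, and the sketch only mirrors the rule stated in the table, not the utility's actual code.

```python
import string


def parse_delimiter(value):
    """Return the one-character delimiter described by a `value` Attribute."""
    if len(value) == 1 and not value.isspace():
        return value  # literal nonwhitespace character, such as "~"
    if len(value) == 2 and all(c in string.hexdigits for c in value):
        return chr(int(value, 16))  # hexadecimal byte 00 through FF
    raise ValueError(f"invalid delimiter value: {value!r}")
```

The two forms can't be confused because they differ in length, so a value such as 1C is always read as the byte 0x1C and never as a literal.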

XML Output Characteristics

Characteristics governing the output XML documents are described in the XMLOutputCharacteristics Element. This Element is used only when converting from X12 interchanges to XML.

Table 9.4 shows the child Element of the XMLOutputCharacteristics Element. This time we have only a single child Element; as with the other file formats the schema reference is optional.
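The effect of the optional schema reference can be sketched in a few lines of Python using the standard library's ElementTree module. The function names and the sample root Element name are assumptions for illustration. The rule that requesting validation without a schema reference is an error comes from Table 9.4.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def make_root(element_name, schema_location_url=None):
    """Create the output root Element, adding the schema hint only if given."""
    root = ET.Element(element_name)
    if schema_location_url is not None:
        # ElementTree uses {namespace}localname for qualified Attribute names.
        root.set(f"{{{XSI_NS}}}noNamespaceSchemaLocation", schema_location_url)
    return root


def check_validation_request(validate, schema_location_url):
    """Raise if output validation is requested without a schema reference."""
    if validate and schema_location_url is None:
        raise ValueError(
            "output validation requested but SchemaLocationURL is absent"
        )
```

For example, make_root("PurchaseOrder", "PurchaseOrder.xsd") yields a root Element carrying the Attribute, while make_root("PurchaseOrder") yields one without it.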

Table 9.3. Child Elements of the PhysicalCharacteristics Element

| Child Element | Grandchild Element | Attribute | Schema Data Type | Description | Allowable Values, Restrictions, or Comments |
| --- | --- | --- | --- | --- | --- |
| Delimiters | SegmentTerminator | value | union of a single token character and hexBinary | The segment terminator character expressed as a literal character or a hexadecimal value | A single nonwhitespace character or a two-character hexadecimal number from 00 through FF representing a single byte |
| | ElementSeparator | value | union of a single token character and hexBinary | The element separator character expressed as a literal character or a hexadecimal value | A single nonwhitespace character or a two-character hexadecimal number from 00 through FF representing a single byte |
| | ComponentSeparator | value | union of a single token character and hexBinary | The component data element separator character expressed as a literal character or a hexadecimal value | A single nonwhitespace character or a two-character hexadecimal number from 00 through FF representing a single byte |
| | RepetitionSeparator | value | union of a single token character and hexBinary | The repetition separator character expressed as a literal character or a hexadecimal value | Optional, not used for X12 versions prior to 4020; a single nonwhitespace character or a two-character hexadecimal number from 00 through FF representing a single byte |
| ISAInformation | AuthorizationQualifier | value | string | Populates ISA01 | May be spaces |
| | AuthorizationInfo | value | string | Populates ISA02 | May be spaces |
| | SecurityQualifier | value | string | Populates ISA03 | May be spaces |
| | SecurityInfo | value | string | Populates ISA04 | May be spaces |
| | SenderIDQualifier | value | token | Populates ISA05 | |
| | SenderID | value | token | Populates ISA06 | Trailing spaces not permitted; will be inserted by the utility if required |
| | ReceiverIDQualifier | value | token | Populates ISA07 | |
| | ReceiverID | value | token | Populates ISA08 | Trailing spaces not permitted; will be inserted by the utility if required |
| | VersionNumber | value | token | Populates ISA12 | Will normally be 00401 for version/release 004010 of X12 |
| | TestIndicator | value | token | Populates ISA15 | P for production or T for test |
| GSInformation | FunctionalIDCode | value | token | Populates GS01 | |
| | ApplicationSendersCode | value | token | Populates GS02 | |
| | ApplicationReceiversCode | value | token | Populates GS03 | |
| | ResponsibleAgencyCode | value | token | Populates GS07 | X for X12 or T for TDCC |
| | VersionRelease | value | token | Populates GS08 | The X12 version and release and/or an industry identifier code |
| STInformation | TransactionSetIDCode | value | token | Populates ST01 | |

Table 9.4. Child Element of the XMLOutputCharacteristics Element

| Child Element | Attribute | Schema Data Type | Description | Allowable Values, Restrictions, or Comments |
| --- | --- | --- | --- | --- |
| SchemaLocationURL | value | anyURI | URL of the schema file for the output document; will be written as the value of the root Element's noNamespaceSchemaLocation Attribute | The Element is optional. If not specified, the noNamespaceSchemaLocation Attribute will not be written. An error will occur if output validation is requested and this Element is not present. |

Transaction Set Grammar

The grammar of X12 transaction sets is the most complex grammar we'll deal with in this book. In addition to having a complex grammar of segments, composite data structures, and data elements, we also have defined segment groups.

It is important to note again here that, unlike some approaches to converting X12 transaction sets to XML, I'm not doing an exact isomorphic representation of a physical X12 transaction set. I'm doing a logical representation of an X12 transaction set. Specifically, segment groups aren't visible as such in a transaction set, and while we can certainly identify composite data structures by their delimiters, they aren't necessarily highly visible as logical units. I show both structures explicitly as XML Elements. A segment group has its member segments (and groups, if any) shown as children, and a composite data structure is depicted as an XML Element with its component data elements as child XML Elements. This is the same approach depicted in the figures in Chapter 8 except that we add another layer for the composite data structures.
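A small Python sketch may help show what this logical representation looks like. It is a simplification under stated assumptions and not the utility's code: it recognizes a composite by the presence of the component separator rather than by consulting the grammar, it invents Element names from the segment identifier and position (in the real file description document the ElementName Attributes supply them), and the segment contents are illustrative only.

```python
import xml.etree.ElementTree as ET


def segment_to_xml(segment, element_sep="*", component_sep=":"):
    """Build an Element for one segment, with one child per data element.

    A composite data structure becomes a child Element whose own children
    are its component data elements.
    """
    fields = segment.split(element_sep)
    tag = fields[0]
    seg_el = ET.Element(tag)
    for pos, value in enumerate(fields[1:], start=1):
        if value == "":
            continue  # an empty data element produces no XML Element
        name = f"{tag}{pos:02d}"  # hypothetical naming scheme
        field_el = ET.SubElement(seg_el, name)
        if component_sep in value:
            for sub, comp in enumerate(value.split(component_sep), start=1):
                if comp:
                    ET.SubElement(field_el, f"{name}{sub:02d}").text = comp
        else:
            field_el.text = value
    return seg_el


# The segment group has no counterpart in the X12 data itself; it exists
# only in the XML, as the parent of its member segments.
group = ET.Element("PO1Group")  # hypothetical group Element name
group.append(segment_to_xml("PO1*1*10*EA*9.95"))
group.append(segment_to_xml("MEA*PD*WT*2.5*LB:1"))
```

In the resulting tree the group Element has the two segments as children, and the fourth data element of the second segment is an Element with two component children instead of text.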

If you're familiar with X12 EDI, you may think there are a few things missing from the grammar. If we were concerned with strict X12 compliance, you would be correct. However, we aren't. Our primary concern is transforming data in an X12 format into and out of an XML representation. As such, we don't need to concern ourselves with constraints such as mandatory and optional designations, code values, maximum occurrences, or even whether or not a data element is allowed to repeat. We will rely on schema validation of the XML representation to enforce these types of constraints.

As in previous chapters, the High-Level Design Considerations section contains a more detailed analysis and discussion of X12 grammar. Table 9.5 shows the details of the Grammar Element and its child Nodes. All are required unless noted.

A few things should be noted about the table. The Grammar Element reflects an implementation guide, or a usage, of an X12 transaction set standard and not the full standard. If a construct is defined at any level of the X12 transaction set standard but not used in the particular implementation, the corresponding Grammar Element is not required. For example, if an SAC loop for special allowances and charges is defined for the transaction set standard but not used in the implementation, the corresponding GroupDescription Element is not required. If the fourth data element of the BEG beginning segment for the 850 Purchase Order is not used in an implementation, the SimpleElementDescription for BEG04 is not required in the SegmentDescription Element for the BEG. The indentation in the Element column shows the approximate hierarchical relationships. The Allowable Child Elements column lists the specific details of the hierarchy.
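To illustrate the BEG04 case, here is a hypothetical Grammar fragment embedded in a short Python sketch that reads it. The Element and Attribute names follow Table 9.5 and the data type codes follow Table 9.6, but the ElementName values and the fragment as a whole are invented for this example and are not taken from the book's file description documents. The function load_segment_fields is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical Grammar fragment for a BEG segment in which BEG04 is unused.
GRAMMAR_XML = """
<Grammar ElementName="PurchaseOrder" TagValue="BEG">
  <SegmentDescription ElementName="BeginningSegment" TagValue="BEG">
    <SimpleElementDescription ElementName="PurposeCode" FieldNumber="1"
        DataType="X12-ID" MinLength="2" MaxLength="2"/>
    <SimpleElementDescription ElementName="OrderTypeCode" FieldNumber="2"
        DataType="X12-ID" MinLength="2" MaxLength="2"/>
    <SimpleElementDescription ElementName="PurchaseOrderNumber" FieldNumber="3"
        DataType="X12-AN" MinLength="1" MaxLength="22" Truncatable="false"/>
    <!-- BEG04 is not used in this implementation, so it is not described -->
    <SimpleElementDescription ElementName="OrderDate" FieldNumber="5"
        DataType="X12-DT" MinLength="8" MaxLength="8"/>
  </SegmentDescription>
</Grammar>
"""


def load_segment_fields(grammar_xml):
    """Map each segment TagValue to a dict of {FieldNumber: ElementName}."""
    grammar = ET.fromstring(grammar_xml)
    segments = {}
    for seg in grammar.iter("SegmentDescription"):
        fields = {}
        for el in seg.findall("SimpleElementDescription"):
            fields[int(el.get("FieldNumber"))] = el.get("ElementName")
        segments[seg.get("TagValue")] = fields
    return segments
```

Loading the fragment gives entries for field numbers 1, 2, 3, and 5 of the BEG segment and nothing for field number 4.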

NOTE Do Not Include ST and SE Segments

Although ST Transaction Set Header and SE Transaction Set Trailer segments are part of the published standards for transaction sets, they must not be included in the Grammar Element. The utilities consider them control segments and derive their definitions elsewhere.

Table 9.6 shows the X12 data types defined in version 004010 of the X12.6 standard. For the X12 utilities we support only those data types defined in the X12 standards and don't support all the data types we developed in previous chapters. I do, however, show the correspondence to the other Babel Blaster data types. The correspondence to schema language data types is somewhat approximate in many cases.
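Two of the conversions in Table 9.6, the implied decimal point of the Nx type and the six-digit form of the DT type, are easier to follow in code. The Python sketch below follows the rules as the table states them. The function names and signatures are assumptions, and the sketch leaves out the error handling a real utility would need.

```python
from decimal import Decimal, ROUND_DOWN


def nx_from_x12(text, x):
    """X12 Nx to schema decimal: trim, then apply x implied decimal places."""
    return Decimal(text.strip()).scaleb(-x)


def nx_to_x12(value, x, min_length):
    """Schema decimal to X12 Nx.

    Extra fractional digits are truncated (not rounded), the decimal point
    is dropped, and leading zeroes pad the digits to min_length. A minus
    sign goes in the left-most position; a plus sign is not carried over.
    """
    scaled = Decimal(value).scaleb(x).quantize(Decimal(1), rounding=ROUND_DOWN)
    digits = str(abs(int(scaled))).zfill(min_length)
    return ("-" if scaled < 0 else "") + digits


def dt_from_x12(text):
    """X12 DT to schema date; six-digit dates get the prefix 20."""
    text = text.strip()
    if len(text) == 6:
        text = "20" + text
    return f"{text[0:4]}-{text[4:6]}-{text[6:8]}"


def dt_to_x12(iso_date, max_length=8):
    """Schema date to X12 DT; the century digits are dropped for length 6."""
    digits = iso_date.replace("-", "")
    return digits[2:] if max_length == 6 else digits
```

For an N2 data element, nx_to_x12("123.456", 2, 1) gives 12345, and nx_from_x12("12345", 2) gives the decimal value 123.45.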

Again, I include Truncatable as an optional Attribute for all target X12 simple data elements. This is not part of the X12 standard but is a feature of the XMLToX12 utility. For all types, a runtime error occurs if Truncatable is false and the length of the XML Element contents exceeds the field length. If, on the other hand, you want to allow truncation, you can include the Attribute in the SimpleElementDescription Element and set the value to true.
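For the string (AN) type, the interaction between trimming, padding, and the Truncatable Attribute might look like the following Python sketch. The function name format_an and its parameters are hypothetical. The behavior follows the AN row of Table 9.6 and the rule just described, but it is an approximation and not the XMLToX12 source.

```python
# Whitespace here means any character with an integer value less than or
# equal to that of the space character.
_WHITESPACE = "".join(chr(i) for i in range(33))


def format_an(text, min_length, max_length, truncatable=False):
    """Format XML Element contents as an X12 AN data element."""
    value = text.strip(_WHITESPACE)
    if len(value) > max_length:
        if not truncatable:
            raise ValueError(
                f"{value!r} exceeds the field length of {max_length}"
            )
        value = value[:max_length]  # right-truncate to the field length
    if value == "":
        return value  # nothing left after trimming: the data element is empty
    return value.ljust(min_length)  # left-justify, fill right with spaces
```

With these rules, format_an("  ACME Widgets ", 1, 4, truncatable=True) returns ACME, while the same call with truncatable left at its default raises an error.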

Example File Description Documents

The file description documents for the X12 850 Purchase Order and 810 Invoice examples are available from the book's Web site as:

  • PurchaseOrderX12SourceDescription.xml

  • InvoiceX12TargetDescription.xml

Table 9.5. Transaction Set Grammar Characteristics in the Grammar Element

| Element | Allowable Child Elements | Attribute | Schema Language Data Type | Description | Allowable Values, Restrictions, or Comments |
| --- | --- | --- | --- | --- | --- |
| Grammar | SegmentDescription, GroupDescription | | | Describes the grammar of both the transaction set and the corresponding XML representation. | The Grammar Element may have any combination of SegmentDescription or GroupDescription Elements as children. |
| | | ElementName | NMTOKEN | Specifies the name of the document's root Element. | When creating XML documents, the specified name is assigned to the document's root Element. When creating an X12 interchange, the input XML document's root Element must match this name. Maximum length reflects restriction on the length of Element names. |
| | | TagValue | token | Specifies the segment identifier of the transaction set's beginning segment. | Note: This is the first segment expected in the implementation following the ST segment. |
| GroupDescription | SegmentDescription, GroupDescription | | | Describes a segment group. | One GroupDescription Element is required for each position at which the segment group might appear in the implementation. The first child Element must be a SegmentDescription Element. It may be followed by any combination of SegmentDescription or GroupDescription Elements. |
| | | ElementName | NMTOKEN | Specifies the name of the Element representing the group. | Maximum length reflects restriction on the length of Element names. |
| | | TagValue | token | Specifies the segment identifier of the group's beginning segment. | |
| SegmentDescription | CompositeStructureDescription, SimpleElementDescription | | | Describes the grammar of an individual X12 segment and the corresponding XML representation. | One SegmentDescription Element is required for each position at which the segment might appear in the implementation. One or more SimpleElementDescription or CompositeStructureDescription Elements are required to specify the data elements used within the segment. |
| | | ElementName | NMTOKEN | Specifies the name of the Element representing a segment. | Maximum length reflects restriction on the length of Element names. |
| | | TagValue | token | Specifies the value of the segment identifier as specified in the X12 standard. | |
| CompositeStructureDescription | SimpleElementDescription | | | Describes the grammar of an X12 composite data structure and the corresponding XML representation. | One CompositeStructureDescription Element is required for each position in a segment where the composite data structure might appear. One or more SimpleElementDescription Elements are required to specify the X12 component data elements used within this composite. |
| | | ElementName | NMTOKEN | Specifies the name of the XML Element representing the data element. | Maximum length reflects restriction on the length of Element names. |
| | | FieldNumber | positiveInteger | Specifies the data element position within the segment. | Maximum value reflects restriction on the maximum number of data elements per segment. |
| SimpleElementDescription | None | | | Describes the characteristics of an X12 simple data element and the corresponding XML representation. | One SimpleElementDescription Element is required for each position in a segment or composite data structure where the simple data element might appear. |
| | | ElementName | NMTOKEN | Specifies the name of the XML Element representing the data element. | Maximum length reflects restriction on the length of Element names. |
| | | FieldNumber | positiveInteger | Specifies the data element position within the segment. | Maximum value reflects restriction on the maximum number of data elements per segment. |
| | | SubFieldNumber | positiveInteger | Specifies the component data element position within the composite data structure. | Optional; this Attribute is required only when the data element is part of a composite data structure. It has a default value of zero if missing. Maximum value reflects restriction on the maximum number of component data elements per composite data structure. |
| | | DataType | token | Specifies the X12 data type of the data element. | See Table 9.6. |
| | | MinLength | nonNegativeInteger | Specifies the minimum length for the data element, as specified in the standard or implementation guide. | |
| | | MaxLength | positiveInteger | Specifies the maximum length for the data element, as specified in the standard or implementation guide. | |
| | | Truncatable | boolean | Indicates whether or not truncation is permitted. See comments regarding truncation in Table 9.6. | Optional, defaults to false. |


Table 9.6. X12 Data Types

| X12 Data Type | Grammar Data Type Code | Schema Language Data Type | Actions with X12 as Source | Actions with X12 as Target | Action with X12 as Target if Truncatable Is True | Comments and Restrictions |
| --- | --- | --- | --- | --- | --- | --- |
| Numeric, Nn | X12-Nx, where x represents the number of implied decimal places (corresponds to Nx) | decimal | Leading zeroes are removed. All whitespace is trimmed. | The number is right-justified within the field. If the number of fractional digits in the source decimal number exceeds x, the number is right-truncated to x fractional digits. Zeroes are added as fractional digits if the source number has fewer than x fractional digits. Leading zeroes are added if the source is shorter than the target. If a minus sign character is present in the source, it appears in the left-most position in the target. The plus sign is not converted. | Ignored, not truncatable. | |
| Decimal Number, R | X12-R (corresponds to R) | float or double | Leading zeroes are removed. All whitespace is trimmed. | The number is right-justified within the field. Leading zeroes are added if the source is shorter than the target. If a minus sign character is present in the source, it appears in the left-most position in the target. The plus sign is not converted. | Fractional digits to the right of the decimal point are truncated until the Element contents are equal to the field length. An error occurs if digits to the left of the decimal exceed the field length. | The X12 decimal number data type supports base 10 exponential notation using an uppercase E followed by the exponent. The schema language float and double data types also allow a lowercase e, which we convert to an uppercase E. The schema language data type allows the special values -0, INF, -INF, and NaN for minus zero, positive infinity, negative infinity, and "not a number," respectively. These are not converted and force an error if present since there are no equivalent X12 values. |
| Identifier, ID | X12-ID | token, constrained with enumeration facets | Leading and trailing whitespace (any character with an integer value less than or equal to a space character) is trimmed. All other whitespace within the string is preserved. | If the source is shorter than the minimum length, the data is left-justified and filled to the right with spaces. Leading and trailing whitespace is trimmed. If trimming results in a single space character, the data element becomes empty (with exceptions for ISA data elements). | Not truncatable. | Identifier data elements may either have their values enumerated in the standard or reference an external code list. While it is generally the rule that neither leading nor trailing spaces are present, trailing spaces are allowed if required to satisfy minimum length requirements. ID data elements with only spaces are allowed in the ISA segment. |
| String, AN | X12-AN (corresponds to AN) | string | Leading and trailing whitespace (any character with an integer value less than or equal to a space character) is trimmed. All other whitespace within the string is preserved. | If the source is shorter than the minimum length, the data is left-justified and filled to the right with spaces. Leading and trailing whitespace is trimmed. If trimming results in a single space character, the data element becomes empty. | The string is right-truncated to the field length. | X12.6 says that leading spaces are considered significant. However, in more than ten years of working with EDI I've never seen them used. We're going to trim them if we find any. There is another significant difference between the data types. In X12 there must be at least one nonspace character, while a single space character in XML is schema valid. |
| Date, DT | X12-DT | date | For six-digit dates the prefix 20 is added for hundred years. | The thousand and hundred years are omitted for six-digit dates. | Ignored, not truncatable. | The X12 committee sort of punted when it came to defining dates. Both YYYYMMDD (or CCYYMMDD if you prefer) and YYMMDD are supported in the X12.6 definition. However, most date data elements, such as 373 Date, use YYYYMMDD since they have both minimum and maximum lengths of 8. |
| Time, TM | X12-TM | time | Seconds, if absent, are added. Fractional seconds, if specified, are preserved. | N/A | Fractional seconds and seconds are truncated to satisfy maximum length requirements. | |
| Binary, B | | | | | | Not supported. |

The concepts and appearance are very similar to what we saw in previous chapters, so I'm not inserting the file description documents for the X12 utilities into the book. However, there are a few very important points that need to be made. These particular file description documents reflect usage as defined in an implementation and not the full standard. But, again in keeping with our strategy of using schema validation as the primary mechanism for enforcing business rules, you won't see any mandatory, optional, maximum count, or code list information in these definitions. If you wish to enforce these types of constraints for a particular implementation, you can develop a schema for the XML representation of the transaction set.

This aspect of the file description documents leads to another observation about EDI conversions. The file description documents describe transaction set grammar only to a level sufficient for an XML representation. While these two examples describe only the segments and data elements that appear in these particular implementations, there would be no harm if they described items that aren't used. These unused groups, segments, and data elements in the grammar would simply be ignored during processing. For this reason it would be possible to develop a file description document directly from the standard for a transaction set that covers all possible implementations of the transaction set. So, while we would probably want schemas that validated particular implementations, we could conceivably have only one file description document for each version and release of a transaction set.