2. Brief Overview

Flavor provides a formal way to specify how data is laid out in a serialized bitstream. It is based on a principle of separation between bitstream parsing operations and encoding, decoding and other operations. This separation acknowledges the fact that different tools can utilize the same syntax, but also that the same tool can work unchanged with different bitstream syntax. For example, the number of bits used for a specific field can change without modifying any part of the application program.
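The last point can be illustrated with a minimal C++ sketch. It is not Flavor itself: the `BitReader`, `Field`, `Syntax`, `Record`, `parse` and `payload_end` names are all hypothetical, and the syntax is reduced to a table of field names and widths. The application routine only refers to fields by name, so changing a width in the table leaves it untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical MSB-first bit reader (an assumption, not part of Flavor).
class BitReader {
public:
    explicit BitReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    // Read nbits (0..32) and return them as an unsigned integer.
    std::uint32_t getuint(int nbits) {
        assert(nbits >= 0 && nbits <= 32);
        std::uint32_t value = 0;
        for (int i = 0; i < nbits; ++i) {
            const std::size_t byte = pos_ / 8;
            if (byte >= data_.size()) throw std::out_of_range("read past end of bitstream");
            const unsigned bit = (data_[byte] >> (7 - pos_ % 8)) & 1u;
            value = (value << 1) | bit;
            ++pos_;
        }
        return value;
    }

private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
};

// The "syntax": an ordered list of named fields and their widths in bits.
struct Field {
    std::string name;
    int bits;
};
using Syntax = std::vector<Field>;
using Record = std::map<std::string, std::uint32_t>;

// Generic parsing step, driven entirely by the syntax table.
Record parse(const Syntax& syntax, BitReader& in) {
    Record r;
    for (const Field& f : syntax) r[f.name] = in.getuint(f.bits);
    return r;
}

// Application logic: knows field names, not field widths.
std::uint32_t payload_end(const Record& r) {
    return r.at("offset") + r.at("length");
}
```

With a syntax of `{{"offset", 8}, {"length", 8}}` the bytes `0x2A 0x07` parse to offset 42 and length 7; with `{{"offset", 12}, {"length", 4}}` the bytes `0x02 0xA7` parse to the same values, and `payload_end` is identical in both cases.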

Past approaches for syntax description utilized a combination of tabular data, pseudo-code, and textual description to describe the format at hand. Taking MPEG as an example, both MPEG-1 and MPEG-2 specifications were described using C-like pseudo-code syntax (originally introduced by Milt Anderson, Bellcore), coupled with explanatory text and tabular data. Several of the lower and most sophisticated layers (e.g., macroblock) could only be handled by explanatory text. The text had to be carefully crafted and tested over time for ambiguities. Other specifications (e.g., GIF and JPEG) use similar bitstream representation schemes, and hence share the same limitations.

Other formal facilities already exist for representing syntax. One example is ASN.1 (ISO International Standards 8824 and 8825). A key difference, however, is that ASN.1 was not designed to address the intricacies of source coding operations, and hence cannot cope with, for example, variable-length coding. In addition, ASN.1 tries to hide the bitstream representation from the developer by using its own set of binary encoding rules, whereas in our case the binary encoding is the actual target of description.

There is also a remote relationship between syntax description and "marshalling," a fundamental operation in distributed systems that ensures consistent exchange of typed data. Examples in this category include Sun's ONC XDR (External Data Representation) and the rpcgen compiler that automatically generates marshalling code, as well as CORBA IDL, among others. These ensure, for example, that even if the native representation of an integer differs between two systems (big versus little endian), the systems can still exchange typed data in a consistent way. Marshalling, however, does not constitute bitstream syntax description because: 1) the programmer has no control over the data representation (the binary representation for each data type is predefined), and 2) it is only concerned with the representation of simple serial structures (lists of arguments to functions, etc.). As in ASN.1, the binary representation is "hidden" and is not amenable to customization by the developer. One could draw a parallel between Flavor and marshalling by considering the Flavor source as the XDR layer. A better parallel would be to view Flavor as a parser generator like yacc [11], but for bitstream representations.
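To make the endianness point concrete, the following C++ sketch shows the kind of fixed rule a marshalling layer applies: a 32-bit unsigned integer is always written as four bytes, most significant byte first, regardless of the host's native byte order. The function names `put_u32_be` and `get_u32_be` are hypothetical and are not taken from any XDR library. Note that the width and byte order are fixed by the rule, not chosen by the programmer, which is the limitation discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append v to out as four bytes, most significant byte first.
// Shifts operate on the value, not on memory, so the result is the
// same on big-endian and little-endian hosts.
void put_u32_be(std::vector<std::uint8_t>& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 24));
    out.push_back(static_cast<std::uint8_t>(v >> 16));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v));
}

// Read four bytes starting at offset and rebuild the integer.
std::uint32_t get_u32_be(const std::vector<std::uint8_t>& in, std::size_t offset) {
    assert(offset + 4 <= in.size());
    return (static_cast<std::uint32_t>(in[offset]) << 24) |
           (static_cast<std::uint32_t>(in[offset + 1]) << 16) |
           (static_cast<std::uint32_t>(in[offset + 2]) << 8) |
           static_cast<std::uint32_t>(in[offset + 3]);
}
```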

It is interesting to note that all prior approaches to syntactic description were concerned only with the definition of message structures typically found in communication systems. These tend to have a much simpler structure compared with coded representations of audio-visual information (compare the UDP packet header with the baseline JPEG specification, for example).

A new language, Bitstream Syntax Description Language (BSDL) [12, 13], has recently been introduced in MPEG-21 [14] for describing the structure of a bitstream using XML Schema. However, unlike Flavor, BSDL was developed to address only the high-level structure of the bitstream, and with it a full bit-by-bit description of bitstream syntax becomes almost impossible. For example, BSDL has no facility to cope with variable-length coding, whereas in Flavor a map (described in Section 3.5) can be used. Also, a BSDL description would be overly verbose, requiring significant effort to review and modify by hand. More detailed information about BSDL is given in Section 4.7.2.
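To show why variable-length coding needs dedicated support, here is a minimal C++ sketch of table-driven decoding of a prefix code. It does not use Flavor's map syntax; the code table, `VlcEntry`, `BitReader` and `decode_vlc` are hypothetical names chosen for illustration. The key property is that the number of bits consumed is not known in advance: it depends on the bits themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical MSB-first bit reader (an assumption, not part of Flavor).
class BitReader {
public:
    explicit BitReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    std::uint32_t getuint(int nbits) {
        assert(nbits >= 0 && nbits <= 32);
        std::uint32_t value = 0;
        for (int i = 0; i < nbits; ++i) {
            const std::size_t byte = pos_ / 8;
            if (byte >= data_.size()) throw std::out_of_range("read past end of bitstream");
            const unsigned bit = (data_[byte] >> (7 - pos_ % 8)) & 1u;
            value = (value << 1) | bit;
            ++pos_;
        }
        return value;
    }

private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
};

// One row of a code table: the code word, its length in bits, and the
// value it stands for.
struct VlcEntry {
    std::uint32_t code;
    int length;
    int value;
};

// Illustrative prefix code: 0 -> 0, 10 -> 1, 110 -> 2, 111 -> 3.
const std::vector<VlcEntry> kTable = {
    {0x0, 1, 0}, {0x2, 2, 1}, {0x6, 3, 2}, {0x7, 3, 3}};

// Read one bit at a time until the accumulated bits match a code word.
int decode_vlc(BitReader& in, const std::vector<VlcEntry>& table) {
    std::uint32_t code = 0;
    for (int len = 1; len <= 32; ++len) {
        code = (code << 1) | in.getuint(1);
        for (const VlcEntry& e : table) {
            if (e.length == len && e.code == code) return e.value;
        }
    }
    throw std::runtime_error("no matching code word");
}
```

A linear search per bit is used here only for clarity; a real decoder would normally use a lookup table or a tree.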

Flavor was designed to be an intuitive and natural extension of the typing system of object-oriented languages like C++ and Java. This means that the bitstream representation information is placed together with the data declarations in a single place. In C++ and Java, this place is where a class is defined.

Flavor has been explicitly designed to follow a declarative approach to bitstream syntax specification. In other words, the designer specifies how the data is laid out in the bitstream, and does not detail a step-by-step procedure that parses it. The procedural approach would severely limit both the expressive power and the capability for automated processing and optimization, as it would eliminate the necessary level of abstraction. As a result of this declarative approach, Flavor does not have functions or methods.

A related example from traditional programming is the handling of floating point numbers. The programmer does not have to specify how such numbers are represented or how operations are performed; these tasks are automatically taken care of by the compiler in coordination with the underlying hardware or run-time emulation libraries.

An additional feature of combining type declaration and bitstream representation is that the underlying object hierarchy of the base programming language (C++ or Java) becomes quite naturally the object hierarchy for bitstream representation purposes as well. This is an important benefit for ease of application development, and it also allows Flavor to have a very rich typing system itself.

"HelloBits"

Traditionally, programming languages are introduced via a simple "Hello World!" program, which just prints out this simple message on the user's terminal. We will use the same example with Flavor, but here we are concerned with bits, rather than text characters. Figure 4.1 shows a set of trivial examples indicating how the integration of type and bitstream representation information is accomplished. Consider a simple object called HelloBits with just a single value, represented using 8 bits. Using the MPEG-1/2 methodology, this would be described as shown in Figure 4.1(a). A C++ description of this single-value object would include two methods to read and write its value, and have a form similar to the one shown in Figure 4.1(b). Here getuint() is assumed to be a function that reads bits from the bitstream (here 8) and returns them as an unsigned integer (the most significant bit first); the putuint() function has similar functionality but for output purposes. When HelloBits::get() is called, the bitstream is read and the resultant quantity is placed in the data member Bits. The same description in Flavor is shown in Figure 4.1(c).

Figure 4.1: HelloBits. (a) Representation using the MPEG-1/2 methodology. (b) Representation using C++ (A similar construct would also be used for Java). (c) Representation using Flavor.

As we can see, in Flavor the bitstream representation is integrated with the type declaration. The Flavor description should be read as: Bits is an unsigned integer quantity represented using 8 bits in the bitstream. Note that there is no implicit encoding rule as in ASN.1: the rule here is embedded in the type declaration and indicates that, when the system has to parse a HelloBits data type, it will just read the next 8 bits as an unsigned integer and assign them to the variable Bits.
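For readers without the figure at hand, the hand-written C++ form described above might look roughly like the sketch below. Only HelloBits, Bits, get(), put(), getuint() and putuint() come from the text; the `Bitstream` class, its byte-vector storage and the exact signatures are assumptions made so that the example compiles. The Flavor line in the comment is likewise an approximation based on the reading given above, not a copy of the figure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical in-memory bitstream, most significant bit first.
class Bitstream {
public:
    Bitstream() = default;
    explicit Bitstream(std::vector<std::uint8_t> bytes)
        : bytes_(std::move(bytes)), wpos_(bytes_.size() * 8) {}

    // Read nbits (0..32) and return them as an unsigned integer.
    unsigned getuint(int nbits) {
        assert(nbits >= 0 && nbits <= 32);
        unsigned value = 0;
        for (int i = 0; i < nbits; ++i) {
            const std::size_t byte = rpos_ / 8;
            if (byte >= bytes_.size()) throw std::out_of_range("read past end of bitstream");
            const unsigned bit = (bytes_[byte] >> (7 - rpos_ % 8)) & 1u;
            value = (value << 1) | bit;
            ++rpos_;
        }
        return value;
    }

    // Append the low nbits of value, most significant bit first.
    void putuint(unsigned value, int nbits) {
        assert(nbits >= 0 && nbits <= 32);
        for (int i = nbits - 1; i >= 0; --i) {
            if (wpos_ % 8 == 0) bytes_.push_back(0);
            const unsigned bit = (value >> i) & 1u;
            bytes_.back() |= static_cast<std::uint8_t>(bit << (7 - wpos_ % 8));
            ++wpos_;
        }
    }

    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t rpos_ = 0;  // read position in bits
    std::size_t wpos_ = 0;  // write position in bits
};

// Hand-written C++ form: the 8-bit width is buried inside get() and put().
// The Flavor form would carry the width in the declaration instead,
// roughly: class HelloBits { unsigned int(8) Bits; }
class HelloBits {
public:
    unsigned Bits = 0;

    void get(Bitstream& bs) { Bits = bs.getuint(8); }
    void put(Bitstream& bs) const { bs.putuint(Bits, 8); }
};
```

Note that the width 8 appears twice in the C++ class, once per method, and the two occurrences must be kept in sync by hand.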

These examples, although trivial, demonstrate the differences between the various approaches. In Figure 4.1(a), we just have a tabulation of the various bitstream entities, grouped into syntactic units. This style is sufficient for straightforward representations, but fails when more complex structures are used (e.g., variable-length codes). In Figure 4.1(b), the syntax is incorporated into hand-written code embedded in get() and put() or an equivalent set of methods. As a result, the syntax becomes an integral part of the encoding/decoding method even though the same encoding/decoding mechanism could be applied to a large variety of similar syntactic constructs. Such hand-written code also quickly becomes overly verbose.

Flavor provides a wide range of facilities to define sophisticated bitstreams, including if-else, switch, for and while constructs. In contrast with regular C++ or Java, these are all included in the data declaration part of the class, so they are completely disassociated from code that belongs to class methods. This is in line with the declarative nature of Flavor, where the focus is on defining the structure of the data, not operations on it. As we show later on, a translator can automatically generate C++ and/or Java methods (get() and put()) that can read or write data that complies with the Flavor-described representation.
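As a rough illustration of what such generated code has to do, the C++ sketch below hand-writes a get() method for a structure with a conditional field and a repeated field. The `Packet` layout, its field names and the `BitReader` class are hypothetical, and the code is not actual translator output. The comment shows the layout in a Flavor-like notation that has not been checked against the language definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical MSB-first bit reader (an assumption, not part of Flavor).
class BitReader {
public:
    explicit BitReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    unsigned getuint(int nbits) {
        assert(nbits >= 0 && nbits <= 32);
        unsigned value = 0;
        for (int i = 0; i < nbits; ++i) {
            const std::size_t byte = pos_ / 8;
            if (byte >= data_.size()) throw std::out_of_range("read past end of bitstream");
            const unsigned bit = (data_[byte] >> (7 - pos_ % 8)) & 1u;
            value = (value << 1) | bit;
            ++pos_;
        }
        return value;
    }

private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
};

// Layout being parsed, in Flavor-like notation (illustrative only):
//   unsigned int(1) has_extra;
//   unsigned int(3) count;
//   if (has_extra) { unsigned int(8) extra; }
//   for (i = 0; i < count; i++) { unsigned int(4) item; }
struct Packet {
    unsigned has_extra = 0;
    unsigned count = 0;
    unsigned extra = 0;           // only meaningful when has_extra != 0
    std::vector<unsigned> items;  // count entries of 4 bits each

    // The control flow mirrors the declaration order of the layout.
    void get(BitReader& bs) {
        has_extra = bs.getuint(1);
        count = bs.getuint(3);
        if (has_extra) extra = bs.getuint(8);
        items.clear();
        for (unsigned i = 0; i < count; ++i) items.push_back(bs.getuint(4));
    }
};
```

In the declarative form, the conditions and loops describe which fields are present; the reading logic shown here is what a translator would derive from them.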

In the following we describe each of the language features in more detail, emphasizing the differences between C++ and Java. In order to ensure that Flavor semantics are in line with both C++ and Java, whenever there was a conflict a common denominator approach was used.