4. The Flavor Translator

Designing a language like Flavor would be an interesting but academic exercise, unless it was accompanied by software that can put its power into full use. We have developed a translator that evolved concurrently with the design of the language. When the language specification became stable, the translator was completely rewritten. The most recent release is publicly available for downloading at http://www.sourceforge.net/projects/flavor.

4.1 Run-Time API

The translator reads a Flavor source file (.f1) and, depending on the language selection flag of the code generator, it creates a pair of .h and .cpp files (for C++) or a set of .java files (for Java). In the case of C++, the .h file contains the declarations of all Flavor classes as regular C++ classes and the .cpp file contains the implementations of the corresponding class methods (put() and get()). In the case of Java, each .java file contains the declaration and implementation of a single Flavor class. In both cases, the get() method is responsible for reading a bitstream and loading the class variables with their appropriate values, while the put() method does the reverse. All the members of the classes are declared public, and this allows direct access to desired fields in the bitstream.

The translator makes minimal assumptions about the operating environment for the generated code. For example, it is impossible to anticipate all possible I/O structures that might be needed by applications (network-based, multi-threaded, multiple buffers, etc.). Attempting to provide a universal solution would be futile. Thus, instead of having the translator directly output the required code for bitstream I/O, error reporting, tracing, etc., a run-time library is provided. With this separation, programmers have the flexibility of replacing parts of, or the entire library with their own code. The only requirement is that this custom code provides an identical interface to the one needed by the translator. This interface is defined in a pure virtual class called IBitstream. Additionally, as the source code for the library is included in its entirety, customization can be performed quite easily. The Flavor package also provides information on how to rebuild the library, if needed. Custom library can be built simply by deriving from the IBitstream class. This will ensure compatibility with the translator.

The run-time library includes the Bitstream class that is derived from the IBitstream interface, and provides basic bitstream I/O facilities in terms of reading or writing bits. A Bitstream reference is passed as an argument to the get() and put() methods.

If parameter types are used in a class, then they are also required arguments in the get() and put() methods as well. The translator also requires that a function is available to receive calls when expected values are not available or VLC lookups fail. The function name can be selected by the user; a default implementation (flerror) is included in the run-time library.

For efficiency reasons, Flavor arrays are converted to fixed size arrays in the translated code. This is necessary in order to allow developers to access Flavor arrays without needing special techniques. Whenever possible, the translator automatically detects and sets the maximum array size; it can also be set by the user using a command-line option. Finally, the run-time library (and the translator) only allows parse sizes of up to the native integer size of the host processor (except for double values). This enables fast implementation of bitstream I/O operations.

For parsing operations, the only task required by the programmer is to declare an object of the class type at hand, and then call its get() method with an appropriate bitstream. While the same is also true for put() operation, the application developer must also load all class member variables with their appropriate values before the call is made.

4.2 Include and Import Directives

In order to simplify the source code organization, Flavor supports % include and % import directives. These are the mechanisms to combine several different source code files into one entity, or to share a given data structure definition across different projects.

4.2.1 Include Directive

The statement - %include "file.f1" -will include the specified .fl file in the current position and will flag all of its content so that no code is generated. Figure 4.13(a) displays a .fl file (other.fl) that is included by another .fl file (main.fl). The other.fl file contains the definition of the constant a. The inclusion makes the declaration of the a variable available to the main.fl file. In terms of the generated output, Figure 4.13(b) outlines the placement of information in different files. In the figure, we see that the main and included files each keep their corresponding implementations. The generated C++ code maintains this partitioning, and makes sure that the main file includes the C++ header file of the included Flavor file.

       // In the file, other.fl       const int a = 4;       // In the file, main.fl       %include "other.fl"       class Test {         int(a) t; // The variable 'a' is included from the other.fl file       }

(a)

 // In the file, other.h   // In the file, other.cpp   // In the file, main.h extern const int a;       #include "other.h"          #include "other.h"                           const int a = 4;            …

(b)

Figure 4.13: The %include directive. (a) The other.fl file is included by the main.fl file. (b) The other.h and other.cpp files are generated from the other.fl file whereas the main.h file is generated from the main.fl file.

The %include directive is useful when data structures need to be shared across modules or projects. It is similar in spirit to the use of the C/C++ preprocessor #include statement in the sense that it is used to make general information available at several different places in a program. Its operation, however, is different as Flavor's %include statement does not involve code generation for the included code. In C/C++, #include is equivalent to copying the included file in the position of the #include statement. This behavior is offered in Flavor by the %import directive.

Similarly, when generating Java code, only the .java files corresponding to the currently processed Flavor file are generated. The data in the included files are allowed to be used, but they are not generated.

4.2.2 Import Directive

The %import directive behaves similarly to the %include directive, except that full code is generated for the imported file by the translator, and no C++ #include statement is used. This behavior is identical to how a C++ preprocessor #include statement would behave in Flavor.

Let's consider the example of the previous section, this time with an %import directive rather than an %include one as shown in Figure 4.14(a). As can be seen from Figure 4.14(b), the generated code includes the C++ code corresponding to the imported .fl file. Therefore, using the %import directive is exactly the same as just copying the code in the imported .fl file and pasting it in the same location as the %import statement is specified. The translator generates the Java code in the same way.

 %import "other.fl" class Test {   int(a) t; // The variable 'a' is imported from the other.fl file }

(a)

        // In the file, main.h        // In the file, main.cpp        extern const int a;           const int a = 4;        …                             …

(b)

Figure 4.14: The %import directive. (a) The main.fl file using the %import directive. (b) The main.h and main.cpp files generated from the main.fl file defined in (a).

Note that the Java import statement behaves more like Flavor's %include statement, in that no code generation takes place for the imported (included) code.

4.3 Pragma Statements

Pragma statements are used as a mechanism for setting translator options from inside a Flavor source file. This allows modification of translation parameters (set by the command-line options) without modifying the makefile that builds the user's program, but more importantly, it allows very fine control on which translation options are applied to each class, or even variable. Almost all command-line options have pragma equivalents. The ones excluded were not considered useful for specification within a source file.

Pragma statements are introduced with the %pragma directive. It can appear wherever a statement or declaration can. It can contain one or more settings, separated by commas, and it cannot span more than one line. After a setting is provided, it will be used for the remainder of the Flavor file, unless overridden by a different pragma setting. In other words, pragma statements do not follow the scope of Flavor code. A pragma that is included in a class will affect not only the class where it is contained but all classes declared after it. An example is provided in Figure 4.15.

 // Activate both put and get, generate tracing code, and set array size to 128 %pragma put, get, trace, array=128 class Example {   %pragma noput                // No put() method needed   unsigned int(10) length;   %pragma array=1024           // Switch array size to 1024   char(3) data[length];   %pragma array=128            // Switch array size back to 128   %pragma trace="Tracer.trace" // Use custom tracer } // The above settings are still active here!

Figure 4.15: Some examples of using pragma statements to set the translator options at specific locations.

In this example, we start off setting the generation of both get () and put () methods, enabling tracing and setting the maximum array size to 128 elements. Inside the Example class, we disable the put() method output. This class reads a chunk of data, which is preceded by its size (length, a 10-bit quantity). This means that the largest possible buffer size is 1024 elements. Hence for the data array that immediately follows, we set the array size to 1024, and then switch it back to the default of 128. Finally, at the end of the class we select a different tracing function name; this function is really a method of a class, but this is irrelevant for the translator. Since this directive is used when the get() method code is produced, it will affect the entire class despite the fact that it is declared at its end.

Note that these pragma settings remain in effect even after the end of the Example class.

4.4 Verbatim Code

In order to further facilitate integration of Flavor code with C++/Java user code, the translator supports the notion of verbatim code. Using special delimiters, code segments can be inserted in the Flavor source code, and copied verbatim at the correct places in the generated C++/Java file. This allows, for example, the declaration of constructors/destructors, user-specified methods, pointer member variables for C++, etc. Such verbatim code can appear wherever a Flavor statement or declaration is allowed.

The delimiters %{ and %} can be used to introduce code that should go to the class declaration itself (or the global scope). The delimiters %p{ and %p}, and %g{ and %g} can be used to place code at exactly the same position they appear in the put() and get() methods, respectively. Finally, the delimiters %*{ and %*} can be used to place code in both put() and get() methods. To place code specific to C++ or Java, .c or .j can be placed before the braces in the delimiters, respectively. For example, a verbatim code segment to be placed in the get() method of the Java code will be delimited with %g.j{ and %g.j}.

The Flavor package includes several samples on how to integrate user code with Flavor-generated code, including a simple GIF parser. Figure 4.16 shows a simple example that reads the header of a GIF87a file and prints its values. The print statement which prints the values of the various elements is inserted as verbatim code in the syntax (within %g{ and %g} markers, since the code should go in the get() method). The implementation of the print method for C++ code is declared within %.c{ and %.c}, and for Java, the corresponding implementation is defined within %.j{ and %.j}. The complete sample code can be found in the Flavor package.

 class GIF87a {   char(8) GIFsignature[6] = "GIF87a" // GIF signature   %g{ print(); %g}   ScreenDescriptor sd;               // A screen descriptor   // One or more image descriptors   do {     unsigned int(8) end;     if (end == ',') {      // We found an image descriptor       ImageDescriptor id;     }     if (end == '!') {      // We found an extension block       ExtensionBlock eb;     }     // Everything else is ignored   } while (end != ';');    // ';' is the end-of-data marker   %.c{   void print () {…}   %.c}   %.j{   void print () {…}   %.j }

Figure 4.16: A simple Flavor example— the GIF87a header. The usage of verbatim code is illustrated.

4.5 Tracing Code Generation

We also included the option to generate bitstream tracing code within the get ( ) method. This allows one to very quickly examine the contents of a bitstream for development and/or debugging purposes by creating a dump of the bitstream's content. With this option, and given the syntax of a bitstream described in Flavor, the translator will automatically generate a complete C++/Java program that can verify if a given bitstream complies with that syntax or not. This can be extremely useful for codec development as well as compliance testing.

4.6 Map Processing

Map processing is one of the most useful features of Flavor, as hand-coding VLC tables is tedious and error prone. Especially during the development phase of a representation format, when such tables are still under design, full optimization within each design iteration is usually not performed. By using the translator, such optimization is performed at zero cost. Also, note that maps can be used for fixed-length code mappings just by making all codewords have the same length. As a result, one can very easily switch between fixed and variable-length mappings when designing a new representation format.

When processing a map, the translator first checks that it is uniquely decodable, i.e., no codeword is the prefix of another. It then constructs a class declaration for that map, which exposes two methods: getvlc() and putvlc(). These take as arguments a bitstream reference as well as a pointer to the return type of the map. The getvlc() method is responsible for decoding a map entry and returning the decoded value, while the putvlc() method is responsible for the output of the correct codeword. Note that the defined class does not perform any direct bitstream I/O itself, but uses the services of the Bitstream class instead. This ensures that a user-supplied bitstream I/O library will be seamlessly used for map processing.

According to Fogg's survey on software and hardware VLC architectures [17], fast software decoding methods usually exploit a variable-size look-ahead window (multi-bit lookup) with lookup tables. For optimization, the look-ahead size and corresponding tables can be customized for position dependency in the bitstream. For example, for MPEG video, the look-ahead can be selected based on the picture type (I, P, or B).

One of the fastest decoding methods is comprised of one huge lookup table where every codeword represents an index of the table pointing to the corresponding value. However, this costs too much memory. On the other end of the spectrum, one of the most memory efficient algorithms would be of the form of a binary tree. The tree is traversed one bit at a time, and at each stage one examines if a leaf node is reached; if not, the left or right branch is taken for the input bit value of 0 or 1, respectively. Though efficient in memory, this algorithm is extremely slow, requiring N (the bit length of the longest codeword) stages of bitstream input and lookup.

In [16] we adopted a hybrid approach that maintains the space efficiency of binary tree decoding, and most of the speed associated with lookup tables. In particular, instead of using lookup tables, we use hierarchical, nested switch statements. Each time the read-ahead size is determined by the maximum of a fixed step size and the size of the next shortest code. The fixed step size is used to avoid degeneration of the algorithm into binary tree decoding. The benefit of this approach is that only complete matches require case statements, while all partial matches can be grouped into a single default statement (that, in turn, introduces another switch statement).

With the above-mentioned approach, the space requirement consists of storage of the case values and the comparison code generated by the compiler (this code consists of just 2 instructions on typical CISC systems, e.g., a Pentium). While slightly larger than a simple binary tree decoder, this overhead still grows linearly with the number of code entries (rather than exponentially with their length). This is further facilitated by the selection of the step size. When the incremental code size is small, multiple case statements may be assigned to the same codeword, thus increasing the space requirements.

In [16] we compared the performance of various techniques, including binary tree parsing, fixed step full lookups with different step sizes and our hybrid switch statement approach. In terms of time, our technique is faster than a hierarchical full-lookup approach with identical step sizes. This is because switching consumes little time compared to fixed-step's function lookups. Furthermore, it is optimized by ordering the case statements in terms of the length of their codeword. As shorter lengths correspond to higher probabilities, this minimizes the average number of comparisons per codeword.

With Flavor, developers can be assured of extremely fast decoding with minimal memory requirement due to the optimized code generation. In addition, the development effort and time in creating software to process VLCs is nearly eliminated.

4.7 XFlavor

Since Version 5.0, the Flavor translator is also referred to as XFlavor, an extended version that provides XML features. XFlavor has the capability to transform a Flavor description into an XML schema, and it can also produce code for generating XML documents corresponding to the bitstreams described by Flavor. Additionally, as a part of XFlavor, a compression tool (Bitstream Generator) for converting the XML representation of the data back into its original bitstream format is provided. Thus, XFlavor consists of the translator and Bitstream Generator.

The purpose of converting multimedia data in binary format into an equivalent XML document [18] is for easier and more flexible manipulation of the data. In the XML document, the bitstream layer is abstracted from applications and the semantic values of the data (e.g., width and height of an image) are directly available. In the bitstream format, such values must be extracted via bit string manipulations, according to the specified syntax.

Another advantage of using XML representation of data is that software tools are provided with generic access to multimedia data (usually with a generic XML parser). For example, a search engine that uses DC values to search for images/videos can work with all of JPEG, MPEG-1, 2, 4, and H.26x formats if XML is used to represent the data. However, if each individual binary format is used, then different search engines must be created for each format, as a different syntax is used to represent DC values by the different formats. This requires a different parser for each bitstream.

Additionally, as the name suggests, XML documents are extensible. With XML representation of data, extra information can be added to (or deleted from) the given document and a software tool can use both the new and old documents without changing the parser. On the contrary, with a GIF image, for example, adding extra information anywhere in the bitstream (other than at the end of the bitstream) renders the bitstream useless. Additionally, the extra information cannot be easily distinguished from the bitstream.

With the XML document generating code (the translator automatically generates a putxml() method for creating XML documents), a bitstream representing multimedia data can be read in and a corresponding XML document can be automatically generated. Unfortunately, XML documents are too verbose and unfavorable for storage or network transmission. For this reason, XFlavor also provides the feature for converting the documents back into their corresponding binary format. When processing a bitstream, it can be converted to an XML file (in whole or in part) or the XML representation can be stored in memory. Then, after processing the data, it can be converted back into its original bitstream format.

Also, in order to make the XML representation of the multimedia data strictly conforming to the original bitstream specification, the corresponding XML schema can be used. The schema [19] defines the syntax of the data and this is used as the guideline for determining the "validity" of XML represented data. Also, with the schema generated from the bitstream specification (Flavor description), the conforming XML documents can be exactly converted back into the original bitstream format. The converted data, then, can be processed in the bitstream domain (which yields faster processing) by the bitstream specification compliant decoders.

Figure 4.17 illustrates the functions and some of the possible applications of XFlavor. Starting with a Flavor description of a multimedia bitstream - Flavor File A - XFlavor can generate an XML schema or XML document generating code. Then, applications can use the generated schema to create new multimedia content in the XML form (e.g., Multimedia App A). Additionally, the bitstream representation of multimedia content can be converted into an XML document for more flexible and extensible processing of the data. Once the data is in the XML form, different applications can easily manipulate it (e.g., Multimedia App C and D) or XML tools can be used to manipulate the data (e.g., XSLT Processor). Also, for space and time critical applications, the XML representation of the data can be converted back into the original bitstream representation (e.g., Multimedia App B).

click to expand
Figure 4.17: An illustration of XML features offered by XFlavor that are used for different applications.

In the following, we describe some of the benefits obtained by using the XML features.

4.7.1 Universal Parser

XML representation allows easy access to different elements. It allows an application to modify and add/delete elements of an XML document, and yet allows different applications to still work with the modified document. In other words, even if the structure of the document is changed, the correct semantics of the original elements can still be obtained. Any application can add proprietary information into the document without ever worrying about breaking the semantics of the original data, and thus preserving the interoperability of the data. This is the benefit of a self-describing format.

Additionally, XML based applications offer benefits such as providing support for different bitstream syntax without reprogramming/recompiling. An application built for Syntax A can still work with a document with Syntax B as long as the required elements are still available in the document and the tags remain the same.

4.7.2 Generic Content Manipulation

With XML representation, all the XML related technologies and software tools can be directly used. One very important XML technology is XSLT [20]. XSLT is a language defined using XML for transforming XML documents. With XML alone, interoperability is only obtained among the applications that understand a certain set of predefined tags. However, with XSLT, any application can be enabled to use any XML document as long as the actual data in the document is useful to the application. This "universal interoperability" is achieved by transforming XML documents into different structures usable for different applications.

Using such technology, a transcoder can be easily created by simply declaring a set of rules to transform the source elements. The style sheet (containing XSLT rules) along with the source document can be fed into an XSLT processor to generate a new document with desired structure and information. A set of style sheets can also be used to manipulate the elements of a given document. Similarly, simple text manipulating tools such as Perl can also be used to easily manipulate the data in the XML form.

Such an idea has recently been proposed in MPEG-21 for easy manipulation of scalable content with the introduction of a bitstream syntax description language - BSDL. BSDL is derived from XML Schema and it is used to describe (in high-level) scalable content. Generic software tools are also provided so that, using the BSDL description, scalable content can be converted back and forth between binary and XML representations. Once the content is in the XML form, generic software can be used to parse and manipulate the content. The goal is, depending on the conditions related to the user, terminal, environment, etc., to easily provide modified content that satisfies the conditions.

Though, the software tools in BSDL seem to provide the same functionalities as that of XFlavor, there are some differences between them. Using BSDL, each one of the supported bitstream syntaxes must be hand-described. Due to the nature of the language, the description can get quite verbose and the process of the description can be error-prone and tedious. As a solution, XFlavor can be used to automatically generate XML schema (BSDL description) corresponding to the given Flavor bitstream description.

Additionally, BSDL uses generic software to convert bitstream descriptions into XML documents. The software is very slow, and for real-time applications, there is a need to provide content-specific (syntax-specific) software tools so that fast conversion can take place. For each bitstream syntax description, XFlavor provides code to generate corresponding XML document much faster than the generic software tool.

Finally, the goal of BSDL is to provide high-level description of scalable content and, as described in Section 2, it lacks certain facilities to describe the whole bitstream syntax in a concise manner.

4.7.3 Compression

With the very high usage of XML applications, it is very important to come up with an efficient representation of XML documents. Because an XML document is text based, it is not efficient in a compression sense, and thus, it is not preferable for storage or transmission. Additionally, textual representation of multimedia data requires more processing time than binary representation. To fix these problems, binary representation of the data can be created for the given XML representation.

MPEG-7 [21] has already standardized binary encoding of its XML documents (BiM [22]). For wireless applications, where resources are very scarce, a binary version of XML is being deployed (WBXML [23]). Alternatively, there are a number of XML compression tools such as XMill [24] and XMLZip [25]. Though these can be applied to the XML representation of multimedia data, they do not yield as efficient result as the original compressed bitstream.

XMill is a compression tool based on a novel compression algorithm derived from the "grouping strategy" [24]. This algorithm sometimes yields better compression result than conventional compression (gzip) on the original data; however, this is usually the case for text data. For multimedia data originally represented by a compressed bitstream, it is very difficult to reduce the corresponding XML document to a size smaller than the original data. XMLZip is also a compression tool developed by XML Solutions (now a part of Vitria Technology) for compressing XML documents while maintaining the XML structure. As expected, it yields worse compression than XMill, but it offers DOM API [26] in the compressed domain, which ultimately results in faster XML parsing and processing. BiM is a generic XML tool used for MPEG-7 description encoding/decoding. Similar to XMLZip, BiM provides XML APIs for legacy systems. It also provides additional features such as streaming of XML documents and fast skipping of elements.

Table 4.1 lists the compression effectiveness of XMill, XMLZip, BiM and XFlavor (Bitstream Generator). Two GIF87a images (File 1 and File 2) are first converted into corresponding XML documents (using XFlavor), and they are compressed using the XML compression tools. Millau [27] was also examined; however, the encoder and decoder were not available for testing at the time of this writing. As shown in the table, all three compressors (XMill, XMLZip and BiM) have CE greater than 1. XMill produces the best result among the three because its sole purpose is to compress the data, whereas XMLZip and BiM produce much bigger files because they add some useful features in the compressed domain. With XMill, the compressed XML data has to be decompressed in order to process the data. With XFlavor, however, the XML data can be converted to the original bitstream syntax and the resulting data can be processed using existing decoders that conform to the given bitstream specification. Additionally, XFlavor converts the data in XML format into the original (compressed) bitstream, and the size of the bitstream is always smaller than the same XML data compressed using XMill.

Table 4.1: Compression effectiveness (CE ) for different XML compressors. CE = Compressed XML file / Original (bitstream) file.
	File 1 (in Bytes)	CE	File 2 (in Bytes)	CE
*Original File Size*	*10,048*		*278,583*
*XML File Size*	*305,634*		*8,200,278*
XMill (gzip, CF = 6)	14,348	1.43	362,173	1.30
XMill (gzip, CF = 9)	14,311	1.42	357,335	1.28
XMill (bzip)	11,120	1.11	285,088	1.02
XMLZip	18,561	1.85	471,203	1.69
BiM	45,264	4.50	-	-
XFlavor	10,048	1.00	278,583	1.00

In summary, if storage or transmission is the main concern, then the original bitstream should be used. Upon the time of processing, the bitstream can be converted to an XML file. Additionally, it is beneficial to maintain the bitstream representation of the data allowing fast processing for the applications like Multimedia App B in Figure 4.17.