Alexandros Eleftheriadis and Danny Hong
Department of Electrical Engineering
New York, New York, USA
Flavor, which stands for Formal Language for Audio-Visual Object Representation, originated from the need to simplify and speed up the development of software that processes coded audio-visual or general multimedia information. This includes encoders and decoders as well as applications that manipulate such information, for example editing tools, synthetic content creation tools, and multimedia indexing and search engines. Such information is invariably encoded in a highly efficient form to minimize the cost of storage and transmission. This source coding operation is almost always performed in a bitstream-oriented fashion: the data to be represented is converted to a sequence of binary values of arbitrary (and typically variable) lengths, according to a specified syntax. The syntax itself can have various degrees of sophistication. One of the simplest forms is the GIF87a format, which consists essentially of two headers and blocks of coded image data compressed with the Lempel-Ziv-Welch (LZW) algorithm. Much more complex formats include JPEG, MPEG-1, MPEG-2 [5, 6] and MPEG-4 [7, 8], among others.
General-purpose programming languages such as C++ and Java do not provide native facilities for coping with such data. Developers of software codecs or applications need to build their own facilities, which involves two components. First, they need to develop software that deals with the bitstream-oriented nature of the data, as general-purpose microprocessors are strictly byte-oriented. Second, they need to implement parsing and generation code that complies with the syntax of the format at hand (be it proprietary or standard). These two tasks account for a significant part of the overall development effort. They also have to be duplicated by everyone who requires access to a particular compressed representation within their application. Finally, they can account for a substantial percentage of the overall execution time of the application.
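The first of these two components can be illustrated with a minimal sketch of a bit-level reader layered on a byte buffer. The class name `BitReader` and its methods are hypothetical, chosen for illustration only; they are not the Flavor run-time API. The sketch assumes that fields are stored most significant bit first and are at most 32 bits long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the Flavor run-time library): reads
// fields of arbitrary bit length, most significant bit first, from a
// byte-oriented buffer.
class BitReader {
public:
    explicit BitReader(std::vector<std::uint8_t> data)
        : data_(std::move(data)) {}

    // Returns the next `count` bits (0 <= count <= 32) as an unsigned value.
    std::uint32_t read(unsigned count) {
        assert(count <= 32);
        assert(bitPos_ + count <= data_.size() * 8);  // enough input left
        std::uint32_t value = 0;
        for (unsigned i = 0; i < count; ++i) {
            const std::size_t byteIndex = bitPos_ / 8;
            const unsigned shift = 7 - static_cast<unsigned>(bitPos_ % 8);
            // Append the current bit to the low end of the result.
            value = (value << 1) | ((data_[byteIndex] >> shift) & 1u);
            ++bitPos_;
        }
        return value;
    }

    // Number of bits consumed so far.
    std::size_t bitPosition() const { return bitPos_; }

private:
    std::vector<std::uint8_t> data_;
    std::size_t bitPos_ = 0;
};
```

A field may start and end anywhere within a byte, so a 6-bit read followed by another 6-bit read crosses a byte boundary without any special handling by the caller. Every codec that parses a bitstream by hand needs some variant of this code, which is the duplication of effort described above.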
Flavor addresses these problems in an integrated way. First, it allows the "formal" description of the bitstream syntax. Formal here means that the description is based on a well-defined grammar and, as a result, is amenable to manipulation by software tools. In the past, such descriptions relied on ad hoc conventions involving tabular data or pseudo-code. A second and key aspect of Flavor's architecture is that this description has been designed as an extension of C++ and Java, both object-oriented languages heavily used in multimedia application development. This ensures seamless integration of Flavor code with both C++ and Java code and with the overall architecture of an application.
Flavor was designed as an object-oriented language, anticipating an audio-visual world composed of audio-visual objects, both synthetic and natural, and combining this view with well-established paradigms for software design and implementation. Its object-oriented facilities go beyond the mere duplication of C++ and Java features, and introduce several new concepts that are pertinent to bitstream-based media representation.
To validate the expressive power of the language, several existing bitstream formats have already been described in Flavor, including sophisticated structures such as MPEG-2 Systems, Video and Audio. A translator has also been developed for translating Flavor code to C++ or Java code. Since Version 5.0, the translator has been enhanced to support XML features. With the enhanced translator, Flavor code can also be used to generate a corresponding XML schema. In addition, the generated C++ or Java code can include a method for producing XML documents that represent the bitstreams described by the Flavor code. A detailed description of the translator and its features is given in Section 4 of this chapter.
Emerging multimedia representation techniques can use Flavor directly to represent the bitstream syntax in their specifications. This will allow immediate use of such specifications in new or existing applications, since the code to access or generate conforming data can be generated directly from the specification at zero cost. In addition, such code can be automatically optimized; this is particularly important for operations such as Huffman encoding and decoding, a very common tool in media representation.
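To make concrete what decoding a variable-length (Huffman-style) code involves, the following is a minimal, unoptimized sketch that matches one bit at a time against a code table. The four-entry table and the names `CodeTable`, `exampleTable` and `decodeSymbols` are assumptions made for illustration; they come neither from any standard nor from the code the Flavor translator generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Maps (codeword length in bits, codeword value) to a decoded symbol.
using CodeTable = std::map<std::pair<unsigned, std::uint32_t>, char>;

// Illustrative prefix code (an assumption, not taken from any standard):
//   0 -> 'a', 10 -> 'b', 110 -> 'c', 111 -> 'd'
CodeTable exampleTable() {
    return {
        {{1, 0}, 'a'},  // binary 0
        {{2, 2}, 'b'},  // binary 10
        {{3, 6}, 'c'},  // binary 110
        {{3, 7}, 'd'},  // binary 111
    };
}

// Decodes `symbolCount` symbols from `data`, reading bits MSB first.
std::string decodeSymbols(const std::vector<std::uint8_t>& data,
                          std::size_t symbolCount,
                          const CodeTable& table) {
    std::string out;
    std::size_t bitPos = 0;
    while (out.size() < symbolCount) {
        std::uint32_t code = 0;
        unsigned length = 0;
        for (;;) {
            assert(bitPos < data.size() * 8);  // ran out of input
            assert(length < 32);               // no codeword matched
            const unsigned bit =
                (data[bitPos / 8] >> (7 - bitPos % 8)) & 1u;
            ++bitPos;
            code = (code << 1) | bit;
            ++length;
            // A prefix code guarantees that the first match is the symbol.
            const auto it = table.find({length, code});
            if (it != table.end()) {
                out.push_back(it->second);
                break;
            }
        }
    }
    return out;
}
```

This bit-by-bit search is the simplest correct approach, but it performs one table lookup per input bit. Optimized decoders typically consume several bits per step through precomputed lookup tables, which is the kind of transformation that is tedious to write by hand and well suited to automatic generation.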
In the following, we first present a brief background of the language in terms of its history and technical approach. We then describe each of its features, including declarations and constants, expressions and statements, classes, scoping rules and maps. We also describe the translator and its simple run-time API, and briefly explain how the translator processes maps to generate entropy encoding and decoding programs. Then, the XML features offered by the translator are explained. Finally, we conclude with an overview of the benefits of using the Flavor approach for media representation. More detailed information and publicly available software can be found on the Flavor web site at: http://flavor.sourceforge.net.
Note that Flavor is an open source project released under the Flavor Artistic License, as defined on the Flavor web site. As a result, the source code for the translator and the run-time library is available as part of the Flavor package. The complete package can be downloaded from http://www.sourceforge.net/projects/flavor.