All traditional C++ and Java literals are supported by Flavor. This includes integers, floating-point numbers and character constants (e.g., 'a'). Strings are also supported by Flavor. They are converted to arrays of characters, with or without a trailing '\0' (null character).
Additionally, Flavor defines a special binary number notation using the prefix 0b. Numbers represented with such notation are called binary literals (or bit strings) and, in addition to the actual value, also convey their length. For example, one can write 0b011 to denote the number 3 represented using 3 bits. For readability, a bit string can include periods every four digits, e.g., 0b0010.1100.001. Hexadecimal or octal constants used in the context of a bit string also convey their length in addition to their value. Whenever the length of a binary literal is irrelevant, it is treated as a regular integer literal.
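The idea that a bit string carries both a value and a length can be illustrated with a minimal Python sketch. This is not part of Flavor or its translator; the function name parse_bit_string is hypothetical and only mirrors the notation described above (0b prefix, optional periods as visual separators).

```python
def parse_bit_string(literal):
    """Return (value, length_in_bits) for a literal such as '0b0010.1100.001'."""
    if not literal.startswith("0b"):
        raise ValueError("expected a 0b prefix")
    digits = literal[2:].replace(".", "")  # periods are only visual separators
    if not digits or any(d not in "01" for d in digits):
        raise ValueError("expected binary digits")
    return int(digits, 2), len(digits)


print(parse_bit_string("0b011"))            # (3, 3)
print(parse_bit_string("0b0010.1100.001"))  # (353, 11)
```

Keeping the length next to the value is what distinguishes 0b011 (3 bits) from 0b0011 (4 bits), even though both denote the number 3.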
Both multi-line /**/ and single-line // comments are allowed. The multi-line comment delimiters cannot be nested.
Variable names follow the C++ and Java conventions (e.g., variable names cannot start with a number). The keywords that are used in C++ and Java are considered reserved in Flavor.
Flavor supports the common subset of C++ and Java built-in or fundamental types. This includes char, int, float and double along with all appropriate modifiers (short, long, signed and unsigned). Additionally, Flavor defines a new type called bit and a set of new modifiers, big and little. The type bit is used to accommodate bit string variables and the new modifiers are used to indicate the endianness of bytes. The big modifier is used to represent numbers using big-endian byte ordering (the most significant byte first) and the little modifier is used for numbers represented using little-endian byte ordering (the least significant byte first). By default, big-endian byte ordering is assumed. Note that endianness here refers to the bitstream representation, not the processor on which Flavor software may be running. The latter is irrelevant for the bitstream description.
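As a small illustration of the two byte orderings (written in Python rather than Flavor, with an arbitrary 16-bit value chosen for the example), the same number produces different byte sequences in the bitstream depending on the modifier:

```python
value = 0x0102  # an arbitrary 16-bit quantity used for illustration

big = value.to_bytes(2, byteorder="big")        # most significant byte first
little = value.to_bytes(2, byteorder="little")  # least significant byte first

assert big == b"\x01\x02"
assert little == b"\x02\x01"

# Reading the bytes back with the matching order recovers the same number,
# independently of the byte order of the machine running the code.
assert int.from_bytes(big, byteorder="big") == value
assert int.from_bytes(little, byteorder="little") == value
```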
Flavor also allows declaration of new types in the form of classes (refer to Section 3.3 for more information regarding classes). However, Flavor does not support pointers, references, casts or C++ operators related to pointers. Structures or enumerations are not supported either, since they are not supported by Java.
Regular variable declarations can be used in Flavor in the same way as in C++ and Java. As Flavor follows a declarative approach, constant variable declarations with specified values are allowed everywhere (there is no constructor to set the initial values). This means that the declaration 'const int a = 1;' is valid anywhere (not just in global scope). The two major differences are the declaration of parsable variables and arrays.
Parsable variables are the core of Flavor's design; it is the proper definition of these variables that defines the bitstream syntax. Parsable variables include a parse length specification immediately after their type declaration, as shown in Figure 4.2. In the figure, the blength argument can be an integer constant, a non-constant variable of a type compatible with int, or a map (discussed later on) with the same type as the variable. This means that the parse length of a variable can be controlled by another variable. For example, the parsable variable declaration in Figure 4.3(a) indicates that the variable a has a parse length of 3 bits. In addition to the parse length specification, parsable variables can also have the aligned modifier. This signifies that the variable begins at the next integer multiple boundary of the length argument (alength) specified within the alignment expression. If this length is omitted, an alignment size of 8 is assumed (byte boundary). Thus, the variable a in Figure 4.3(a) is byte-aligned: for parsing, any intermediate bits are ignored, while for output bitstream generation the bitstream is padded with zeroes.
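[aligned(alength)] type(blength) variable [=value];
Figure 4.2: Parsable variable declaration syntax.

(a) aligned int(3) a;
(b) aligned int(3)* a;
(c) aligned int(3) a=2;
Figure 4.3: Parsable variable declarations: (a) basic, (b) look-ahead, (c) with an expected value.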
As we will see later on, parsable variables cannot be assigned to. This ensures that the syntax is preserved regardless of whether we are performing an input or an output operation. However, parsable variables can be redeclared, as long as their type remains the same, only the parse size is changed, and the original declaration was not const. This allows one to select the parse size depending on the context (see Expressions and Statements, Section 3.2). On top of this, they obey special scoping rules as described in Section 3.4.
In general, the parse size expression must be a non-negative value. The special value 0 can be used when, depending on the bitstream context, a variable is not present in the bitstream but obtains a default value. In this case, no bits will be parsed or generated; however, the semantics of the declaration will be preserved. The variables of type float, double and long double are only allowed to have a parse size equal to the fixed size that their standard representation requires (32 and 64 bits).
In several instances, it is desirable to examine the immediately following bits in the bitstream without actually removing the bits from the input stream. To support this behavior, a '*' character can be placed after the parse size parentheses. Note that for bitstream output purposes, this has no effect. An example of a declaration of a variable for look-ahead parsing is given in Figure 4.3(b).
Very often, certain parsable variables in the syntax have to have specific values (markers, start codes, reserved bits, etc.). These are specified as initialization values for parsable variables. Figure 4.3(c) shows an example. The example is interpreted as: a is an integer represented with 3 bits, and must have the value 2. The keyword const may be prepended in the declaration, to indicate that the parsable variable will have this constant value and, as a result, cannot be redeclared.
As both parse size and initial value can be arbitrary expressions, we should note that the order of evaluation is parse expression first, followed by the initializing expression.
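The three declaration forms of Figure 4.3 can be mimicked on the parsing side with a short Python sketch. It is only an illustration under stated assumptions: the BitReader class and its method names are hypothetical (they are not the get() and put() methods generated by the Flavor translator), bits are read most significant bit first, and values are treated as unsigned for simplicity.

```python
class BitReader:
    """Hypothetical MSB-first bit reader over a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, in bits

    def peek(self, nbits):
        """Return the next nbits as an unsigned integer without consuming them."""
        value = 0
        for i in range(self.pos, self.pos + nbits):
            byte = self.data[i // 8]
            value = (value << 1) | ((byte >> (7 - i % 8)) & 1)
        return value

    def read(self, nbits):
        """Return the next nbits and advance past them."""
        value = self.peek(nbits)
        self.pos += nbits
        return value

    def align(self, boundary=8):
        """Skip to the next multiple of boundary bits (no-op if already aligned)."""
        self.pos += -self.pos % boundary

    def expect(self, nbits, expected):
        """Read nbits and fail if they do not hold the expected value."""
        value = self.read(nbits)
        if value != expected:
            raise ValueError(f"expected {expected}, found {value}")
        return value


r = BitReader(bytes([0b10100000, 0b01011111]))
first = r.read(1)    # consume one bit, leaving the reader unaligned
r.align()            # like 'aligned': skip the remaining 7 bits of the byte
ahead = r.peek(3)    # like 'aligned int(3)* a;': look ahead, consume nothing
a = r.expect(3, 2)   # like 'aligned int(3) a=2;': parse 3 bits, require 2
```

Because peek() does not move the position, the look-ahead and the subsequent read see the same three bits, which is the behavior described for the '*' notation.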
Arrays have special behavior in Flavor, due to its declarative nature but also due to the desire for very dynamic type declarations. For example, we want to be able to declare a parsable array with different array sizes depending on the context. In addition, we may need to load the elements of an array one at a time (this is needed when the retrieved value indicates indirectly if further elements of the array should be parsed). These concerns are only relevant for parsable variables. The array size, then, does not have to be a constant expression (as in C++ and Java), but it can be a variable as well. The example in Figure 4.4(a) is allowed in Flavor.
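(a) int a = 2;
    int(2) A[a++];
(b) int A[3] = 5;
(c) int a = 2;
    int(a++) A[a++] = a++;
(d) int(2) A[[3]] = 1;
    int(4) B[[2]][3];
Figure 4.4: Array declarations: (a) variable array size, (b) single-expression initializer, (c) per-element evaluation, (d) partial arrays.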
An interesting question is how to handle initialization of arrays, or parsable arrays with expected values. In addition to the usual brace expression initialization (e.g., 'int A [2] = {1, 2};'), Flavor also provides a mechanism that involves the specification of a single expression as the initializer, as shown in Figure 4.4(b). This means that all elements of A will be initialized with the value 5. In order to provide more powerful semantics to array initialization, Flavor considers the parse size and initializer expressions as executed once for each element of the array. The array size expression, however, is only executed once, before the parse size expression or the initializer expression.
Let's look at a more complicated example in Figure 4.4(c). Here A is declared as an array of 2 integers. The first one is parsed with 3 bits and is expected to have the value 4, while the second is parsed with 5 bits and is expected to have the value 6. After the declaration, a is left with the value 7. This probably represents the largest deviation of Flavor's design from C++ and Java declarations. On the other hand it does provide significant flexibility in constructing sophisticated declarations in a very compact form, and it is also in line with the dynamic nature of variable declarations that Flavor provides.
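The evaluation order can be traced with a small Python simulation. This is a sketch, not translator output: the Counter class and declare_array function are hypothetical, and the simulation assumes that a holds 2 when the declaration is reached, which is the starting value that reproduces the numbers quoted above.

```python
class Counter:
    """Stands in for the variable a; post_inc() mimics the expression a++."""

    def __init__(self, value):
        self.value = value

    def post_inc(self):
        old = self.value
        self.value += 1
        return old


def declare_array(a):
    """Mimic 'int(a++) A[a++] = a++;' and return (parse_size, expected) pairs."""
    size = a.post_inc()              # array size: evaluated once, first
    elements = []
    for _ in range(size):
        parse_size = a.post_inc()    # parse size: per element, evaluated first
        expected = a.post_inc()      # initializer: per element, evaluated second
        elements.append((parse_size, expected))
    return elements


a = Counter(2)
layout = declare_array(a)
print(layout)   # [(3, 4), (5, 6)]
print(a.value)  # 7
```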
An additional refinement of array declaration is partial arrays. These are declarations of parsable arrays in which only a subset of the array needs to be declared (or, equivalently, parsed from or written to a bitstream). Flavor introduces a double bracket notation for this purpose. In Figure 4.4(d), an example is given to demonstrate its use. In the first line, we are declaring the fourth element of A (array indices start from 0). The array size is unknown at this point, but of course it will be considered at least 4. In the second line, we are declaring a two-dimensional array, and in particular only its third row (assuming the first index corresponds to a row). The array indices can, of course, be expressions themselves. Partial arrays can only appear on the left-hand side of a declaration and are not allowed in expressions.
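One way to picture a partial array is as a store that records only the elements declared so far, together with the smallest size the array must have. The following Python sketch is purely illustrative; the PartialArray class is hypothetical and says nothing about how the Flavor translator actually represents such arrays.

```python
class PartialArray:
    """Hypothetical store for elements declared one at a time."""

    def __init__(self):
        self.elements = {}  # index -> value, only for declared elements
        self.min_size = 0   # the array is known to have at least this many

    def declare(self, index, value):
        self.elements[index] = value
        self.min_size = max(self.min_size, index + 1)


A = PartialArray()
A.declare(3, 1)     # like 'int(2) A[[3]] = 1;'
print(A.min_size)   # 4
```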
Flavor supports all of the C++ and Java arithmetic, logical and assignment operators. However, parsable variables cannot be used as lvalues. This ensures that they always represent the bitstream's content, and allows consistent operation of the translator-generated get() and put() methods that read and write, respectively, data according to the specified form. Refer to Section 4.1 for detailed information about these methods.
Flavor also supports all the familiar flow control statements: if-else, do-while, while, for and switch. In contrast to C++ and Java, variable declarations are not allowed within the arguments of these statements (i.e., 'for (int i=0; ; );' is not allowed). This is because in C++ the scope of this variable will be the enclosing one, while in Java it will be the enclosed one. To avoid confusion, we opted for the exclusion of both alternatives at the expense of a slightly more verbose notation. Scoping rules are discussed in detail in Section 3.4. Similarly, Java only allows Boolean expressions as part of the flow control statements, and statements like 'if (1) { ... }' are not allowed in Java. Thus, only the flow control statements with Boolean expressions are valid in Flavor.
Figure 4.5 shows an example of the use of these flow control statements. The variable b is declared with a parse size of 16 if a is equal to 1, and with a parse size of 24 otherwise. Observe that this construct would not be meaningful in C++ or Java as the two declarations would be considered as being in separate scopes. This is the reason why parsable variables need to obey slightly different scoping rules than regular variables. The way to approach this to avoid confusion is to consider that Flavor is designed so that these parsable variables are properly defined at the right time and position. All the rest of the code is there to ensure that this is the case. We can consider the parsable variable declarations as "actions" that our system will perform at the specified times. This difference, then, in the scoping rules becomes a very natural one.
if (a == 1) {
  int(16) b; // b is a 16 bit integer
} else {
  int(24) b; // b is a 24 bit integer
}
Figure 4.5: Conditional declaration of a parsable variable.
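Viewing the declarations as "actions", the parsing side of Figure 4.5 could look like the following Python sketch. It is an illustration only: the read_bits and parse functions are hypothetical, and since the figure does not show how a is declared, the sketch assumes it was parsed as an 8-bit value just before b.

```python
def read_bits(data, pos, nbits):
    """Return (value, new_pos) for nbits read MSB-first starting at bit pos."""
    total = int.from_bytes(data, "big")
    shift = len(data) * 8 - pos - nbits
    if shift < 0:
        raise ValueError("not enough bits")
    return (total >> shift) & ((1 << nbits) - 1), pos + nbits


def parse(data):
    pos = 0
    a, pos = read_bits(data, pos, 8)   # assumption: a occupies 8 bits
    nbits = 16 if a == 1 else 24       # the two branches of the if/else
    b, pos = read_bits(data, pos, nbits)
    return {"a": a, "b": b}, pos


fields, used = parse(bytes([1, 0x12, 0x34]))
print(fields, used)  # {'a': 1, 'b': 4660} 24
```

In both branches the result is the same member b; only the number of bits consumed differs, which is why the two declarations are not treated as belonging to separate scopes.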
Flavor uses the notion of classes in exactly the same way as C++ and Java do. It is the fundamental structure in which object data are organized. Keeping in line with the support of both C++ and Java-style programming, classes in Flavor cannot be nested, and only single inheritance is supported. In addition, due to the declarative nature of Flavor, methods are not allowed (this includes constructors and destructors as well).
Figure 4.6(a) shows an example of a simple class declaration with just two parsable member variables. The trailing ';' character is optional accommodating both C++ and Java-style class declarations. This class defines objects that contain two parsable variables. They will be present in the bitstream in the same order they are declared. After this class is defined, we can declare objects of this type as shown in Figure 4.6(b).
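(a) class SimpleClass {
      int(3) a;
      unsigned int(4) b;
    }; // The trailing ';' is optional
(b) SimpleClass a;
(c) class SimpleClass(int i[2]) {
      int(3) a = i[0];
      unsigned int(4) b = i[1];
    };
(d) int(2) v[2];
    SimpleClass a(v);
Figure 4.6: Class declarations: (a) a simple class, (b) an object of that class, (c) a class with parameter types, (d) an object declared with actual arguments.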
A class is considered parsable if it contains at least one variable that is parsable. The aligned modifier can precede the declaration of parsable class variables in the same way as it does for parsable variables.
Class member variables in Flavor do not require access modifiers (public, protected, private). In essence, all such variables are considered public.
As Flavor classes cannot have constructors, it is necessary to have a mechanism to pass external information to a class. This is accomplished using parameter types. These act the same way as formal arguments in function or method declarations do. They are placed in parentheses after the name of the class. Figure 4.6(c) gives an example of a simple class declaration with parameter types. When declaring variables of parameter type classes, it is required that the actual arguments are provided in place of the formal ones as displayed in Figure 4.6(d).
Of course the types of the formal and actual parameters must match. For arrays, only their dimensions are relevant; their actual sizes are not significant as they can be dynamically varying. Note that class types are allowed in parameter declarations as well.
As we mentioned earlier, Flavor supports single inheritance so that compatibility with Java is maintained. Although Java can "simulate" multiple inheritance through the use of interfaces, Flavor has no such facility (it would be meaningless since methods do not exist in Flavor). However, for media representation purposes, we have not found any instance where multiple inheritance would be required, or even be desirable. It is interesting to note that almost none of the existing representation standards are truly object-based. The only exception, to our knowledge, is the MPEG-4 specification, which explicitly addresses the representation of audio-visual objects. It is, of course, possible to describe existing structures in an object-oriented way, but the description does not truly map one-to-one to the notion of objects. For example, the MPEG-2 Video slices can be considered as separate objects of the same type, but of course their semantic interpretation (horizontal stripes of macroblocks) is not very useful. Note that containment formats like MP4 and QuickTime are more object-oriented, as they are composed of object-oriented structures called 'atoms.'
Derivation in C++ and Java is accomplished using different syntax (':' versus extends). Here we opted for the Java notation (':' is also used for object identifier declarations, as explained below). Unfortunately, it was not possible to satisfy both.
In Figure 4.7(a) we show a simple example of a derived class declaration. Derivation from a bitstream representation point of view means that B is an A with some additional information. In other words, the behavior would be almost identical if we just copied the statements between the braces in the declaration of A into the beginning of B. We say "almost" here because scoping rules of variable declarations also come into play, as discussed in Section 3.4.
(a) class A {
      int(2) a;
    }
    class B extends A {
      int(3) b;
    }
(b) class A:int(1) id=0 {
      int(2) a;
    }
    class B extends A:int(1) id=1 {
      int(3) b;
    }
Figure 4.7: Derived class declarations: (a) without object identifiers, (b) with object identifiers.
Note that if a class is derived from a parsable class, it is also considered parsable.
The concept of inheritance in object-oriented programming derives its power from its capability to implement polymorphism. In other words, the capability to use a derived object in place of the base class is expected. Although the mere structural organization is useful as well, it could be accomplished equally well with containment (a variable of type A is the first member of B).
Polymorphism in traditional programming languages is made possible via vtable structures, which allow the resolution of operations during run-time. Such behavior is not pertinent for Flavor, as methods are not allowed.
A more fundamental issue, however, is that Flavor describes the bitstream syntax: the information with which the system can detect which object to select must be present in the bitstream. As a result, traditional inheritance as defined in the previous section does not allow the representation of polymorphic objects. Considering Figure 4.7(a), there is no way to figure out by reading a bitstream if we should read an object of type A or type B.
Flavor solves this problem by introducing the concept of object identifiers or IDs. The concept is rather simple: in order to detect which object we should parse/generate, there must be a parsable variable that will identify it. This variable must have a different expected value for any class derived from the originating base class, so that object resolution can be uniquely performed in a well-defined way (this can be checked by the translator). As a result, object ID values must be constant expressions and they are always considered constant, i.e., they cannot be redeclared within the class.
In order to signify the importance of the ID variables, they are declared immediately after the class name (including any derivation declaration) and before the class body. They are separated from the class name declaration using a colon (':'). We could rewrite the example of Figure 4.7(a) with IDs as shown in Figure 4.7(b). Upon reading the bitstream, if the next 1 bit has the value 0 an object of type A will be parsed; if the value is 1 then an object of type B will be parsed. For output purposes, and as will be discussed in Section 4, it is up to the user to set up the right object type in preparation for output.
The name and the type of the ID variable are irrelevant, and can be anything that the user chooses. It cannot, however, be an array or a class variable (only built-in types are allowed). Also, the name, type and parse size must be identical between the base and derived classes. However, object identifiers are not required for all derived classes of a base class that has a declared ID. In this case, only the derived classes with defined IDs can be used wherever the base class can appear. This type of polymorphism is already used in the MPEG-4 Systems specification, and in particular the Binary Format for Scenes (BIFS) [15]. This is a VRML-derived set of nodes that represent objects and operations on them, thus forming a hierarchical description of a scene.
The ID of a class can also have a range of possible values, which is specified as start_id .. end_id, inclusive of both bounds. See Figure 4.8 for an example.
class slice:aligned bit(32) slice_start_code=0x00000101 .. 0x000001AF {
  ...
}
Figure 4.8: A class with an ID range.
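Object resolution by ID, including ID ranges, can be sketched in Python as a lookup over inclusive ranges. This is an illustration under assumptions: the tables and the function names select_class and ranges_overlap are hypothetical, and the overlap test only suggests the kind of uniqueness check that a translator could perform.

```python
# Hypothetical registries: class name -> inclusive (low, high) range of IDs.
FIG_4_7B = {"A": (0, 0), "B": (1, 1)}
FIG_4_8 = {"slice": (0x00000101, 0x000001AF)}


def select_class(id_value, ranges):
    """Return the single class whose ID range contains id_value."""
    matches = [name for name, (lo, hi) in ranges.items() if lo <= id_value <= hi]
    if len(matches) != 1:
        raise ValueError(f"ID {id_value} does not identify exactly one class")
    return matches[0]


def ranges_overlap(ranges):
    """True if any two classes could claim the same ID value."""
    spans = sorted(ranges.values())
    # With spans sorted by lower bound, any overlap shows up between neighbors.
    return any(prev_hi >= lo for (_, prev_hi), (lo, _) in zip(spans, spans[1:]))


print(select_class(1, FIG_4_7B))          # B
print(select_class(0x00000120, FIG_4_8))  # slice
print(ranges_overlap(FIG_4_7B))           # False
```

When parsing, the ID value would first be obtained by looking ahead in the bitstream; the selected class then determines which declarations are executed.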
The scoping rules that Flavor uses are identical to those of C++ and Java, with the exception of parsable variables. As in C++ and Java, a new scope is introduced with curly braces ({}). Since Flavor does not have functions or methods, a scope can either be the global one or a scope within a class declaration. Note that the global scope cannot contain any parsable variable, since it does not belong to any object. As a result, global variables can only be constants.
Within a class, all parsable variables are considered as class member variables, regardless of the scope they are encountered in. This is essential in order to allow conditional declarations of variables, which will almost always require that the actual declarations occur within compound statements (see Figure 4.5). Non-parsable variables that occur in the top-most class scope are also considered class member variables. The rest live within their individual scopes.
This distinction is important in order to understand which variables are accessible to a class variable that is contained in another class. The issues are illustrated in Figure 4.9. Looking at the class A, the initial declaration of i occurs in the top-most class scope; as a result i is a class member. The variable a is declared as a parsable variable, and hence it is automatically a class member variable. The declaration of j occurs in the scope enclosed by the if statement; as this is not the top-level scope, j is not a class member. The following declaration of i is acceptable; the original one is hidden within that scope. Finally, the declaration of the variable a as a non-parsable variable would hide the parsable version. As parsable variables do not obey scoping rules, this is not allowed (hiding parsable variables of a base class, however, is allowed). Looking now at the declaration of the class B, which contains a variable of type A, it becomes clear which variables are available as class members.
class A {
  int i = 1;
  int(2) a;
  if (a == 2) {
    int j = i;
    int i = 2; // Hides i, OK
    int a;     // Hides a, error
  }
}

class B {
  A a;
  a.j = 1;         // Error, j not a class member
  int j = a.a + 1; // OK
  j = a.i + 2;     // OK
  int(3) b;
}
Figure 4.9: Scoping rules example.
In summary, the scoping rules have the following two special considerations. Parsable variables do not obey scoping rules and are always considered class members. Non-parsable variables obey the standard scoping rules and are considered class members only if they are at the top-level scope of the class.
Note that parameter type variables are considered as having the top-level scope of the class. Also, they are not allowed to hide the object identifier, if any.
Up to now, we have only considered fixed-length representations, either constant or parametric. A wide variety of representation schemes, however, rely heavily on entropy coding, and in particular Huffman codes [1]. These are variable-length codes (VLCs), which are uniquely decodable (no codeword is the prefix of another). Flavor provides extensive support for variable-length coding through the use of maps. These are declarations of tables in which the correspondence between codewords and values is described.
Figure 4.10 gives a simple example of a map declaration. The map keyword indicates the declaration of a map named A. The declaration also indicates that the map converts from bit string values to values of type int. The type indication can be a fundamental type, a class type, or an array. Map declarations can only occur in global scope. As a result, an array declaration will have to have a constant size (no non-constant variables are visible at this level). After the map is properly declared, we can define parsable variables that use it by indicating the name of the map where we would put the parse size expression as follows: 'int (A) i;'. As we can see the use of variable-length codes is essentially identical to fixed-length variables. All the details are hidden away in the map declaration.
map A(int) {
  0b0,  1,
  0b01, 2
}
Figure 4.10: A simple map declaration.
The map contains a series of entries. Each entry starts with a bit string that declares the codeword of the entry, followed by the value to be assigned to this codeword. If a complex type is used for the mapped value, then the values have to be enclosed in curly braces. Figure 4.11 shows the definition of a VLC table with a user-defined class as output type. The type of the variable has to be identical to the type returned from the map. For example, using the declaration 'YUVblocks(blocks_per_component) chroma_format;' we can access a particular value of the map using the construct chroma_format.Ublocks.
// The output type of a map is defined in a class
class YUVblocks {
  unsigned int Yblocks;
  unsigned int Ublocks;
  unsigned int Vblocks;
}

/* A table that relates the chroma format with
 * the number of blocks per signal component
 */
map blocks_per_component(YUVblocks) {
  0b00, {4,1,1}, // 4:2:0
  0b01, {4,2,2}, // 4:2:2
  0b10, {4,4,4}  // 4:4:4
}
Figure 4.11: A map with a user-defined class as output type.
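A map lookup of this kind can be sketched in Python as a dictionary from codewords to values. The sketch is illustrative only: the decode function and the representation of the bitstream as a string of '0' and '1' characters are assumptions made for readability, not the optimized code that the translator generates.

```python
from collections import namedtuple

YUVblocks = namedtuple("YUVblocks", ["Yblocks", "Ublocks", "Vblocks"])

# Codewords are kept as strings of '0'/'1' so that their length is preserved.
BLOCKS_PER_COMPONENT = {
    "00": YUVblocks(4, 1, 1),  # 4:2:0
    "01": YUVblocks(4, 2, 2),  # 4:2:2
    "10": YUVblocks(4, 4, 4),  # 4:4:4
}


def decode(bits, table):
    """Consume bits from the front of a '0'/'1' string until a codeword matches."""
    max_len = max(len(code) for code in table)
    for n in range(1, min(max_len, len(bits)) + 1):
        if bits[:n] in table:
            return table[bits[:n]], bits[n:]
    raise ValueError("no codeword matches")


chroma_format, rest = decode("0111", BLOCKS_PER_COMPONENT)
print(chroma_format.Ublocks)  # 2
print(rest)                   # 11
```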
As Huffman codeword lengths tend to get very large when their number increases, it is typical to specify "escape codes," signifying that the actual value will be subsequently represented using a fixed-length code. To accommodate these as well as more sophisticated constructs, Flavor allows the use of parsable type indications in map values. This means that, using the example in Figure 4.10, we can write the example in Figure 4.12. This indicates that, when the bit string 0b001 is encountered in the bitstream, the actual return value for the map will be obtained by parsing 5 more bits. The parse size for the extension can itself be a map, thus allowing the cascading of maps in sophisticated ways. Although this facility is efficient when parsing, the bitstream generation operation can be costly when complex map structures are designed this way. None of today's specifications that we are aware of require anything beyond a single escape code.
map A(int) {
  0b0,   1,
  0b01,  2,
  0b001, int(5)
}
Figure 4.12: A map with an escape code.
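The escape mechanism can be illustrated with a Python sketch in which one table entry, instead of holding a value, triggers a fixed-length read. Everything here is an assumption made for illustration: the codewords 1, 01 and 001 are a prefix-free set chosen for this sketch, the names TABLE, ESCAPE and decode_with_escape are hypothetical, and the 5-bit extension is read as an unsigned number.

```python
ESCAPE = object()  # marker standing in for an 'int(5)' map entry

# Assumed prefix-free codewords, chosen for this sketch only.
TABLE = {"1": 1, "01": 2, "001": ESCAPE}
ESCAPE_BITS = 5


def decode_with_escape(bits):
    """Decode one value from a '0'/'1' string; return (value, remaining_bits)."""
    for n in range(1, len(bits) + 1):
        entry = TABLE.get(bits[:n])
        if entry is None:
            continue
        rest = bits[n:]
        if entry is ESCAPE:
            # The codeword only announces that a fixed-length value follows.
            if len(rest) < ESCAPE_BITS:
                raise ValueError("truncated escape value")
            return int(rest[:ESCAPE_BITS], 2), rest[ESCAPE_BITS:]
        return entry, rest
    raise ValueError("no codeword matches")


print(decode_with_escape("01"))          # (2, '')
print(decode_with_escape("0011011011"))  # (22, '11')
```

Decoding is a single pass, whereas an encoder must first decide whether a value has its own codeword or needs the escape, which hints at why generation can be the costlier direction.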
The translator can check that the VLC table is uniquely decodable, and also generate optimized code for extremely fast encoding/decoding using our hybrid approach as described in [16].
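A check of the prefix condition is simple to express. The following Python sketch shows one possible approach, not the method used by the Flavor translator or the hybrid approach of [16]; the function name is_prefix_free and the codeword sets are illustrative.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of (or equal to) another."""
    ordered = sorted(codewords)
    # After sorting, a codeword that is a prefix of another is immediately
    # followed by a word that starts with it, so adjacent pairs suffice.
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


print(is_prefix_free(["00", "01", "10"]))  # True
print(is_prefix_free(["1", "01", "001"]))  # True
print(is_prefix_free(["0", "01"]))         # False: 0 is a prefix of 01
```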
