11.7 Packed Arrays of Bit Strings

Although it is far more efficient to create arrays whose elements have an integral number of bytes, it is quite possible to create arrays of elements whose size is not a multiple of 8 bits. The drawback is that calculating the "address" of an array element and manipulating that array element involves a lot of extra work. In this section we'll take a look at a few examples of packing and unpacking array elements in an array whose elements are an arbitrary number of bits long.

Before proceeding, it's probably worthwhile to discuss why you would want to bother with arrays of bit objects. The answer is simple: space. If an object only consumes 3 bits, you can get 2.67 times as many elements into the same space if you pack the data rather than allocating a whole byte for each object. For very large arrays, this can be a substantial savings. Of course, the cost of this space savings is speed: You've got to execute extra instructions to pack and unpack the data, thus slowing down access to the data.

The calculation for locating the bit offset of an array element in a large block of bits is almost identical to the standard array access; it is

 Element_Address_in_bits = Base_address_in_bits + index * element_size_in_bits

Once you calculate the element's address in bits, you need to convert it to a byte address (because we have to use byte addresses when accessing memory) and extract the specified element. Because the base address of an array element (almost) always starts on a byte boundary, we can use the following equations to simplify this task:

 Byte_of_1st_bit = Base_Address + (index * element_size_in_bits )/8 Offset_to_1st_bit = (index * element_size_in_bits) % 8 (note "%" = MOD)

For example, suppose we have an array of 200 3-bit objects that we declare as follows:

 static      AO3Bobjects: byte[ (200*3)/8 + 1 ]; // "+1" handles trucation.

The constant expression in the dimension above reserves space for enough bytes to hold 600 bits (200 elements, each 3 bits long). As the comment notes, the expression adds an extra byte at the end to ensure we don't lose any odd bits (that won't happen in this example because 600 is evenly divisible by 8, but in general you can't count on this; one extra byte usually won't hurt things).

Now suppose you want to access the ith 3-bit element of this array. You can extract these bits by using the following code:

 // Extract the ith group of three bits in AO3Bobjects and leave this value // in EAX.           sub( ecx, ecx );      // Put i/8 remainder here.           mov( i, eax );        // Get the index into the array.           shrd( 3, eax, ecx );  // EAX/8 -> EAX and EAX mod 8 -> ECX (H.O. bits)           shr( 3, eax );        // Remember, shrd above doesn't modify eax.           rol( 3, ecx );        // Put remainder into L.O. three bits of ECX.           // Okay, fetch the word containing the three bits we want to extract.           // We have to fetch a word because the last bit or two could wind up           // crossing the byte boundary (i.e., bit offset six and seven in the           // byte).           mov( AO3Bobjecs[eax], eax );           shr( cl, eax );       // Move bits down to bit zero.           and( %111, eax );     // Remove the other bits.

Inserting an element into the array is a bit more difficult. In addition to computing the base address and bit offset of the array element, you've also got to create a mask to clear out the bits in the destination where you're going to insert the new data. The following code inserts the L.O. 3 bits of EAX into the ith element of the AO3Bobjects array.

 // Insert the L.O. three bits of AX into the ith element of AO3Bobjects: readonly      Masks: word[8] :=               [                    !%0000_0111, !%0000_1110, !%0001_1100, !%0011_1000,                    !%0111_0000, !%1110_0000, !%1_1100_0000, !%11_1000_0000               ];                    .                    .                    .           sub( ecx, ecx );      // Put remainder here.           mov( i, ebx );        // Get the index into the array.           shrd( 3, ebx, ecx );  // i/8 -> EBX, i % 8 -> ECX.           shr( 3, ebx );           rol( 3, ecx );           and( %111, ax );                 // Clear unneeded bits from AX.           mov( Masks[ecx], dx );           // Mask to clear out our array element.           and( AO3Bobjects[ ebx ], dx );   // Grab the bits and clear those                                            // we're inserting.           shl( cl, ax );        // Put our three bits in their proper location.           or( ax, dx );         // Merge bits into destination.           mov( dx, AO3Bobjects[ ebx ] );  // Store back into memory.

Notice the use of a look-up table to generate the masks needed to clear out the appropriate position in the array. Each element of this array contains all ones except for three zeros in the position we need to clear for a given bit offset (note the use of the "!" operator to invert the constants in the table).