Bit Manipulation | Agile Javaв„ў: Crafting Code with Test-Driven Development

Java supports bit-level operations on integers. Among myriad other uses, you might use bit manipulation for mathematical calculations, for performance (some mathematical operations may be faster using bit shifting), in encryption, or for working with a compact collection of flags.

Binary Numbers

Your computer ultimately stores all numbers internally as a series of bits. A bit is represented by a binary, or base 2, number. A binary number is either 0 or 1. The following table on top of page 381 counts in binary from 0 through 15.

The table also shows the hexadecimal ("hex") representation of each number. Java does not allow you to use binary literals, so the easiest way to understand bit operations is to represent numbers with hexadecimal (base 16) literals. Each hex digit in a hex literal represents four binary digits.

Java Binary Representation

Java represents an int as 32 binary digits. The leading bit indicates whether the int is positive (0) or negative (1). Java stores positive integers as pure bi naryeach binary digit adds into a positive sum. Negative integers are stored in two's complement form. To obtain the two's complement representation of a number, take its positive binary representation, invert all binary digits, and add 1.

Decimal	Binary	Hex	Decimal	Binary	Hex
0	0	0x0	8	1000	0x8
1	1	0x1	9	1001	0x9
2	10	0x2	10	1010	0xA
3	11	0x3	11	1011	0xB
4	100	0x4	12	1100	0xC
5	101	0x5	13	1101	0xD
6	110	0x6	14	1110	0xE
7	111	0x7	15	1111	0xF

For example, the number 17 is represented in two's complement as:

 0000_0000_0000_0000_0000_0000_0001_0001^[3]

^[3] I have added underscores for readability.

The number -17 is represented in two's complement as:

 1111_1111_1111_1111_1111_1111_1110_1111

Logical Bit Operations

Java contains four logical operators that allow you to manipulate bits: bit-and, bit-or, negation, and bit-xor (exclusive-or). A truth table defines how each logical bit operator works, by specifying the results for all possible bit pairings.

The bit-and, bit-or, and bit-xor logical operators operate on two integers; they are therefore referred to as binary operators. The negation operator operates on a single integer and is thus a unary operator.

To execute a binary logical operation, Java compares all corresponding bits from each of the two numbers involved. When doing a bitwise-and operation on two 32-bit integers, then, this means that 32 individual bit-ands take place. To execute a unary logical operation against an integer, Java negates each bit in the integer individually.

You can perform logical bit operations against two integers of type int or smaller. However, Java internally translates integers declared as smaller than intbyte, short, and charto type int before the bit operation takes place.

The bit-and operator (&) is also known as bitwise multiplication. If both bits are 1, bitwise multiplication returns a 1; otherwise it returns a 0. The following TDTT demonstrates bit-and.

 assertEquals(0, 0 & 0); assertEquals(0, 0 & 1); assertEquals(0, 1 & 0); assertEquals(1, 1 & 1);

The bit-or operator (|) is also known as bitwise addition. If both bits are 0, bitwise addition returns a 0; otherwise it returns a 1. The following TDTT demonstrates bit-or.

 assertEquals(0, 0 | 0); assertEquals(1, 0 | 1); assertEquals(1, 1 | 0); assertEquals(1, 1 | 1);

The xor (exclusive-or) operator (^) is also known as bitwise difference. If both bits are the same, bitwise difference returns 0; otherwise it returns a 1. The following TDTT demonstrates xor.

 assertEquals(0, 0 ^ 0); assertEquals(1, 0 ^ 1); assertEquals(1, 1 ^ 0); assertEquals(0, 1 ^ 1);

The logical negation operator (~) flips all bits in the integer so that 0s become 1s and 1s become 0s.

 int x = 0x7FFFFFF1;           //0111_1111_1111_1111_1111_1111_1111_0001 assertEquals(0x8000000E, ~x); //1000_0000_0000_0000_0000_0000_0000_1110

Java supports the use of compound assignment with logical bit operators. Thus, x &= 1 is equivalent to x = x & 1.

Using Bit-And, Bit-Or and Negation

If you have a series of related flags (boolean values) to represent, you can use an integer variable to compactly represent them. A less-efficient alternative would be to define each as a separate boolean variable or as an array of booleans. Each bit in the integer represents a different flag. Thus, you could represent eight flags in a single byte. For example, the binary value 00000001 would mean that the first flag is on and all other flags are off. The binary value 00000101 would mean that the first and third flags are on and all other flags are off.

To set the values of each flag, you first define a mask for each binary position. The mask for the first position is 00000001 (decimal 1), the mask for the second position is 00000010 (decimal 2), and so on. Setting a value is done using bit-or, and extracting a value is done using bit-and.

You need to be able to set four yes-no flags on each student: Does the student reside on campus, is the student tax-exempt, is the student a minor, and is the student a troublemaker? You have 200,000 students and memory is at a premium.

 public void testFlags() {    Student student = new Student("a");    student.set(       Student.Flag.ON_CAMPUS,       Student.Flag.TAX_EXEMPT,       Student.Flag.MINOR);    assertTrue(student.isOn(Student.Flag.ON_CAMPUS));    assertTrue(student.isOn(Student.Flag.TAX_EXEMPT));    assertTrue(student.isOn(Student.Flag.MINOR));    assertFalse(student.isOff(Student.Flag.ON_CAMPUS));    assertTrue(student.isOff(Student.Flag.TROUBLEMAKER));    student.unset(Student.Flag.ON_CAMPUS);    assertTrue(student.isOff(Student.Flag.ON_CAMPUS));    assertTrue(student.isOn(Student.Flag.TAX_EXEMPT));    assertTrue(student.isOn(Student.Flag.MINOR)); }

You should normally prefer explicit query methods for each flag (isOnCampus, isTroublemaker, etc.). The above approach may be preferable if you have a large number of flags and more dynamic needs.

The relevant code in Student follows.

 public enum Flag {    ON_CAMPUS(1),    TAX_EXEMPT(2),    MINOR(4),    TROUBLEMAKER(8);    private int mask;    Flag(int mask) {       this.mask = mask;    } } private int settings = 0x0; ... public void set(Flag... flags) {    for (Flag flag: flags)       settings |= flag.mask; } public void unset(Flag... flags) {    for (Flag flag: flags)       settings &= ~flag.mask; } public boolean isOn(Flag flag) {    return (settings & flag.mask) == flag.mask; } public boolean isOff(Flag flag) {    return !isOn(flag); }

Each flag is an enum constant, defined by the Flag enum located in Student. Each enum instantiation passes an integer representing a mask to the constructor of Flag. You store the flags in the int instance variable settings. You explicitly initialize this variable to all 0s to demonstrate intent.

The set method takes a variable number of Flag enum objects. It loops through the array and applies the Flag's mask to the settings variable by using a bit-or operator.

The unset method loops through Flag enum objects and bit-ands the negation of the Flag's mask to the settings variable.

A demonstration of how the bit-or operation sets the correct bit:

`Flag.MINOR`	`0100`
`settings`	`0000`
`settings = settings \| Flag.MINOR`	`0100`
`Flag.ON_CAMPUS`	`0001`
`settings \| Flag.ON_CAMPUS`	`0101`

The isOn method bit-ands a Flag's mask with the settings variable.

`settings`	`0000`
`Flag.MINOR`	`0100`
`settings & Flag.MINOR`	`0000`
`settings & Flag.MINOR == Flag.MINOR`	`false`
`settings = settings \| Flag.MINOR`	`0100`
`settings & Flag.MINOR`	`0100`
`settings & Flag.MINOR == Flag.MINOR`	`TRue`

Finally, a demonstration of the use of negation in the unset method:

`settings`	`0101`
`~Flag.MINOR`	`1011`
`settings & ~Flag.MINOR`	`0001`

Using bit operations to store multiple flags is a classic technique that comes from a need to squeeze the most possible information into memory. It is not normally recommended or needed for most Java development, since an array or other collection of boolean values provides a clearer, simpler representation.

Using Xor

The xor operator has the unique capability of being reversible.

 int x = 5;            // 101 int y = 7;            // 111 int xPrime = x ^ y;   // 010 assertEquals(2, xPrime); assertEquals(x, xPrime ^ y);

You can also use xor as the basis for parity checking. Transmitting data introduces the possibility of individual bits getting corrupted. A parity check involves transmitting additional information that acts like a checksum. You calculate this checksum by applying an xor against all data sent. The receiver executes the same algorithm against the data and checksum. If checksums do not match, the sender knows to retransmit the data.

The parity check is binary: A stream of data has either even parity or odd parity. If the number of 1s in the data is even, parity is even. If the number of 1s in the data is odd, the parity is odd. You can use xor to calculate this parity. Here is a simple stand-alone test:

 public void testParity() {    assertEquals(0, xorAll(0, 1, 0, 1));    assertEquals(1, xorAll(0, 1, 1, 1)); } private int xorAll(int first, int... rest) {    int parity = first;    for (int num: rest)       parity ^= num;    return parity; }

Xoring an even number of 1s will always result in an even parity (0). Xoring an odd number of 1s will always result in an odd parity (1).

The reason this works: Xoring is the same as adding two numbers, then taking modulus of dividing by 2. Here is an extension of the truth table that demonstrates this:

`0 ^ 0 = 0`	`(0 + 0) % 2 = 0`
`0 ^ 1 = 1`	`(0 + 1) % 2 = 1`
`1 ^ 0 = 1`	`(1 + 0) % 2 = 1`
`1 ^ 1 = 1`	`(1 + 1) % 2 = 0`

Dividing mod 2 tells you whether a number is odd (1) or even (0). When you add a string of binary digits together, only the 1 digits will contribute to a sum. Thus, taking this sum modulus 2 will tell you whether the number of 1 digits is odd or even.

Take this one level further to calculate a checksum for any integer. The job of the ParityChecker class is to calculate a checksum for a byte array of data. The test shows how corrupting a single byte within a datum results in a different checksum.

 package sis.util; import junit.framework.*; public class ParityCheckerTest extends TestCase {    public void testSingleByte() {       ParityChecker checker = new ParityChecker();       byte source1  = 10;  // 1010       byte source2  = 13;  // 1101       byte source3  =  2;  // 0010       byte[] data = new byte[] {source1, source2, source3 };       byte checksum =  5;  // 0101       assertEquals(checksum, checker.checksum(data));       // corrupt the source2 element       data[1] = 14;        // 1110       assertFalse(checksum == checker.checksum(data));    } }

ParityChecker loops through all bytes of data, xoring each byte with a cumulative checksum. In the test, I lined up the binary translation of each decimal number (source1, source2, and source3). This allows you to see how the checksum of 5 is a result of xoring the bits in each column.

 package sis.util; public class ParityChecker {    public byte checksum(byte[] bytes) {       byte checksum = bytes[0];       for (int i = 1; i < bytes.length; i++)          checksum ^= bytes[i];       return checksum;    } }

A simple xor parity check will not catch all possible errors. More-complex schemes involve adding a parity bit to each byte transmitted in addition to the a parity byte at the end of all bytes transmitted. This provides a matrix scheme that can also pinpoint the bytes in error.

Bit Shifting

Java provides three operators for shifting bits either left or right.

To shift bits left or right, and maintain the sign bit, use the bit shift left (<<) or bit shift right (>>) operators.

A bit shift left moves each bit one position to the left. Leftmost bits are lost. Rightmost bits are filled with zeroes.

For example, shifting the bit pattern 1011 one position to the left in a 4-bit unsigned number would result in 0110. Some annotated Java examples:

                            //    101 = 5 assertEquals(10, 5 << 1);  //   1010 = 10 assertEquals(20, 5 << 2);  //  10100 = 20 assertEquals(40, 5 << 3);  // 101000 = 40 assertEquals(20, 40 >> 1);   assertEquals(10, 40 >> 2);                              // ... 1111 0110 = -10 assertEquals(-20, -10 << 1); // ... 1110 1100 = -20 assertEquals(-5, -10 >> 1);  // ... 1111 1011 = -5

You'll note that bit shifting left is the equivalent of multiplying by powers of 2. Bit shifting right is the equivalent of dividing by powers of 2.

The unsigned bit shift right shifts all bits irrespective of the sign bit. Rightmost bits are lost, and the leftmost bits (including the sign bit) are filled with 0.

 assertEquals(5, 10 >>> 1); assertEquals(2147483643, -10 >>> 1);    // 1111_1111_1111_1111_1111_1111_1111_0110 = -10    // 0111_1111_1111_1111_1111_1111_1111_1011 = 2147483643

Note that an unsigned bit shift right always results in a positive number.

Bit shifting has uses in cryptography and in graphical image manipulation. You can also use bit shifting for some mathematical operations, such as dividing or multiplying by powers of 2. Only do so if you need extremely fast performance, and then only if your performance tests show you that the math is the performance problem.

BitSet

The Java API library includes the class java.util.BitSet. It encapsulates a vector of bits and grows as needed. BitSet objects are mutableits individual bits may be set or unset. You can bit-and, bit-or, or bit-xor one BitSet object to another. You can also negate a BitSet. One of its few benefits is that it supports bit operations for numbers larger than the capacity of an int.