For about seven years my family and I lived in the California Sierras. During that time I developed a passion for rustic mountain living as well as most environmental associations related to the mountains and the Old West: snow, historic mining towns, coaches and wagons, treasure hunting, narrow gauge railroads, wildland fire fighting, and other miscellaneous rural benefits. Now that I have milled that Old West imagery into your mind "ore" bored you to death, you can continue reading what I fondly refer to as the mangling of bits. This is one of my favorite sections, because with these operations I can typically devise speedier methods for manipulating data. I label this "thinking out of the box," sometimes referred to as "Thar's gold in them thar bits!"
Bit mangling refers to the twiddling of individual bits using Boolean logic such as NOT, AND, OR, XOR, or some combination thereof. Each bit is isolated, so no bit affects any adjacent bit in the same register. In vector mathematical operations, groups of bits are isolated so that an operation on one vector group does not affect another. Boolean operations are similar but work on an individual bit basis: each group is a group of one bit, so an operation on a single bit does not affect an adjacent bit. This is why there are no vector-type Boolean operations. There are, however, Boolean operations that use 32/64-bit general-purpose registers, 64-bit MMX registers, and 128-bit SSE2 registers so as to manipulate more bits in parallel.
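To make the "group of one bit" idea concrete, here is a minimal C sketch, not code from the workbench files. The names `bool_op`, `bitwise_one_at_a_time`, and `bitwise_parallel` are hypothetical, invented for this illustration. The first function grinds through a 32-bit value one bit at a time; the second lets a single Boolean operator handle all 32 bits at once. Because no bit influences its neighbor, the two should always agree.

```c
#include <stdint.h>

/* Hypothetical selector for the Boolean operation to apply. */
typedef enum { OP_AND, OP_OR, OP_XOR } bool_op;

/* Slow path: isolate each bit, combine the pair, and put the result back.
   Bit i of the result depends only on bit i of a and bit i of b. */
static uint32_t bitwise_one_at_a_time(uint32_t a, uint32_t b, bool_op op)
{
    uint32_t result = 0;
    for (int i = 0; i < 32; i++) {
        uint32_t abit = (a >> i) & 1u;
        uint32_t bbit = (b >> i) & 1u;
        uint32_t r;
        switch (op) {
        case OP_AND: r = abit & bbit; break;
        case OP_OR:  r = abit | bbit; break;
        default:     r = abit ^ bbit; break;   /* OP_XOR */
        }
        result |= r << i;
    }
    return result;
}

/* Fast path: one operator processes all 32 bit pairs in parallel. */
static uint32_t bitwise_parallel(uint32_t a, uint32_t b, bool_op op)
{
    switch (op) {
    case OP_AND: return a & b;
    case OP_OR:  return a | b;
    default:     return a ^ b;                 /* OP_XOR */
    }
}
```

For example, with the inputs 0xC and 0xA, both functions should return 0x8 for AND, 0xE for OR, and 0x6 for XOR. Wider registers simply extend the same idea to more bit pairs per operation.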
In a manner of speaking, all processors that support Boolean operations on pairs of bits have a degree of SIMD support.
Boolean operations are heavily used by algorithms utilizing vectors, which is why they are in this book. Included in this chapter are the electronic symbols for each logical operation. Typically, I use my own books for reference, and from time to time I have found that drawing logic circuits using digital logic symbols actually makes the more complex Boolean logic algorithms easier for me to simplify. Maybe it will work the same for you.
Any processor professing to contain a multimedia, SIMD, packed, parallel, or vector instruction set will contain almost all of the following instructions in one form or another. Parallel instructions typically do not have a Carry flag as found in some of the older scalar-based instruction sets using general-purpose registers, such as the 80x86. Instead, they tend to lose overflows through the shifting out of bits, wraparound of data, or saturation. Another item to note is that not all of the diagrams shown apply to every one of the 80x86 processors covered. Over the years the functionality has been enhanced, so older processors will not have the same abilities as the newer processors.
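The difference between wraparound and saturation is easier to see with numbers. The following is a scalar C sketch that only emulates the behavior under my own assumptions; it is not a SIMD instruction and not code from the workbench files. The names `add_bytes_wrap` and `add_bytes_saturate` are hypothetical. Four unsigned 8-bit lanes are packed into one 32-bit word, and in both functions the carry out of a lane is never reported anywhere, since there is no Carry flag to catch it.

```c
#include <stdint.h>

/* Add four unsigned 8-bit lanes; each lane wraps modulo 256.
   The carry out of a lane is simply discarded. */
static uint32_t add_bytes_wrap(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        int shift = lane * 8;
        uint32_t sum = ((a >> shift) & 0xFFu) + ((b >> shift) & 0xFFu);
        result |= (sum & 0xFFu) << shift;
    }
    return result;
}

/* Add four unsigned 8-bit lanes; each lane clamps at 255 on overflow. */
static uint32_t add_bytes_saturate(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        int shift = lane * 8;
        uint32_t sum = ((a >> shift) & 0xFFu) + ((b >> shift) & 0xFFu);
        if (sum > 0xFFu)
            sum = 0xFFu;                /* the overflow is lost to the clamp */
        result |= sum << shift;
    }
    return result;
}
```

Adding 0x01FF80F0 and 0x01017F20 lane by lane should give 0x0200FF10 with wraparound and 0x02FFFFFF with saturation. In neither case does a lane's overflow spill into its neighbor.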
It bears repeating: watch the alignment of your data objects in memory very closely. It takes extra overhead to adjust misaligned data into an aligned state, and it is a lot more efficient to ensure that the data is aligned in the first place. Your code will be smaller and faster! This will be made obvious by the sample code included in this chapter.
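As a minimal sketch of getting alignment right in the first place, the C11 fragment below declares a 16-byte-aligned block of four floats and provides a check for any address. This is an illustration under my own assumptions rather than code from the workbench files: the names `vec4` and `is_aligned` are hypothetical, the 16-byte figure is chosen to match a 128-bit register, and `is_aligned` assumes the alignment is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-float vector type, forced onto a 16-byte boundary
   so it can be loaded into a 128-bit register without adjustment. */
typedef struct {
    _Alignas(16) float v[4];
} vec4;

/* Returns nonzero if p sits on the given boundary.
   Assumes alignment is a power of two (such as 8 or 16). */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

A `vec4` declared as a local or global variable should always pass `is_aligned(&q, 16)`, while an address one byte past it should fail. Checking once in a debug build is far cheaper than fixing up misaligned data on every access.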
You will find in this chapter and throughout this book sections titled "Pseudo Vec." As processors are enhanced, new superset functionality is given to them such as SIMD operations. Some of you, however, are still programming for older processors and need the newer functionality, while some of you require a more in-depth understanding of vector operations. The "Pseudo Vec" sections are for you!
Workbench Files: \Bench\x86\chap04\ project \ platform