The 80x86 processor is the most complicated of all the processors, owing to its long life span and the constant upgrades and enhancements it has received since its introduction to the marketplace on August 12, 1981. If the 1983 Charlie Chaplin promoter of the IBM personal computer with its 8088 were to see it now, he would be proud and astonished at all the architectural changes that have occurred to it over the years.
For a processor to survive, it must either be enhanced to meet the demands of technology, find a second life in an embedded marketplace, or die. Intel and AMD have done just that (not the die part), but unfortunately the technology forked in the process, so there are now a multitude of flavors of the original 80x86 processor core in existence. In addition, AMD has merged the technologies of the 3DNow! extensions and SSE(2) to form the 3DNow! Professional instruction set.
The point is that there are now several 80x86 SIMD feature sets, and not every processor has them all. So the first step is to find out which ones a given processor supports. Intel initially addressed this by introducing an instruction with the mnemonic CPUID, along with a set of sample processor detection code. AMD adopted the same instruction and published its own sample code. As the technology forked further, each company's sample CPUID code emphasized its own processors, so programmers have had to merge the two companies' code to some degree, although AMD's was the more diverse. To complicate matters further, AMD put out 3DNow! Professional. This is a hybrid of all the 80x86 instruction sets of the AMD and Intel technologies, except SSE3 (Streaming SIMD Extensions 3), at least at the time this book was written. Because of the confusion this causes, one goal of this book is to make the situation easier to understand.
The CPUID instruction is explained in Chapter 16. It is a very complicated instruction, but this book wraps it in a function call that fills in a structure and builds an ASCII string describing the capabilities of the computer. This function, or something similar to it, should be used to decide whether a certain set of instructions is usable on a particular computer. Since you are most likely still learning this material, you are probably testing the code on only one or two computers and already know their processor types. Even so, to be sure the correct instructions run on the correct machine, most of the test applications include CPUID testing logic that selects the appropriate set of code. If you wish to learn more about this, please skip ahead to Chapter 16.
void CpuDetect(CpuInfo * const pInfo);
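To give a feel for what such a wrapper does, here is a minimal sketch. It is not the book's CpuDetect: the names DemoCpuInfo and DemoCpuDetect and the field layout are hypothetical, and the code assumes a GCC or Clang toolchain, whose <cpuid.h> header provides __get_cpuid(). On other targets it simply reports no features. The 3DNow! bits are only meaningful on AMD processors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>            /* GCC/Clang header providing __get_cpuid() */
#define DEMO_HAVE_CPUID 1
#else
#define DEMO_HAVE_CPUID 0
#endif

/* Hypothetical stand-in for the book's CpuInfo; field names are assumptions. */
typedef struct DemoCpuInfo {
    bool mmx;
    bool sse;
    bool sse2;
    bool sse3;
    bool amd3dnow;
    bool amd3dnowExt;
} DemoCpuInfo;

/* Fills *pInfo with the SIMD feature flags reported by CPUID. */
void DemoCpuDetect(DemoCpuInfo * const pInfo)
{
    assert(pInfo != NULL);
    *pInfo = (DemoCpuInfo){0};          /* start with every feature cleared */

#if DEMO_HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;

    /* Standard leaf 1: feature bits are returned in EDX and ECX. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        pInfo->mmx  = ((edx >> 23) & 1u) != 0;
        pInfo->sse  = ((edx >> 25) & 1u) != 0;
        pInfo->sse2 = ((edx >> 26) & 1u) != 0;
        pInfo->sse3 = ((ecx >>  0) & 1u) != 0;
    }

    /* Extended leaf 0x80000001: AMD 3DNow! bits are returned in EDX. */
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx)) {
        pInfo->amd3dnowExt = ((edx >> 30) & 1u) != 0;
        pInfo->amd3dnow    = ((edx >> 31) & 1u) != 0;
    }
#endif
}
```

A caller would declare a DemoCpuInfo, pass its address to DemoCpuDetect(), and then test the individual flags before choosing a code path.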
Briefly, the CPU detection code checks the processor type and its capabilities and sets flags accordingly. The initialization function then attaches function pointers to the code that is compatible with that processor type. After that, the application simply calls a generic function pointer, which routes the call to the appropriate code.
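The function-pointer routing just described can be sketched as follows. This is an illustration under stated assumptions rather than the book's initialization code: DemoCpuFlags, DemoVecAdd, and DemoInit are hypothetical names, and the SSE path is compiled only when the compiler defines __SSE__.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>        /* SSE intrinsics */
#endif

/* Hypothetical flags, as a detection routine might fill them in. */
typedef struct DemoCpuFlags {
    bool sse;
} DemoCpuFlags;

typedef void (*DemoVecAddFn)(float *dst, const float *a,
                             const float *b, size_t n);

/* Portable fallback: one element at a time. */
static void DemoVecAdd_Generic(float *dst, const float *a,
                               const float *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}

#if defined(__SSE__)
/* SSE version: four floats per iteration, scalar loop for the remainder. */
static void DemoVecAdd_SSE(float *dst, const float *a,
                           const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
#endif

/* The generic pointer the application calls. */
DemoVecAddFn DemoVecAdd = DemoVecAdd_Generic;

/* Attaches the pointer to the best routine the processor supports. */
void DemoInit(const DemoCpuFlags *pFlags)
{
    assert(pFlags != NULL);
    DemoVecAdd = DemoVecAdd_Generic;
#if defined(__SSE__)
    if (pFlags->sse)
        DemoVecAdd = DemoVecAdd_SSE;
#endif
}
```

After DemoInit() has run once at startup, the application calls DemoVecAdd(dst, a, b, n) everywhere and never tests the processor type again.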
When you write your code, try to use SSE instructions whenever possible, for scalar as well as vector processing. Where the reduced accuracy is acceptable, use the instructions that perform quick estimations, as they are designed for higher speed at the cost of precision. That way your code should continue to perform well, even on newer machines released after you ship it.
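As a small illustration of a quick-estimation instruction, the sketch below uses the SSE scalar reciprocal estimate (RCPSS) through the _mm_rcp_ss intrinsic. The function name DemoRecipEstimate is hypothetical, and a plain division is substituted when the compiler does not define __SSE__. The estimate carries roughly 12 bits of precision, so it suits work where a small relative error is tolerable.

```c
#if defined(__SSE__)
#include <xmmintrin.h>        /* _mm_set_ss, _mm_rcp_ss, _mm_cvtss_f32 */
#endif

/* Returns an approximation of 1/x.
   With SSE this is the fast, lower-accuracy RCPSS estimate;
   without SSE it falls back to an exact division. */
float DemoRecipEstimate(float x)
{
#if defined(__SSE__)
    return _mm_cvtss_f32(_mm_rcp_ss(_mm_set_ss(x)));
#else
    return 1.0f / x;
#endif
}
```

Multiplying by DemoRecipEstimate(x) in place of dividing by x is the typical use. Whether the lost accuracy is acceptable depends on the application.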
The 80x86 processor has a dual mode with respect to its MMX and FPU registers, because the MMX registers share the same physical registers as the FPU. Whenever code needs to switch back and forth between the two, the appropriate instruction must be executed. In addition, there is a difference between the AMD instruction FEMMS and the Intel instruction EMMS. (These will be discussed in Chapter 8.) When writing code, favor the SSE instructions, as the (F)EMMS instructions are needed only when switching between MMX and FPU.
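The switch between MMX and floating-point work can be sketched with compiler intrinsics. This is an illustrative example, not code from the book: the function name DemoSumThenAverage is hypothetical, _mm_empty() is the intrinsic that emits EMMS, and a scalar loop is substituted when the compiler does not define __MMX__.

```c
#include <stdint.h>
#include <string.h>

#if defined(__MMX__)
#include <mmintrin.h>         /* _mm_add_pi16, _mm_empty */
#endif

/* Adds four 16-bit lanes (MMX when available), stores the sums in out[],
   and returns their floating-point average. */
float DemoSumThenAverage(const int16_t a[4], const int16_t b[4],
                         int16_t out[4])
{
#if defined(__MMX__)
    __m64 va, vb, vsum;
    memcpy(&va, a, sizeof va);          /* load four 16-bit values */
    memcpy(&vb, b, sizeof vb);
    vsum = _mm_add_pi16(va, vb);        /* packed 16-bit add */
    memcpy(out, &vsum, sizeof vsum);
    _mm_empty();                        /* EMMS: leave MMX state before FP code */
#else
    for (int i = 0; i < 4; ++i)
        out[i] = (int16_t)(a[i] + b[i]);
#endif
    /* Floating-point work after the MMX section. */
    return (float)(out[0] + out[1] + out[2] + out[3]) / 4.0f;
}
```

Code written entirely with SSE instructions needs no such call, which is one reason to prefer them.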