Previous | Table of Contents | Next |
The compiler has already performed register renaming and register coalescing, reduced the register pressure, and scheduled the instructions. It is time to assign registers to each temporary. The register allocator must satisfy the following constraints in order of importance:
Many compilers take a simplistic view of register allocation. They describe register allocation in terms of some algorithmic problemsuch as graph coloring or bin packingand then use some heuristic solution for that particular formulation. Such register allocators perform well on problems needing few registers; however, if the number of registers needed is significantly greater than the number of registers available, each of these register allocation methods generates a large number of spill instructions, namely, the loads and stores to memory generated by the register allocator.
The problem is that each of these allocation techniques uses one of the two types of information available. The graph-coloring allocators use the concept of interference or conflict graphs. The conflict graph has no concept of which instructions are near each other, so it performs poorly on blocks. The bin-packing register allocators perform well on blocks, but have to use an approximation to handle control flow. It is possible to create situations where one algorithm will work better than another. This approach to register allocation was chosen to expose the best attributes of each of the algorithms.
This compiler combines the two. Recall that the compiler has already inserted spilling instructions to reduce the register pressure to less than or equal to the number of registers available. The compiler will now use three distinct allocation algorithms to allocate registers:
By separating the assignment of local and global temporaries, the compiler introduces the possibility of a phase-ordering problem: The assignment of global temporaries may inhibit the assignment of local temporaries. This is unavoidable since the optimal allocation of registers is an NP-complete problem. The design is such that the particular choice of algorithms to use will avoid as much of the problem as is possible.
To illustrate the interplay between global and local register allocation, consider the pictorial representation of a block in Figure 13.1 The set of instructions where a temporary is live is represented by a horizontal distance. Each temporary is represented by a distinct row. The global register allocator will create situations such as R1, R2, and R4. R1 contains a value assigned in another block and used for the last time in this block. R2 is assigned a value in this block and used in another block. R4 combines the two: R4 is assigned a value in this block, control flow leaves the block, and returns to the block using the value earlier in the block. R3 is a typical local temporary. It is assigned a value in the block and used for the last time later in the block. In large procedures, this is most of the temporaries. R1, R2, and R4 are allocated by the global allocator. R2 and R4 are combined with other local temporaries by the FAT algorithm. R3 is allocated using the local allocator.
Figure 13.1 Pictorial Representation of Block
Figure 13.2 Driver for Register Allocation
Recall that all of these algorithms are approximations to an optimal allocation. An optimal allocation can be discovered by solving an integer programming problem; however, this technique is too expensive for a production compiler.
The main algorithm for register allocation is to perform each form of allocation in turn. The FAT algorithm and the local register allocator work together by the FAT algorithm creating data structures that are used by the local register allocator. The calling structure is shown in Figure 13.2. First, perform global register allocation and then apply the FAT algorithm and local register allocation algorithms on each block.
Previous | Table of Contents | Next |