Chapter 19: System Programming in Windows | The Assembly Programming Master Book

Most of this chapter concentrates on memory management in Windows. Understanding this material requires some background knowledge of the protected mode of Intel microprocessors, so I provide the basic information on this topic here. More detailed information about the protected mode can be found in [1, 3, 6, 8, 12]. The material in this chapter will also be needed in Chapter 27, which covers kernel-mode drivers.

Page and Segment Addressing

I'll start describing page and segment addressing with a brief historical overview. The Intel ^[i] family originates from the Intel 8086 microprocessor. Currently, the seventh generation of this family is widely used. From the programmer's point of view, every new generation differed from the previous one mainly in extensions to the instruction set. However, two stages in this evolution played exceedingly important roles in the development of Intel-based computers: the 80286 microprocessor (protected mode) and the 80386 microprocessor (page addressing).

Before the arrival of the 80286 chip, microprocessors operated in the so-called real mode of addressing. Programs used logical addresses, which consisted of two 16-bit components: segment and offset. The segment address could be stored in one of the four segment registers CS, DS, SS, or ES. The offset was stored in one of the index registers DI, SI, BX, BP, or SP. ^[i] When accessing memory, the logical address had to be converted: the segment address, shifted 4 bits to the left, was added to the offset. As a result, a 20-bit address was obtained, which could span 1 MB of memory ^[ii] (to be more precise, the sum itself can reach 10FFEFh, just under 1088 KB, but the 20 address lines of the 8086 limit the usable range to 1 MB). MS-DOS was initially designed to work in this address space. The resulting 20-bit address was called linear, and it coincided with the physical address of the memory cell. Fig. 19.1 illustrates this mechanism of converting a logical address to a physical address.

Figure 19.1: Scheme of converting a logical address to a linear address in real addressing mode
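
The real-mode conversion can be sketched in a few lines of Python. This is only an illustrative model, not code from the book: the function name `real_mode_linear` and the `address_lines` parameter are assumptions introduced here. The wrap-around to 20 bits models the 20 address lines of the 8086.

```python
def real_mode_linear(segment: int, offset: int, address_lines: int = 20) -> int:
    """Convert a real-mode segment:offset pair to a linear address.

    Hypothetical helper for illustration; address_lines models how many
    address lines the processor has (20 on the 8086).
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    raw = (segment << 4) + offset            # shift left by 4 bits = multiply by 16
    return raw & ((1 << address_lines) - 1)  # keep only the available address bits


# Two different logical addresses can name the same memory cell.
print(hex(real_mode_linear(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_linear(0x1235, 0x0000)))  # 0x12350
```

Because the segment is only shifted by 4 bits, segments overlap every 16 bytes, which is why many segment:offset pairs map to one linear address.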

Naturally, in the evolution of operating systems, this was a dead end. It was necessary to at least provide the possibility of extending the memory. It would be ideal not only to extend the memory but also to give all address spaces equal rights. Furthermore, in real mode, the entire memory was available to any running application. Any error (or malicious intent) of the programmer could freeze or even crash the entire system. Introduction of the so-called protected mode provided a way out of this situation.

The elegance of this approach was that at first glance nothing changed. As before, the logical address was formed using segment registers and registers storing the offset. However, segment registers now stored a so-called selector instead of the segment address. Part of this selector (13 bits) represented an index in a table called the descriptor table. The index pointed to the descriptor that stored full information about the segment. The size of the descriptor was sufficient for addressing considerably larger memory blocks.

Fig. 19.2 shows a scheme of converting the logical address into the linear address. Here, a 32-bit microprocessor is taken as a basis instead of the 16-bit processor considered earlier. The descriptor table, or base address table, could be of two types: global (GDT) or local (LDT). The type of the table depended on bit 2 of the contents of the segment register (the third bit, counting from the least significant one). The GDT register (GDTR) specified the position of the GDT and its size. It was assumed that the contents of this register must not change after it was loaded. The GDT has to store descriptors of segments taken by the operating system. The address of the LDT was stored in the LDT register (LDTR). It was also assumed that there might be several LDTs, one per running task. Thus, multitasking support was planned at the microprocessor level. The size of the GDTR is 48 bits, with 32 bits for the address of the GDT and 16 bits for its size.

Figure 19.2: Scheme of converting a logical address to a linear address in protected addressing mode
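
The following sketch models the selector fields and the table lookup described above. It is a deliberately simplified illustration: the names `Selector`, `decode_selector`, and `logical_to_linear` are hypothetical, and each descriptor is reduced to a `(base, limit)` pair, whereas a real descriptor is a packed 8-byte structure. The two lowest selector bits (the requested privilege level) are not discussed in the text; they are included here based on the Intel selector format.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # 13-bit index into the descriptor table
    ti: int     # table indicator (bit 2): 0 = GDT, 1 = LDT
    rpl: int    # requested privilege level (bits 0-1)


def decode_selector(value: int) -> Selector:
    """Split a 16-bit selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return Selector(index=value >> 3, ti=(value >> 2) & 1, rpl=value & 3)


def logical_to_linear(selector: int, offset: int, gdt: list, ldt: list) -> int:
    """Model of Fig. 19.2: pick a table, fetch the descriptor, add the offset.

    gdt and ldt are lists of (base, limit) tuples; this is an assumption
    made for the sketch, not the real descriptor layout.
    """
    sel = decode_selector(selector)
    table = ldt if sel.ti else gdt
    base, limit = table[sel.index]
    if offset > limit:
        raise ValueError("offset exceeds the segment limit")
    return (base + offset) & 0xFFFFFFFF


gdt = [(0, 0), (0x00100000, 0xFFFF)]  # entry 0 is the unused null descriptor
print(decode_selector(0x0023))        # Selector(index=4, ti=0, rpl=3)
print(hex(logical_to_linear(0x0008, 0x1234, gdt, [])))  # 0x101234
```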

In addition to the GDT, provision was made for another common system table: the Interrupt Descriptor Table (IDT). It contains descriptors of special system objects known as gates and defines entry points for the interrupt and exception handling procedures. The position of the IDT is defined by the contents of the IDT register, the structure of which is similar to that of the GDTR.
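
As an illustration of the 48-bit register layout, the sketch below unpacks a 6-byte image of the kind the `SGDT` and `SIDT` instructions store in memory: a 16-bit limit followed by a 32-bit base address. The function names are hypothetical, and the sample bytes are made up for the example. Note that the 16-bit field holds the table size in bytes minus one, and that each descriptor occupies 8 bytes.

```python
import struct


def unpack_table_register(raw: bytes) -> tuple:
    """Return (base, limit) from a 6-byte GDTR/IDTR image (little-endian)."""
    if len(raw) != 6:
        raise ValueError("expected exactly 6 bytes (48 bits)")
    limit, base = struct.unpack("<HI", raw)  # 16-bit limit first, then 32-bit base
    return base, limit


def descriptor_count(limit: int) -> int:
    """Number of 8-byte descriptors that fit in a table with this limit."""
    return (limit + 1) // 8


# Sample image invented for this example: limit 0x03FF, base 0x80036000.
base, limit = unpack_table_register(bytes.fromhex("ff0300600380"))
print(hex(base), hex(limit), descriptor_count(limit))  # 0x80036000 0x3ff 128
```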

The size of the LDTR is 10 bytes. ^[i] The first 2 bytes address the LDT indirectly through the GDT, which means that they play the role of a selector for each newly created task. Thus, an element must be added to the GDT that determines the segment storing the LDT of the current task. Switching between tasks can then be done merely by modifying the contents of the LDTR. Conversely, if only one task is going to run in protected mode, it doesn't need to use LDTs and the LDTR.

The segment descriptor contained, in particular, the access field that defined the type of the segment being indexed (code segment, data segment, system segment, etc.). This information allows, for instance, the current segment to be specified as read-only. It also takes into account the possibility that the segment may be missing from memory (which means that it has been temporarily flushed to the disk). This makes provision for implementing virtual memory.

Thus, the protected mode provided the following advantages:

There was the possibility of having an individual system of segments for each task. The microprocessor made provision for fast switching between tasks. In addition, it was assumed that there would be segments belonging to the operating system.
The protected mode assumed that segments might be write-protected.
In the access field, it was possible to specify the access level. In total, there are four access levels. The idea of the access level was that the current task could not access a segment that has a higher access level.
Finally, this scheme made provision for implementing virtual memory, that is, memory in which a segment can be temporarily stored on disk. With this possibility, the logical address space can be very large.
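
To make the access field more concrete, here is a minimal sketch that decodes the access byte of a code or data segment descriptor and applies the privilege rule for data segments. The bit positions follow the Intel descriptor format rather than anything stated in this chapter, and the function names are hypothetical. Keep in mind that privilege levels are numbered from 0 (most privileged) to 3 (least privileged), so a "higher access level" corresponds to a numerically smaller value.

```python
def decode_access(access: int) -> dict:
    """Decode the 8-bit access field of a segment descriptor."""
    is_code_or_data = bool(access & 0x10)  # bit 4: 0 means a system segment
    executable = bool(access & 0x08)       # bit 3: 1 = code, 0 = data
    return {
        "present": bool(access & 0x80),    # bit 7: segment is in memory
        "dpl": (access >> 5) & 3,          # bits 5-6: descriptor privilege level
        "is_code_or_data": is_code_or_data,
        "executable": is_code_or_data and executable,
        # For data segments, bit 1 permits writing; code is never writable.
        "writable": is_code_or_data and not executable and bool(access & 0x02),
    }


def data_access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """A data segment is accessible only if max(CPL, RPL) <= DPL."""
    return max(cpl, rpl) <= dpl


print(decode_access(0x92))           # present ring-0 data segment, writable
print(data_access_allowed(3, 3, 0))  # False: ring 3 cannot touch ring-0 data
print(data_access_allowed(0, 0, 3))  # True
```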

Consider Fig. 19.2. From this scheme, it follows that the linear address is obtained by conversion. However, in contrast to the 80286 processor, in which it was still possible to equate the linear address to the physical address, for 80386, this is no longer an option.

With the Intel 80386 microprocessor, another mechanism of address conversion appeared: page addressing. To make the page addressing mechanism work, the most significant bit (bit 31) of the CR0 system register must be set to 1.
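
The bit test can be modeled on a plain integer, as in the sketch below. This is only an illustration: the real register can be read and written only by privileged code, and the names used here are assumptions. The additional check of bit 0 (protection enable) reflects the fact that paging is available only in protected mode.

```python
CR0_PE = 1 << 0   # protection enable: the processor is in protected mode
CR0_PG = 1 << 31  # paging enable: the most significant bit of the register


def paging_enabled(cr0: int) -> bool:
    """Return True if the paging bit is set in a CR0 value."""
    return bool(cr0 & CR0_PG)


def enable_paging(cr0: int) -> int:
    """Return a CR0 value with paging switched on (model only)."""
    if not cr0 & CR0_PE:
        raise ValueError("paging requires protected mode (PE bit set)")
    return cr0 | CR0_PG


print(paging_enabled(0x00000011))      # False
print(hex(enable_paging(0x00000011)))  # 0x80000011
```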

Consider Fig. 19.3. The linear address obtained using the descriptor conversion (Fig. 19.2) is divided into three parts. The 10 most significant bits are used as an index in the table called the page table directory. The location of the page directory is defined by the contents of the CR3 register. The directory is composed of descriptors, the maximum number of which is 1024. Any number of directories can exist in memory; however, only the directory pointed to by the CR3 register is active.

Figure 19.3: Converting a linear address to a physical address and accounting for page addressing

The next 10 bits of the linear address are intended for indexing the page table, which contains 1024 page descriptors. These descriptors, in turn, specify the physical addresses of the pages. Page size is usually 4 KB. Thus, it is easy to compute the address space that can be spanned by one page table directory. It equals 1024 × 1024 × 1024 × 4 bytes, or 4 GB.

The 12 least significant bits define the offset within a page. As can easily be seen, this gives exactly 4 KB (4096 bytes, with offsets from 0 to 4095). As you have probably guessed, every process must have its own page table directory. You can switch between processes by changing the contents of the CR3 register. However, this is inefficient because it requires a vast amount of memory. In reality, to switch between processes, the page table directory is changed.
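
The three-way split of a 32-bit linear address can be written out directly. The function name `split_linear` is an assumption made for this sketch; the field widths (10, 10, and 12 bits) are those given in the text.

```python
def split_linear(addr: int) -> tuple:
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("a linear address is a 32-bit value")
    directory_index = addr >> 22        # top 10 bits
    table_index = (addr >> 12) & 0x3FF  # next 10 bits
    offset = addr & 0xFFF               # low 12 bits, 0..4095
    return directory_index, table_index, offset


print(split_linear(0x00403025))  # (1, 3, 37)
print(split_linear(0xFFFFFFFF))  # (1023, 1023, 4095)

# One directory covers 1024 tables x 1024 pages x 4096 bytes = 4 GB.
print(1024 * 1024 * 4096 == 2**32)  # True
```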

Now, consider the structure of the page descriptors (the page table descriptors have the same structure):

Bits 12-31: Address of the page; after being shifted left by 12 bits, it is added to the offset.
Bits 9-11 are intended for the use of the operating system.
Bits 7-8 are reserved and must be set to zero.
Bit 6: Set when data has been written to the page (the "dirty" bit).
Bit 5: Set by the processor before reading from or writing to a page (the "accessed" bit).
Bit 4: Disables caching.
Bit 3: The write-through bit.
Bit 2: If this bit is set to zero, the page relates to the supervisor; if it is set to one, then the page relates to the working process. This provides two access levels.
Bit 1: If it is set, then writing to the page is allowed.
Bit 0: If this bit is set, then the page is in the memory. Pages that contain data are flushed to the disk and read back when an attempt is made to access them. Pages that contain code are not flushed to the disk. However, they can be reloaded from the corresponding modules stored on the disk. Therefore, the memory taken by these pages can be used rationally.
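
Putting the pieces together, the sketch below models the two-level lookup of Fig. 19.3 using the descriptor bits listed above. It is a simplified illustration under stated assumptions: the page directory and page tables are plain Python lists of 1024 integers, `page_tables` is a hypothetical dictionary that maps a table's physical address to its list, and the `PageFault` exception stands in for the processor's page fault. Access checks for bits 1 and 2 are omitted.

```python
PAGE_PRESENT = 1 << 0        # bit 0: page is in memory
PAGE_WRITABLE = 1 << 1       # bit 1: writing is allowed
PAGE_USER = 1 << 2           # bit 2: 1 = working process, 0 = supervisor
PAGE_WRITE_THROUGH = 1 << 3  # bit 3
PAGE_CACHE_DISABLE = 1 << 4  # bit 4
PAGE_ACCESSED = 1 << 5       # bit 5
PAGE_DIRTY = 1 << 6          # bit 6


class PageFault(Exception):
    """Raised when a required descriptor has its present bit cleared."""


def frame_address(entry: int) -> int:
    """Bits 12-31 of a descriptor, already positioned as a physical address."""
    return entry & 0xFFFFF000


def translate(linear: int, page_directory: list, page_tables: dict) -> int:
    """Convert a linear address to a physical address (model of Fig. 19.3)."""
    directory_index = linear >> 22
    table_index = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF

    pde = page_directory[directory_index]
    if not pde & PAGE_PRESENT:
        raise PageFault(f"page table for {linear:#010x} is not present")

    pte = page_tables[frame_address(pde)][table_index]
    if not pte & PAGE_PRESENT:
        raise PageFault(f"page for {linear:#010x} is not present")

    return frame_address(pte) | offset


# A tiny address space: one page table at 0x2000 that maps one page.
directory = [0] * 1024
table = [0] * 1024
directory[1] = 0x00002000 | PAGE_PRESENT | PAGE_WRITABLE
table[3] = 0x0009A000 | PAGE_PRESENT | PAGE_WRITABLE | PAGE_USER
tables = {0x00002000: table}

print(hex(translate(0x00403025, directory, tables)))  # 0x9a025
```

Translating an address whose directory or table entry is zero, such as `0x00000000` in this example, raises `PageFault`; in a real system, this is the point at which the operating system would bring the page in from disk.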

^[i] I am also referring to Intel-compatible microprocessors from other manufacturers.

^[i] In a narrow sense, only the DI and SI registers can be considered index registers.

^[ii] Once upon a time, it seemed that 1 MB was a vast amount of memory.

^[i] In older models of the Intel microprocessor, this register was only 2 bytes long.