Section 4.7. Linux Process Memory Structures



So far, we have covered how the kernel manages its own memory. We now turn our attention to user space programs and how the kernel manages program memory. The wonder of virtual memory allows a user space process to operate as though it has access to all memory. In reality, the kernel manages what gets loaded, how it gets loaded, and when that happens. Everything we have discussed in this chapter up to this point relates to how the kernel manages memory and is completely transparent to a user space program.

Upon creation, a user space process is assigned a virtual address space. As previously mentioned, a process's virtual address space is a range of unsegmented linear addresses that the process can use. The size of the range is determined by the width of the registers in the system's architecture. Most systems have a 32-bit address space, although some differ: on a G5, for example, processes have a 64-bit address space.
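To make the relationship between address width and address-space size concrete, the following is a minimal user space sketch, not kernel code. The function names are our own, chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct byte addresses reachable with `bits`-wide linear
 * addresses. Limited to 1..63 here because 2^64 does not fit in a
 * uint64_t. */
uint64_t address_space_bytes(unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    return (uint64_t)1 << bits;
}

/* Highest linear address for a given width (valid for 1..64 bits). */
uint64_t highest_address(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}
```

For a 32-bit address space, address_space_bytes(32) yields 4,294,967,296 bytes (4GB), and the highest address is 0xFFFFFFFF.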

After creation, the address range the process is presented with can grow or shrink through the addition or removal of linear address intervals, respectively. The address interval (a range of addresses) represents yet another unit of memory called a memory region or a memory area. It is useful to split the process address range into areas of different types. These different types have different protection schemes or characteristics. The process-memory protection schemes are associated with process context. For example, certain parts of a program are marked read-only (text), while others are writable (variables) or executable (instructions). Also, a particular process might access only the memory areas that belong to it.
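The idea of areas with different protections can be sketched in user space as follows. This is an illustrative model only: struct area, the AREA_* flags, and access_allowed() are hypothetical names, and the areas are assumed not to overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative protection bits, not the kernel's own constants. */
enum { AREA_READ = 1, AREA_WRITE = 2, AREA_EXEC = 4 };

struct area {
    unsigned long start;   /* first address in the interval */
    unsigned long end;     /* first address past the interval */
    unsigned prot;         /* combination of AREA_* bits */
};

/* Returns 1 if addr falls inside an area that grants every bit in
 * `want`, and 0 otherwise (including when addr is in no area). */
int access_allowed(const struct area *areas, size_t n,
                   unsigned long addr, unsigned want)
{
    assert(areas != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (addr >= areas[i].start && addr < areas[i].end)
            return (areas[i].prot & want) == want;
    }
    return 0;   /* address is not part of the address space */
}
```

With a read/execute text area and a read/write data area, a write to the text area or any access to an address between the two areas is rejected.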

Within the kernel, a process address space, as well as all the information related to it, is kept in an mm_struct descriptor. You might recall from Chapter 3, "Processes: The Principal Model of Execution," that this structure is referenced in the task_struct for the process. A memory area is represented by the vm_area_struct descriptor. Each memory area descriptor describes the contiguous address interval it represents. Throughout this section, we refer to the descriptor for an address interval as a memory area descriptor or as vm_area_struct. We now look at mm_struct and vm_area_struct. Figure 4.10 illustrates the relationship between these data structures.

Figure 4.10. Process-Related Memory Structures


4.7.1. mm_struct

Every task has an mm_struct (include/linux/sched.h) structure that the kernel uses to represent its memory address range. All mm_struct descriptors are stored in a doubly linked list. The head of the list is the mm_struct that corresponds to process 0, which is the idle process. This descriptor is accessed by way of the global variable init_mm:

-----------------------------------------------------------------------------
include/linux/sched.h
struct mm_struct {
  struct vm_area_struct * mmap;
  struct rb_root mm_rb;
  struct vm_area_struct * mmap_cache;
  unsigned long free_area_cache;
  pgd_t * pgd;
  atomic_t mm_users;
  atomic_t mm_count;
  int map_count;
  struct rw_semaphore mmap_sem;
  spinlock_t page_table_lock;

  struct list_head mmlist;
  ...
  unsigned long start_code, end_code, start_data, end_data;
  unsigned long start_brk, brk, start_stack;
  unsigned long arg_start, arg_end, env_start, env_end;
  unsigned long rss, total_vm, locked_vm;
  unsigned long def_flags;
  cpumask_t cpu_vm_mask;
  unsigned long swap_address;
  ...
};
-----------------------------------------------------------------------------

4.7.1.1. mmap

The memory area descriptors (which are defined in the next section) that have been assigned to a process are linked in a list. This list is accessed by means of the mmap field in the mm_struct. The list is traversed by way of the vm_next field of each vm_area_struct.
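The traversal can be sketched in user space with cut-down stand-ins for the two descriptors. Only the fields needed for the walk are kept, and mapped_bytes() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

struct vma {                  /* stand-in for vm_area_struct */
    unsigned long vm_start;
    unsigned long vm_end;     /* first address past the area */
    struct vma *vm_next;      /* next area in the list, or NULL */
};

struct mm {                   /* stand-in for mm_struct */
    struct vma *mmap;         /* head of the memory area list */
};

/* Walks the list from mm->mmap through vm_next and returns the total
 * number of bytes covered by all areas. */
unsigned long mapped_bytes(const struct mm *mm)
{
    unsigned long total = 0;

    for (const struct vma *v = mm->mmap; v != NULL; v = v->vm_next) {
        assert(v->vm_end >= v->vm_start);
        total += v->vm_end - v->vm_start;
    }
    return total;
}
```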

4.7.1.2. mm_rb

The singly linked list provides an easy way of traversing all the memory area descriptors that correspond to a particular process. However, if the kernel searches for a particular memory area descriptor, a singly linked list does not yield good search times because the search is linear in the number of areas. The memory area structures that correspond to a process address range are therefore also stored in a red-black tree that is accessed through the mm_rb field. This yields faster search times when the kernel needs to access a particular memory area descriptor.
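The benefit of a tree lookup can be sketched as follows. Note the substitution: a plain, unbalanced binary search tree keyed on vm_start stands in for the kernel's red-black tree, so the rebalancing that guarantees logarithmic depth is omitted. The left and right pointers and both function names are assumptions for illustration, and areas are assumed not to overlap.

```c
#include <assert.h>
#include <stddef.h>

struct vma {
    unsigned long vm_start;
    unsigned long vm_end;        /* first address past the area */
    struct vma *left, *right;    /* stand-ins for the rb_node links */
};

/* Inserts v into the tree rooted at root, ordered by vm_start, and
 * returns the (possibly new) root. No rebalancing is performed. */
struct vma *tree_insert(struct vma *root, struct vma *v)
{
    assert(v != NULL);
    if (root == NULL) {
        v->left = v->right = NULL;
        return v;
    }
    if (v->vm_start < root->vm_start)
        root->left = tree_insert(root->left, v);
    else
        root->right = tree_insert(root->right, v);
    return root;
}

/* Returns the area containing addr, or NULL if no area contains it.
 * Each step discards one subtree instead of visiting every area. */
struct vma *tree_lookup(struct vma *root, unsigned long addr)
{
    while (root != NULL) {
        if (addr < root->vm_start)
            root = root->left;
        else if (addr >= root->vm_end)
            root = root->right;
        else
            return root;
    }
    return NULL;
}
```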

4.7.1.3. mmap_cache

mmap_cache is a pointer to the last memory area referenced by the process. The principle of locality states that when a memory address is referenced, memory areas that are close by tend to get referenced soon after. Hence, it is likely that the address being currently checked belongs to the same memory area as the last address checked. Checking the last accessed memory area first succeeds approximately 35 percent of the time.
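A minimal sketch of the cache-first lookup follows, using simplified stand-ins for the descriptors. The hits and misses counters are hypothetical additions for the demonstration, and the fallback is a plain list walk rather than the tree search the kernel would use.

```c
#include <assert.h>
#include <stddef.h>

struct vma {
    unsigned long vm_start;
    unsigned long vm_end;       /* first address past the area */
    struct vma *vm_next;
};

struct mm {
    struct vma *mmap;           /* head of the memory area list */
    struct vma *mmap_cache;     /* last area returned by a lookup */
    unsigned long hits, misses; /* hypothetical counters for the demo */
};

/* Returns the area containing addr, or NULL. The cached area is tried
 * first; only on a miss is the full list searched and the cache
 * updated. */
struct vma *lookup_area(struct mm *mm, unsigned long addr)
{
    struct vma *v;

    assert(mm != NULL);
    v = mm->mmap_cache;
    if (v != NULL && addr >= v->vm_start && addr < v->vm_end) {
        mm->hits++;
        return v;
    }
    mm->misses++;
    for (v = mm->mmap; v != NULL; v = v->vm_next) {
        if (addr >= v->vm_start && addr < v->vm_end) {
            mm->mmap_cache = v;
            return v;
        }
    }
    return NULL;
}
```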

4.7.1.4. pgd

The pgd field is a pointer to the page global directory that holds the entries for this address space. In the mm_struct for the idle process (process 0), this field points to the swapper_pg_dir. See Section 4.9 for more information on what this field points to.

4.7.1.5. mm_users

The mm_users field holds the number of processes that access this address space. Lightweight processes or threads share the same address intervals and memory areas. Thus, an mm_struct shared by threads generally has an mm_users field with a value greater than 1. This field is manipulated by way of the atomic functions atomic_set(), atomic_dec_and_lock(), atomic_read(), and atomic_inc().

4.7.1.6. mm_count

mm_count is the usage count for the mm_struct. When determining if the structure can be deallocated, a check is made against this field. If it holds the value of 0, no processes are using it; therefore, it can be deallocated.
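The interplay of the two counters can be sketched in user space with C11 atomics in place of the kernel's atomic_t functions. The sketch assumes that all users of the address space together hold a single mm_count reference, which is how we read the comments in the kernel source. The function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

struct mm {
    atomic_int mm_users;   /* users of the address space */
    atomic_int mm_count;   /* references to the descriptor itself */
};

/* A new descriptor starts with one user, and the users collectively
 * hold one reference on the descriptor (assumption, see lead-in). */
void mm_init(struct mm *mm)
{
    atomic_init(&mm->mm_users, 1);
    atomic_init(&mm->mm_count, 1);
}

/* Called when another thread starts sharing the address space. */
void mm_share(struct mm *mm)
{
    atomic_fetch_add(&mm->mm_users, 1);
}

/* Called when a user exits. Returns 1 when the descriptor's usage
 * count has reached 0 and it can be deallocated, 0 otherwise. */
int mm_release(struct mm *mm)
{
    assert(atomic_load(&mm->mm_users) > 0);
    if (atomic_fetch_sub(&mm->mm_users, 1) != 1)
        return 0;                       /* other users remain */
    /* Last user gone: drop the reference the users held. */
    return atomic_fetch_sub(&mm->mm_count, 1) == 1;
}
```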

4.7.1.7. map_count

The map_count field holds the number of memory areas, or vm_area_struct descriptors, in the process address space. Every time a new memory area is added to the process address space, this field is incremented along with the vm_area_struct's insertion into the mmap list and mm_rb tree.
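The bookkeeping can be sketched as follows. This simplified model keeps only the address-ordered list and the counter; the matching tree insertion is omitted, and insert_area() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

struct vma {
    unsigned long vm_start;
    unsigned long vm_end;     /* first address past the area */
    struct vma *vm_next;
};

struct mm {
    struct vma *mmap;         /* list head, kept sorted by vm_start */
    int map_count;            /* number of areas in the list */
};

/* Links v into the list in ascending address order and bumps
 * map_count so the counter always matches the list length. */
void insert_area(struct mm *mm, struct vma *v)
{
    struct vma **link = &mm->mmap;

    assert(v->vm_start < v->vm_end);
    while (*link != NULL && (*link)->vm_start < v->vm_start)
        link = &(*link)->vm_next;
    v->vm_next = *link;
    *link = v;
    mm->map_count++;
}
```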

4.7.1.8. mmlist

The mmlist field of type list_head holds the addresses of the adjacent mm_structs in the memory descriptor list. As previously mentioned, the head of the list is pointed to by the global variable init_mm, which is the memory descriptor for process 0. When this list is manipulated, mmlist_lock protects it from concurrent accesses.
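A circular, doubly linked list embedded in each descriptor can be sketched in user space as follows. The sketch mirrors the general list_head idea but is not the kernel's list API: the helper names, the MM_OF macro, and the id field are hypothetical, and the locking that mmlist_lock provides is omitted.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

struct mm {
    int id;                    /* hypothetical tag for the demo */
    struct list_head mmlist;   /* links to adjacent descriptors */
};

/* Recovers the enclosing struct mm from a pointer to its mmlist. */
#define MM_OF(ptr) \
    ((struct mm *)((char *)(ptr) - offsetof(struct mm, mmlist)))

/* Makes head a one-element circular list. */
void mmlist_init(struct mm *head)
{
    head->mmlist.next = head->mmlist.prev = &head->mmlist;
}

/* Inserts mm immediately after head. */
void mmlist_add(struct mm *head, struct mm *mm)
{
    struct list_head *h = &head->mmlist, *e = &mm->mmlist;

    e->next = h->next;
    e->prev = h;
    h->next->prev = e;
    h->next = e;
}

/* Unlinks mm from whatever list it is on. */
void mmlist_del(struct mm *mm)
{
    mm->mmlist.prev->next = mm->mmlist.next;
    mm->mmlist.next->prev = mm->mmlist.prev;
    mm->mmlist.next = mm->mmlist.prev = &mm->mmlist;
}

/* Counts the descriptors on the list, including the head's owner. */
int mmlist_length(struct mm *head)
{
    int n = 1;
    struct list_head *p;

    assert(head->mmlist.next != NULL);
    for (p = head->mmlist.next; p != &head->mmlist; p = p->next)
        n++;
    return n;
}
```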

The next 11 fields we describe deal with the various types of memory areas a process needs allocated to it. Rather than digress into an explanation that distracts from the description of the process memory-related structures, we give only a cursory description of each.

4.7.1.9. start_code and end_code

The start_code and end_code fields hold the starting and ending addresses for the code section of the process's memory region (that is, the executable's text segment).

4.7.1.10. start_data and end_data

The start_data and end_data fields contain the starting and ending addresses for the initialized data (the data found in the .data portion of the executable file).

4.7.1.11. start_brk and brk

The start_brk and brk fields hold the starting and ending addresses of the process heap.

4.7.1.12. start_stack

start_stack is the starting address of the process stack.

4.7.1.13. arg_start and arg_end

The arg_start and arg_end fields hold the starting and ending addresses of the arguments passed to the process.

4.7.1.14. env_start and env_end

The env_start and env_end fields hold the starting and ending addresses of the environment section.
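Because each of these sections is described by a pair of addresses, its size is simply the difference between the two. The following user space sketch groups the 11 fields into a hypothetical struct mm_layout; the helper names are our own, and any addresses passed in are made-up sample values.

```c
#include <assert.h>

struct mm_layout {            /* the 11 layout fields of mm_struct */
    unsigned long start_code, end_code, start_data, end_data;
    unsigned long start_brk, brk, start_stack;
    unsigned long arg_start, arg_end, env_start, env_end;
};

/* Size in bytes of the interval [start, end). */
unsigned long span(unsigned long start, unsigned long end)
{
    assert(end >= start);
    return end - start;
}

unsigned long code_size(const struct mm_layout *l)
{
    return span(l->start_code, l->end_code);
}

unsigned long data_size(const struct mm_layout *l)
{
    return span(l->start_data, l->end_data);
}

/* The heap runs from start_brk up to the current brk. */
unsigned long heap_size(const struct mm_layout *l)
{
    return span(l->start_brk, l->brk);
}

unsigned long args_size(const struct mm_layout *l)
{
    return span(l->arg_start, l->arg_end);
}

unsigned long env_size(const struct mm_layout *l)
{
    return span(l->env_start, l->env_end);
}
```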

This concludes the mm_struct fields that we focus on in this chapter. We now look at some of the fields for the memory area descriptor, vm_area_struct.

4.7.2. vm_area_struct

The vm_area_struct structure defines a virtual memory region. A process has various memory regions, but every memory region has exactly one vm_area_struct to represent it:

-----------------------------------------------------------------------------
include/linux/mm.h
struct vm_area_struct {
  struct mm_struct * vm_mm;
  unsigned long vm_start;
  unsigned long vm_end;
  ...
  struct vm_area_struct *vm_next;
  ...
  unsigned long vm_flags;

  struct rb_node vm_rb;
  ...
  struct vm_operations_struct * vm_ops;
  ...
};
-----------------------------------------------------------------------------

4.7.2.1. vm_mm

All memory regions belong to an address space that is associated with a process and represented by an mm_struct. The vm_mm field points to a structure of type mm_struct that describes the address space to which this memory area belongs.

4.7.2.2. vm_start and vm_end

A memory region is associated with an address interval. In the vm_area_struct, this interval is defined by keeping track of the starting and ending addresses. For performance reasons, the beginning address of the memory region must be a multiple of the page frame size. The kernel ensures that page frames are filled with data from a single memory region by also demanding that the size of the memory region be a multiple of the page frame size.
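The two alignment rules can be checked with a few lines of arithmetic. The sketch below assumes a 4KB page frame, which is architecture dependent; SKETCH_PAGE_SIZE and the function names are hypothetical.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumed page frame size */

/* A region [start, end) is acceptable when it is non-empty, starts on
 * a page boundary, and spans a whole number of pages. */
int is_valid_region(unsigned long start, unsigned long end)
{
    return start < end &&
           start % SKETCH_PAGE_SIZE == 0 &&
           (end - start) % SKETCH_PAGE_SIZE == 0;
}

/* Number of page frames a valid region covers. */
unsigned long region_pages(unsigned long start, unsigned long end)
{
    assert(is_valid_region(start, end));
    return (end - start) / SKETCH_PAGE_SIZE;
}

/* Rounds addr up to the next page boundary (no change if aligned). */
unsigned long page_align_up(unsigned long addr)
{
    return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

A request for an odd-sized interval would be rounded with page_align_up() before the region is created, so that both rules hold.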

4.7.2.3. vm_next

The field vm_next points to the next vm_area_struct in the linked list that comprises all the regions within a process address space. The head of this list is referenced by way of the mmap field in the mm_struct for the address space.

4.7.2.4. vm_flags

Within this interval, a memory region also has associated characteristics that describe it. These are stored in the vm_flags field and apply to the pages within the memory region. Table 4.6 describes the possible flags.
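Because vm_flags is a bit field, individual characteristics are tested and combined with bitwise operations. The sketch below uses four flags whose names carry an SK_ prefix to mark them as stand-ins; the values are our assumption of the low-order read, write, execute, and shared bits, and should be checked against the kernel headers. flags_to_string() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed flag values, for illustration only. */
#define SK_VM_READ   0x1UL
#define SK_VM_WRITE  0x2UL
#define SK_VM_EXEC   0x4UL
#define SK_VM_SHARED 0x8UL

/* Returns 1 if every bit in `want` is set in `flags`. */
int has_flags(unsigned long flags, unsigned long want)
{
    return (flags & want) == want;
}

/* Renders flags as a four-character string such as "r-xp": read,
 * write, execute, then 's' for shared or 'p' for private. */
void flags_to_string(unsigned long flags, char out[5])
{
    assert(out != NULL);
    out[0] = (flags & SK_VM_READ)   ? 'r' : '-';
    out[1] = (flags & SK_VM_WRITE)  ? 'w' : '-';
    out[2] = (flags & SK_VM_EXEC)   ? 'x' : '-';
    out[3] = (flags & SK_VM_SHARED) ? 's' : 'p';
    out[4] = '\0';
}
```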

4.7.2.5. vm_rb

vm_rb holds the red-black tree node that corresponds to this memory area.

4.7.2.6. vm_ops

vm_ops points to a structure of function pointers that handle the particular vm_area_struct. These functions include opening, closing, and unmapping the memory area. The structure also holds a pointer to the function that is called when a no-page exception occurs.
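The function-pointer table can be sketched in user space as follows. The signatures are deliberately simplified and do not match the kernel's vm_operations_struct; the demo handlers, the open_count and faults fields, and handle_fault() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct vma;

struct vm_ops {                        /* simplified operations table */
    void (*open)(struct vma *area);
    void (*close)(struct vma *area);
    int (*nopage)(struct vma *area, unsigned long addr);
};

struct vma {
    unsigned long vm_start;
    unsigned long vm_end;              /* first address past the area */
    const struct vm_ops *vm_ops;       /* may be NULL */
    int open_count;                    /* demo bookkeeping */
    int faults;                        /* demo bookkeeping */
};

/* Hypothetical handlers that only record that they were called. */
static void demo_open(struct vma *area)  { area->open_count++; }
static void demo_close(struct vma *area) { area->open_count--; }
static int demo_nopage(struct vma *area, unsigned long addr)
{
    (void)addr;
    area->faults++;
    return 0;
}

const struct vm_ops demo_ops = { demo_open, demo_close, demo_nopage };

/* Dispatches a fault at addr to the area's nopage handler. Returns -1
 * if addr is outside the area or no handler is installed. */
int handle_fault(struct vma *area, unsigned long addr)
{
    assert(area != NULL);
    if (addr < area->vm_start || addr >= area->vm_end)
        return -1;
    if (area->vm_ops == NULL || area->vm_ops->nopage == NULL)
        return -1;
    return area->vm_ops->nopage(area, addr);
}
```

The point of the indirection is that different kinds of memory areas can install different handlers while the caller's code stays the same.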




The Linux Kernel Primer. A Top-Down Approach for x86 and PowerPC Architectures
ISBN: 131181637
EAN: N/A
Year: 2005
Pages: 134
