Section 4.8. Process Image Layout and Linear Address Space | The Linux Kernel Primer. A Top-Down Approach for x86 and PowerPC Architectures

4.8. Process Image Layout and Linear Address Space

When a user space program is loaded into memory, its linear address space is partitioned into various memory areas, or segments. These segments are distinguished by the function they serve during the execution of the process, and each is mapped within the process address space. Five main segments are related to process execution:

Text. This segment, also known as the code segment, holds the executable instructions of a program. As such, it has execute and read attributes. When multiple processes run the same program, it would be wasteful to load the same instructions more than once, so Linux allows those processes to share the text segment in memory. The start_code and end_code fields of the mm_struct hold the addresses for the beginning and end of the text segment.

Data. This section holds all initialized data. Initialized data includes statically allocated and global data that are initialized. The following code snippet shows an example of initialized data:

-----------------------------------------------------------------------------
example1.c
int gvar = 10;

int main(){
  ...
}
-----------------------------------------------------------------------------

Here, gvar is a global variable that is initialized and stored in the data segment. The data segment has read/write attributes but cannot be shared among processes running the same program. The start_data and end_data fields of the mm_struct hold the addresses for the beginning and end of the data segment.

BSS. This section holds uninitialized data. This data consists of global variables that the system initializes with 0s upon program execution. Another name for this section is the zero-initialized data section. The following code snippet shows an example of uninitialized data:
```
-----------------------------------------------------------------------------
example2.c
int gvar1[10];
long gvar2;

int main() {
  ...
}
-----------------------------------------------------------------------------
```
Objects in this segment have only name and size attributes.

Heap. This is used to grow the linear address space of a process. When a program uses malloc() to obtain dynamic memory, this memory is placed in the heap. The start_brk and brk fields of the mm_struct hold the addresses for the beginning and end of the heap. When malloc() needs more memory than the heap currently holds, the brk system call (sys_brk() in the kernel) moves the brk pointer to its new location, thus growing the heap.

Stack. This contains all the local variables that get allocated. When a function is called, the local variables for that function are pushed onto the stack. As soon as the function ends, the variables associated with it are popped from the stack. Other information, including return addresses and parameters, is also stored in the stack. The start_stack field of the mm_struct marks the starting address of the process stack.
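
As a rough illustration of the segments just described, the following sketch prints one address from each of them. The helper name print_layout() and the two global variables are hypothetical and chosen only for this example; the sketch assumes a Linux system with glibc, and the addresses it prints differ between systems and between runs.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* asks glibc to declare sbrk() in <unistd.h> */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int init_global = 10;       /* initialized global: data segment */
int uninit_global;          /* uninitialized global: bss, zero-filled */

/* Print one address from each area.
 * Returns 0 on success, -1 if the heap allocation fails.
 * print_layout is a hypothetical name used only for illustration. */
int print_layout(void)
{
    int local = 0;                           /* automatic variable: stack */
    int *dynamic = malloc(sizeof *dynamic);  /* small request: normally heap */

    if (dynamic == NULL)
        return -1;

    /* Casting a function pointer to void * is a common POSIX-style
     * convenience, not strict ISO C. */
    printf("text  %p\n", (void *)print_layout);
    printf("data  %p\n", (void *)&init_global);
    printf("bss   %p\n", (void *)&uninit_global);
    printf("heap  %p\n", (void *)dynamic);
    printf("brk   %p\n", sbrk(0));           /* current end of the heap */
    printf("stack %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

Calling print_layout() from main() typically shows the text, data, and bss addresses clustered near one another and the stack address far above them, although the exact ordering depends on the architecture and on address space randomization.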

Although five main areas are related to process execution, they map to only three memory areas in the address space. These memory areas are called text, data, and stack. The data segment includes the executable's initialized data segment, the bss, and the heap. The text segment includes the executable's text segment. Figure 4.11 shows what the linear address space looks like and how the mm_struct keeps track of these segments.

Figure 4.11. Process Address Space

The various memory areas are mapped in the /proc filesystem. The memory map of a process may be accessed through the output of /proc/<pid>/maps. We now look at an example program and see the list of memory areas in the process' address space. The code in example3.c shows the program being mapped.

-----------------------------------------------------------------------------
example3.c
#include <stdio.h>

int main(){
  while(1);
  return(0);
}
-----------------------------------------------------------------------------

The output of /proc/<pid>/maps for our example yields what's shown in Figure 4.12.

Figure 4.12. cat /proc/<pid>/maps
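
A process can also inspect its own memory map without knowing its pid, because /proc/self refers to the calling process. The following is a minimal sketch, assuming a Linux system with /proc mounted; dump_self_maps() is a hypothetical helper name introduced here for illustration.

```c
#include <stdio.h>

/* Copy /proc/self/maps to the given stream.
 * Returns the number of chunks read by fgets() (one per line, unless a
 * line exceeds the buffer), or -1 if the file cannot be opened, for
 * example on a system without /proc. */
int dump_self_maps(FILE *out)
{
    char line[512];
    int count = 0;
    FILE *maps = fopen("/proc/self/maps", "r");

    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        fputs(line, out);
        count++;
    }

    fclose(maps);
    return count;
}
```

Calling dump_self_maps(stdout) from main() prints the same kind of listing that cat /proc/<pid>/maps produces for another process.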

The left-most column shows the address range of the memory segment, that is, its starting and ending addresses. The next column shows the access permissions for that segment. These flags are similar to the access permissions on files: r stands for readable, w stands for writeable, and x stands for executable. The last flag can be either a p, which indicates a private segment, or s, which indicates a shared segment. (A private segment is not necessarily unshareable; the p indicates only that it is currently not being shared.) The next column holds the offset of the segment within its associated file. The fourth column from the left holds two numbers separated by a colon. These represent the major and minor numbers of the device holding the filesystem in which the file associated with that segment is found. (Segments with no associated file show 00:00 here.) The fifth column holds the inode of the file, and the sixth and right-most column holds the filename. For segments with no filename, this column is empty and the inode column holds a 0.
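
The column layout described above can be picked apart with a single sscanf() call. The following is a minimal sketch rather than a complete parser; the struct map_entry type and the parse_maps_line() function are hypothetical names introduced for this example, and the sketch assumes addresses and offsets fit in an unsigned long.

```c
#include <stdio.h>
#include <string.h>

/* One row of /proc/<pid>/maps (hypothetical type for illustration). */
struct map_entry {
    unsigned long start, end;   /* address range of the segment */
    char perms[5];              /* r, w, x, then p or s */
    unsigned long offset;       /* offset column */
    unsigned int dev_major;     /* device major number (hexadecimal) */
    unsigned int dev_minor;     /* device minor number (hexadecimal) */
    unsigned long inode;        /* 0 when no file is associated */
    char path[256];             /* empty when no file is associated */
};

/* Parse one line into *e. Returns 0 on success, -1 on a malformed line. */
int parse_maps_line(const char *line, struct map_entry *e)
{
    int n;

    e->path[0] = '\0';          /* stays empty if the filename is absent */
    n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %255[^\n]",
               &e->start, &e->end, e->perms, &e->offset,
               &e->dev_major, &e->dev_minor, &e->inode, e->path);

    /* The first seven fields are mandatory; the filename is optional. */
    return (n >= 7) ? 0 : -1;
}
```

Feeding each line read from /proc/<pid>/maps to parse_maps_line() yields the range, permission flags, offset, device numbers, inode, and filename as separate fields.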

In our example, the first row holds a description of the text segment of our sample program. This is evident from its permission flags, which mark it as executable. The next row describes our sample program's data segment. Notice that its permissions indicate that it is writeable.

Our program is dynamically linked, which means that the library functions it uses are loaded at runtime. These functions need to be mapped into the process' address space so that it can access them. The next six rows deal with dynamically linked libraries. The first three of these rows describe the ld library's text, data, and bss. They are followed by descriptions of libc's text, data, and bss segments, in that order.

The final row, whose permissions indicate that it is readable, writeable, and executable, represents the process stack and extends up to 0xC0000000. On 32-bit x86 with the default 3GB/1GB split, 0xC0000000 marks the top of the address range accessible to user space processes.