The Process's Logical View

In order to facilitate the concept of relocatable code, compiled programs are linked against a logical memory image map. For 32-bit narrow applications, the logical size is 2^32, or 4 GB; in the case of 64-bit wide applications, the theoretical size is limited to 2^64, or 16 exabytes! In either case, the logical map is divided into four quadrants of equal size. This is not an arbitrary configuration. It is dictated by the underlying design of the HP PA-RISC hardware platform and its virtual memory system.
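The quadrant arithmetic above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and it assumes the four quadrants are selected by the two most significant bits of a logical address, which follows from dividing the space into four equal parts.

```python
GIB = 2**30

def quadrant_size(address_bits):
    """Size in bytes of one of the four equal quadrants of the logical space."""
    return 2**address_bits // 4

def quadrant_of(address, address_bits):
    """Return 1..4 for Q1..Q4, using the two most significant address bits."""
    if not 0 <= address < 2**address_bits:
        raise ValueError("address lies outside the logical space")
    return (address >> (address_bits - 2)) + 1

if __name__ == "__main__":
    # A narrow (32-bit) image: 4 GB total, so each quadrant is 1 GB.
    print(quadrant_size(32) // GIB, "GiB per narrow quadrant")
    print("0x40000000 falls in Q%d" % quadrant_of(0x4000_0000, 32))
```

For a wide image the same functions apply with `address_bits=64`; the result is the theoretical quadrant size, not the smaller size a given hardware implementation may support.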

We discuss the virtual memory-to-physical memory mapping in the next chapter, but first we must look at the rules and options that control the placement of a program's various memory objects into a logical space. Figure 5-4 illustrates this concept for both narrow and wide program images.

Figure 5-4. Mapping a Program to a Logical Space


The mapping of individual program object modules is accomplished by the infamous linking-loader. This program is usually coupled with the compiler and must understand which modules will be needed, and where to find them. Frequently, the build of the runnable image is controlled by the make utility and a corresponding Makefile.

Prior to the final linking of the executable image, the compiler builds relocatable object modules and a skeletal symbol table. The linking-loader must make sure that following the link there are no undefined external references to procedures not linked into the image (or to be made available via shared libraries at runtime).

The linking-loader must understand the system architecture for which it is building the executable image. Each hardware platform has its own unique way of dealing with the assignment of sections of virtual memory. In effect, the program's objects, or code and data blocks, are separated according to their function, their type, and the level of security they require.

Most modern compilers divide the text (instruction code) of a program into one object and program data into one or more additional objects. HP PA-RISC systems follow certain defaults for the setting of access rights and permissions depending upon which quadrant a page frame is mapped into.

Narrow executables and wide executables map objects differently. This change was necessitated by the desire to allow narrow and wide processes running on the same machine to share data objects. To accomplish this goal, wide processes needed quadrant Q1 to be allocated for shared objects (after all, the entire logical space of a narrow process fits easily into the first 0.01% of a wide process's Q1).

For a narrow application, the defaults are as follows:

Q1 is shared read-only access,
Q2 is private read-write,
Q3 is shared read-write, and
Q4 is shared read-write.

For a wide application,

Q1 is shared read-write access,
Q2 is shared read-only,
Q3 is private read-write, and
Q4 is shared read-write.
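The two sets of defaults listed above can be captured as a small lookup table. This is a sketch for comparison only; the dictionary layout and function names are hypothetical and do not correspond to any HP-UX interface.

```python
# Default sharing and permission per quadrant, as listed in the text.
DEFAULT_ACCESS = {
    "narrow": {
        1: ("shared", "read-only"),
        2: ("private", "read-write"),
        3: ("shared", "read-write"),
        4: ("shared", "read-write"),
    },
    "wide": {
        1: ("shared", "read-write"),
        2: ("shared", "read-only"),
        3: ("private", "read-write"),
        4: ("shared", "read-write"),
    },
}

def default_access(model, quadrant):
    """Return (sharing, permission) for a quadrant of a narrow or wide image."""
    return DEFAULT_ACCESS[model][quadrant]

def private_quadrants(model):
    """List the quadrants that hold private objects by default."""
    table = DEFAULT_ACCESS[model]
    return [q for q in sorted(table) if table[q][0] == "private"]

if __name__ == "__main__":
    for model in ("narrow", "wide"):
        print(model, "private quadrants:", private_quadrants(model))
```

Laid out this way, the difference is easy to see: the private read-write quadrant moves from Q2 in a narrow image to Q3 in a wide one, and Q1 becomes shared read-write.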

As we mentioned, the compilers must be aware of these rules, but as with all rules, there may be exceptions!

You may have noted that because of the implied restrictions controlling which program objects may be placed in which quadrants, there are systemwide limits to how large a program memory object may be. In general, a single object may not be larger than the quadrant to which it is mapped, and in many cases it must share the space with other objects. On a 64-bit system, this is not currently much of a concern, considering its current 4-TB hardware-implemented quadrant size, but for 32-bit applications with a 1 GB quadrant size, this has become a serious limiting factor to some applications.

The size of a process's private data quadrant may be a concern, and in some cases there are only 1 3/4 quadrants available for shared objects on the system, which has been a limiting factor to the applications we wish to run (remember that in Q4 on a narrow memory map, the last 256 MB are reserved for memory-mapped I/O). There are several tricks of the trade that may be implemented to address some of the limits. Let's make some magic!
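The "1 3/4 quadrants" figure follows from simple arithmetic, sketched below. It assumes the default narrow layout, in which Q3 and Q4 hold shared objects and the last 256 MB of Q4 are reserved for memory-mapped I/O; the constant and function names are illustrative.

```python
from fractions import Fraction

MIB = 2**20
QUADRANT = 1024 * MIB      # 1 GB quadrant of a narrow image
IO_RESERVED = 256 * MIB    # tail of Q4 reserved for memory-mapped I/O

def shared_space_narrow():
    """Bytes available for shared objects in Q3 plus Q4 by default."""
    return 2 * QUADRANT - IO_RESERVED

def in_quadrants(nbytes):
    """Express a byte count as an exact fraction of one quadrant."""
    return Fraction(nbytes, QUADRANT)

if __name__ == "__main__":
    space = shared_space_narrow()
    print(space // MIB, "MiB =", in_quadrants(space), "quadrants")
```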

`SHARE_MAGIC`, `EXEC_MAGIC`, and `SHMEM_MAGIC`

The first compiler directives we consider are SHARE_MAGIC and EXEC_MAGIC. The first is the default for HP-UX compilers, and the second allows a program to increase the amount of private memory space it may use. Figure 5-5 gives us a visual comparison of the logical mapping for these options.

Figure 5-5. Magic for 32-Bit Applications


These options are only for use by narrow programs (running on either a 32-bit or a 64-bit kernel and hardware platform).

`SHARE_MAGIC`

The SHARE_MAGIC directive is the system default and requires no special parameter to be passed to the compiler or linking-loader. The term share as used here refers to the fact that the executable image should be built with the code and data separated into different objects. This is necessary if you want to allow multiple copies of the same program to be resident on the system and have them share access to a common text object, reducing overall system memory utilization. This is often called a multiprocess shared-code access model and is implemented on most modern operating systems.

In this model, the various memory objects are assigned to the quadrants according to the default access settings. A program's text and possibly a null-dereference page are placed in Q1.

Q2 gets, starting from the lowest offset, the process's private data, consisting of the program's initialized data, uninitialized data (or BSS, an old mainframe term: Block Started by Symbol), and the beginning of the process's heap space (this is where additional data storage may be allocated to a running process using calls such as malloc()). The uarea is located near the highest offset into Q2, and working back toward the private data area, we have the process's stack, which contains a "red zone" and room for the process's private stack area. Next, we find any private memory-mapped objects. The heap grows up, and the mmaps grow down until there is no space left. The red zone we mentioned is a type of buffer area at the high-address end of the stack; this address space is off limits to the process. In the case of a stack overflow, the kernel must kill the process, and the red zone space is used by the signal code during the kill procedure.
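The way the heap and the private memory-mapped objects compete for the middle of Q2 can be illustrated with a toy model. This is a sketch, not kernel code: the class, its fields, and the byte counts are all hypothetical, and it tracks only the two moving boundaries.

```python
class PrivateQuadrant:
    """Toy model of the free gap between the heap and private mmaps in Q2."""

    def __init__(self, data_end, stack_bottom):
        self.heap_top = data_end         # heap starts just above data and BSS
        self.mmap_bottom = stack_bottom  # private mmaps start just below the stack

    def free_bytes(self):
        """Space left between the top of the heap and the lowest mmap."""
        return self.mmap_bottom - self.heap_top

    def grow_heap(self, nbytes):
        """Move the heap boundary up, as a malloc()-driven request would."""
        if nbytes > self.free_bytes():
            raise MemoryError("heap would collide with private mmaps")
        self.heap_top += nbytes
        return self.heap_top

    def map_private(self, nbytes):
        """Place a private mapping below the existing ones, growing downward."""
        if nbytes > self.free_bytes():
            raise MemoryError("mmap would collide with the heap")
        self.mmap_bottom -= nbytes
        return self.mmap_bottom

if __name__ == "__main__":
    q2 = PrivateQuadrant(data_end=100, stack_bottom=200)
    q2.grow_heap(60)
    q2.map_private(30)
    print("bytes left in the gap:", q2.free_bytes())
```

Once the two boundaries meet, either kind of request fails, which is the "until there is no space left" condition described above.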

Several tunable kernel parameters control the size of objects located in Q1 and Q2.

maxtsiz limits a 32-bit process's text object; defaults to 16384 pages.
maxtsiz_64bit limits a 64-bit process's text object; defaults to 16384 pages.
maxdsiz limits a 32-bit process's private data object; defaults to 65536 pages.
maxdsiz_64bit limits a 64-bit process's private data object; defaults to 65536 pages.
maxssiz limits a 32-bit process's stack object; defaults to 2048 pages.
maxssiz_64bit limits a 64-bit process's stack object; defaults to 2048 pages.
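To put these page counts in more familiar units, the sketch below converts each default to megabytes. It assumes a 4 KB page size, which is not stated in the list above; the dictionary simply restates the defaults from the text.

```python
PAGE_SIZE = 4096  # assumption: 4 KB pages

# Default limits in pages, as listed in the text.
DEFAULT_PAGES = {
    "maxtsiz": 16384,
    "maxtsiz_64bit": 16384,
    "maxdsiz": 65536,
    "maxdsiz_64bit": 65536,
    "maxssiz": 2048,
    "maxssiz_64bit": 2048,
}

def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to whole mebibytes."""
    return pages * page_size // 2**20

if __name__ == "__main__":
    for name, pages in DEFAULT_PAGES.items():
        print(f"{name}: {pages} pages = {pages_to_mib(pages)} MiB")
```

Under that assumption the defaults work out to 64 MB of text, 256 MB of private data, and 8 MB of stack, each well below the 1 GB size of a narrow quadrant.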

As you can see, these default values are fairly conservative. The reason for setting these limits along with other tunables is to give the kernel a best guesstimate as to the amount of space it will need to allocate for its internal data tables. It can always allocate more pages as needed, but it is nice to start with a reasonable amount at system boot.

The Q4 and Q3 quadrants are used for shared library text, shared memory, shared memory-mapped files, and other shared objects. The Q4 space is allocated first, then Q3 (this order was established with the release of HP-UX 10.0; in previous versions Q3 was used first, then Q4).

All right, if you can't fit your process data in the single quadrant allocated, then perhaps you need a little executive magic!

`EXEC_MAGIC`

The EXEC_MAGIC directive changes things a bit. The first thing we notice is that the process's private data starts immediately after the end of its text. In most cases, a program does not need an entire quadrant for its text object (and I certainly wouldn't want to be around when you tried to print out the source code for a program that produced a gigabyte of machine code, or when you tried to debug it!). As a result, most processes have a considerable amount of unused space in their first quadrant. The EXEC_MAGIC directive allows a process to make use of this space. In general, this increases the amount of private data space to approximately 1.9 GB (depending on the size of the text object, stack, and uarea). An interesting side effect of this option is that the process's text is now writable; if you want to try your hand at self-modifying code (definitely not for the casual programmer!), this is one method to explore.
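The "approximately 1.9 GB" figure can be reproduced with a rough budget: two 1 GB quadrants minus whatever the text, stack, and uarea consume. The sizes used in the example are hypothetical and chosen only for illustration; the model ignores private memory-mapped objects.

```python
GIB = 2**30
MIB = 2**20

def exec_magic_private_data(text, stack, uarea, quadrant=GIB):
    """Rough upper bound on private data when it may span Q1 and Q2."""
    return 2 * quadrant - text - stack - uarea

if __name__ == "__main__":
    # Hypothetical sizes: 64 MiB of text, 8 MiB of stack, 1 MiB for the uarea.
    budget = exec_magic_private_data(64 * MIB, 8 * MIB, 1 * MIB)
    print(f"{budget / GIB:.2f} GiB available for private data")
```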

There is a price to pay for this action. We are now placing private data in Q1, so we need to change its access permissions to private read-write. Prior to HP-UX 10.0, this meant that programs compiled with this directive were not allowed to run in a shared text mode: two instances of the same program required two copies of the text to be placed in memory.

To solve this dilemma, Hewlett-Packard introduced the practice of virtual address aliasing with HP-UX 10.0. Physical text page frames of an EXEC_MAGIC process are mapped to two unique virtual addresses by the kernel. This allows multiple instances of the same program to share access to common physical page frames. Because these pages are mapped to separate "private" virtual quadrants, there must be a means to avoid corruption if one of the process threads tries to write to the page frame. This issue is handled with a technique called read-only aliasing. If any of the processes sharing the physical page frame attempts a write, then a kernel mechanism called copy-on-write simply creates a duplicate copy of the physical page frame and relinks the writer's virtual page to it. In this manner, as long as you are only reading from the page, data corruption is avoided and we can enjoy the advantage of sharing text pages, thus reducing overall memory pressure (always a good thing!).
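The copy-on-write behavior described here can be modeled with a short toy simulation. It is a sketch of the general technique only: the `Frame` and `Process` classes are hypothetical and bear no relation to the kernel's actual data structures.

```python
class Frame:
    """A physical page frame with a count of the virtual pages aliased to it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 0

class Process:
    def __init__(self, name):
        self.name = name
        self.pages = {}  # virtual page number -> Frame

    def map_shared(self, vpn, frame):
        """Alias this process's virtual page to an existing frame."""
        frame.refs += 1
        self.pages[vpn] = frame

    def read(self, vpn, offset):
        return self.pages[vpn].data[offset]

    def write(self, vpn, offset, value):
        frame = self.pages[vpn]
        if frame.refs > 1:
            # Copy-on-write: duplicate the frame and relink only the writer.
            frame.refs -= 1
            frame = Frame(frame.data)
            frame.refs = 1
            self.pages[vpn] = frame
        frame.data[offset] = value

if __name__ == "__main__":
    text = Frame(b"\x01\x02\x03")
    a, b = Process("a"), Process("b")
    a.map_shared(0, text)
    b.map_shared(0, text)
    b.write(0, 0, 0xFF)  # b receives a private copy; a is unaffected
    print(a.read(0, 0), b.read(0, 0))
```

As long as both processes only read, they keep sharing a single frame; the duplicate is created at the first write, and only for the writer.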

In the case of SHARE_MAGIC, the four quadrants could be mapped to four different spaces (their access controlled by the four space registers, sr4 through sr7). For an EXEC_MAGIC process, Q1 and Q2 must reside in the same virtual space (sr4 and sr5 will have the same value), allowing the data object to cross the Q1/Q2 boundary with no discontinuity. Quadrants Q3 and Q4 may each be mapped to different spaces.

What if your size problem isn't with the private data space, and instead you simply need more shareable memory? SHMEM_MAGIC may be just the trick you are looking for.
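The relationship between quadrants and space registers can be sketched as follows. The example assumes that the two most significant bits of a 32-bit address select one of sr4 through sr7; the space ID values are made up for illustration.

```python
def space_register(address):
    """Return 4..7, the space register covering a 32-bit address."""
    if not 0 <= address < 2**32:
        raise ValueError("not a 32-bit address")
    return 4 + (address >> 30)

def space_for(address, sr):
    """Look up the space ID for an address; sr maps register number to space ID."""
    return sr[space_register(address)]

def valid_exec_magic(sr):
    """An EXEC_MAGIC image needs Q1 and Q2 in one space: sr4 equals sr5."""
    return sr[4] == sr[5]

if __name__ == "__main__":
    # Hypothetical space IDs.
    share_magic = {4: 0x10, 5: 0x11, 6: 0x20, 7: 0x30}
    exec_magic = {4: 0x10, 5: 0x10, 6: 0x20, 7: 0x30}
    print(valid_exec_magic(share_magic), valid_exec_magic(exec_magic))
    # With sr4 == sr5, addresses on both sides of the Q1/Q2 boundary
    # resolve to the same space, so a data object can straddle it.
    print(space_for(0x3FFF_FFFF, exec_magic) == space_for(0x4000_0000, exec_magic))
```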

`SHMEM_MAGIC`

The SHMEM_MAGIC directive is very similar to EXEC_MAGIC in that it combines text and data objects in the process's Q1. The big difference is that Q2 does not contain private data but instead has been turned over for use by additional shared objects. It would do very little good to run only one process compiled with this option; after all, who could it share this new space with?

Again noting Figure 5-5, we see that with this option all four quadrants may be mapped to different virtual spaces. In general this option is used in conjunction with another relatively new option, Memory Windows. We examine memory windows in just a moment.

More Magic

In addition to the three magic directives we have mentioned, there are several others. Compilers designed for HP-UX also understand DEMAND_MAGIC and SHL_MAGIC directives. Both of these are enabled by default and deal with the issue of demand page loading and the use of shared libraries located and linked at runtime. We also have RELOC_MAGIC for loading a program in relocatable-only mode, AR_MAGIC for linking in archive libraries, and DL_MAGIC for the inclusion of dynamic libraries. This concludes our magic lesson for now. Let's move on to memory windows.