4.2. System V IPC Resource Controls

Traditionally, the behavior of the System V IPC facilities (shared memory, message queues, and semaphores) was influenced through a large set of /etc/system tuneables. While some of the tuneables allowed you to set meaningful administrative limits (for example, maximum shared memory segment size), many simply exposed implementation details (for example, the number of undo entries in an undo structure). There were many limitations with the traditional implementation:

  • Relying on /etc/system as an administrative mechanism meant that reconfiguration required a reboot.

  • Many parameters were used to size data structures allocated at boot (or module load) time. There was a penalty for sizing the parameters larger than was needed.

  • There was a large variety of parameters to change, many of which were implementation specific and didn't align well with public interface boundaries. Yet they were necessary to configure the system for different workloads.

  • The tuneables, named by combining a three-character facility abbreviation with a three-character parameter abbreviation, were a veritable alphabet soup. It was very easy for an administrator to misconfigure the system (see 4381822).

  • The algorithms used by the traditional implementation assumed statically sized data structures. Changing many of the tuneables at runtime wouldn't have been possible, even if an interface were available to let you do so.

  • There was no way to allocate additional resources to one user without allowing all users those resources. Since the amount of resources was always fixed, one user could have trivially prevented another from performing its desired allocations.

  • There was no good way to observe the values of the parameters.

  • Additionally, a perpetual complaint was that the default values for these tuneables were too small.

4.2.1. The Solution

In Solaris 10, we removed these limitations by reworking much of the System V IPC implementation to not require as much administrative hand-holding (removing unnecessary tuneables), and by using task-based resource controls to limit users' access to the System V IPC facilities (replacing the remaining tuneables). At the same time, we raised the default values for those limits that remained to more reasonable values. Last, for compatibility, the legacy tuneables are interpreted and used to initialize the default privileged limit for the new resource controls. The new resource controls are shown in Table 4.2.

Table 4.2. New Resource Controls

Resource Control          Similar Tuneable  Old Default  New Default   Max Value
project.max-shm-ids       shminfo_shmmni    100          128           1<<24
project.max-msg-ids       msginfo_msgmni    50           128           1<<24
project.max-sem-ids       seminfo_semmni    10           128           1<<24
project.max-shm-memory    shminfo_shmmax    512k         1/4 physical  UINT64_MAX
process.max-sem-nsems     seminfo_semmsl    25           512           SHRT_MAX
process.max-sem-ops       seminfo_semopm    10           512           INT_MAX
process.max-msg-qbytes    msginfo_msgmnb    4096         65536         ULONG_MAX
process.max-msg-messages  msginfo_msgtql    40           8192          UINT_MAX

The following tuneables no longer have any effect: semsys:seminfo_semmns, semsys:seminfo_semvmx, semsys:seminfo_semmnu, semsys:seminfo_semaem, semsys:seminfo_semume, semsys:seminfo_semusz, semsys:seminfo_semmap, shmsys:shminfo_shmseg, shmsys:shminfo_shmmin, msgsys:msginfo_msgmap, msgsys:msginfo_msgseg, msgsys:msginfo_msgssz, and msgsys:msginfo_msgmax.
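As a rough illustration of how an administrator might audit an existing /etc/system file against the table and the list above, the following Python sketch classifies `set` lines. It is not a Solaris tool. The module prefixes on the eight replaced tuneables (shmsys, msgsys, semsys) are assumed from the naming pattern used in the obsolete list, and the function name `classify` and its return format are hypothetical.

```python
import re

# Legacy tuneable -> replacement resource control, taken from Table 4.2.
# Module prefixes are an assumption based on the obsolete-tuneable list.
RCTL_FOR_TUNEABLE = {
    "shmsys:shminfo_shmmni": "project.max-shm-ids",
    "msgsys:msginfo_msgmni": "project.max-msg-ids",
    "semsys:seminfo_semmni": "project.max-sem-ids",
    "shmsys:shminfo_shmmax": "project.max-shm-memory",
    "semsys:seminfo_semmsl": "process.max-sem-nsems",
    "semsys:seminfo_semopm": "process.max-sem-ops",
    "msgsys:msginfo_msgmnb": "process.max-msg-qbytes",
    "msgsys:msginfo_msgtql": "process.max-msg-messages",
}

# Tuneables that the text says no longer have any effect.
OBSOLETE = {
    "semsys:seminfo_semmns", "semsys:seminfo_semvmx", "semsys:seminfo_semmnu",
    "semsys:seminfo_semaem", "semsys:seminfo_semume", "semsys:seminfo_semusz",
    "semsys:seminfo_semmap", "shmsys:shminfo_shmseg", "shmsys:shminfo_shmmin",
    "msgsys:msginfo_msgmap", "msgsys:msginfo_msgseg", "msgsys:msginfo_msgssz",
    "msgsys:msginfo_msgmax",
}

# Matches lines of the form "set module:variable=value".
SET_LINE = re.compile(r"^\s*set\s+(\w+:\w+)\s*=")


def classify(line):
    """Return (status, tuneable, resource_control) for one /etc/system line."""
    m = SET_LINE.match(line)
    if not m:
        return ("ignored", None, None)      # comment, blank, or other directive
    name = m.group(1)
    if name in RCTL_FOR_TUNEABLE:
        return ("replaced", name, RCTL_FOR_TUNEABLE[name])
    if name in OBSOLETE:
        return ("obsolete", name, None)
    return ("unrelated", name, None)        # not a System V IPC tuneable
```

For example, `classify("set shmsys:shminfo_shmmax=0x2000000")` reports that the setting is now covered by project.max-shm-memory, while a line that sets semsys:seminfo_semmns is reported as obsolete.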

The specific improvements are these:

  • It is now possible to limit use of the System V IPC facilities on a per-process or per-project basis (depending on the resource being limited) without rebooting the system.

  • None of these limits affect allocation directly; they can be made as large as possible without any immediate effect on the system. (Note that doing so would allow a user to allocate resources without bound, which would have an effect on the system.)

  • Implementation internals are no longer exposed to the administrator, greatly simplifying configuration.

  • The resource controls are fewer and are more verbosely and intuitively named than the tuneables.

  • Limit settings can be observed with the common resource control interfaces, such as prctl(1) and getrctl(2).

  • Shared memory is limited in accordance with the total amount allocated per project, not a per-segment limit. This means that an administrator can permit a user to allocate a lot of segments and large segments, without having to permit the user to create a lot of large segments.
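The difference between a per-segment limit and a per-project total can be seen in a small Python model. This is only a toy sketch of the accounting idea described in the last item, not the kernel's implementation. The class name `ProjectShmAccount` and its methods are hypothetical, and the limits used in the example are arbitrary.

```python
MB = 1 << 20


class ProjectShmAccount:
    """Toy model of per-project shared memory accounting; not kernel code."""

    def __init__(self, max_shm_memory, max_shm_ids):
        self.max_shm_memory = max_shm_memory  # models project.max-shm-memory
        self.max_shm_ids = max_shm_ids        # models project.max-shm-ids
        self.segments = {}                    # segment id -> size in bytes
        self.next_id = 0

    def used(self):
        """Total bytes currently allocated by the project."""
        return sum(self.segments.values())

    def create(self, size):
        """Return a new segment id, or None if either limit would be exceeded."""
        if len(self.segments) >= self.max_shm_ids:
            return None
        if self.used() + size > self.max_shm_memory:
            return None
        shmid = self.next_id
        self.next_id += 1
        self.segments[shmid] = size
        return shmid

    def remove(self, shmid):
        """Release a segment so its bytes count against the total no longer."""
        self.segments.pop(shmid, None)
```

With a 1024 MB total and 128 identifiers, a project in this model can create one 900 MB segment and then many 1 MB segments, but a second 900 MB segment is refused. Under a per-segment limit, allowing the first 900 MB segment would have allowed every later one as well.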

Because resource controls are the administrative mechanism, this configuration can be made persistent across reboots by use of project(4), as well as through a network service. See Chapter 7 for more information on how to set resource controls.
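To give a feel for what a persistent setting looks like, the Python sketch below builds a single project(4) database line from a dictionary of resource controls. It assumes the colon-separated field layout (name, id, comment, users, groups, attributes) and the `(privilege,value,action)` attribute syntax; verify both against project(4) on the target release. The helper `project_entry`, the project name, and the values are hypothetical.

```python
def project_entry(name, projid, rctls, comment="", users="", groups=""):
    """Format one project(4)-style line with privileged, deny-on-exceed limits.

    rctls maps a resource control name to its numeric limit.
    """
    attrs = ";".join(
        f"{rctl}=(privileged,{value},deny)" for rctl, value in rctls.items()
    )
    return ":".join([name, str(projid), comment, users, groups, attrs])


# Hypothetical project with a 4 GB shared memory total and 256 semaphore ids.
line = project_entry(
    "user.dbadmin",
    100,
    {"project.max-shm-memory": 4 << 30, "project.max-sem-ids": 256},
)
print(line)
```

Generating the line from a dictionary keeps the resource control names in one place, which makes it harder to repeat the kind of misconfiguration the old abbreviated tuneable names invited.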

The following major implementation changes were made (for all the details, see the changes made to os/ipc.c, os/msg.c, os/shm.c, syscall/sem.c):

  • Message headers are allocated dynamically. Previously, all message headers were allocated at module load time, linked into a global freelist, and allocated from there. (The locking on this list also caused a scalability problem.)

  • Semaphore arrays are allocated dynamically. Previously semaphore arrays were allocated from a seminfo_semmns-sized vmem arena, which meant that allocations could fail because of fragmentation.

  • Semaphore undo structures are allocated dynamically, and are per-process and per-semaphore array. They are unlimited in number and are always as large as the semaphore array they correspond to. Previously, the number of per-process undo structures was limited and allocated at module load time. Furthermore, the undo structures each had the same fixed size. It was possible for a process to not be able to allocate an undo structure or for the process's undo structure to be full.

  • Semaphore undo structures maintain their undo values as signed integers, so no semaphore value is too large to be undone.

  • All facilities formerly allocated objects from a fixed-size namespace, allocated at module load time. All facility namespaces are now resizable and will grow as demand increases.
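The undo changes above can be sketched as a small Python model: one undo structure per (process, semaphore array) pair, created on first use, sized to match the array, and holding signed adjustments. This is a simplified illustration of the described design, not the code in the kernel sources. The class `SemUndoModel` and its methods are hypothetical, blocking is modelled by returning False, and clamping at zero on exit is a simplification of this sketch.

```python
class SemUndoModel:
    """Toy model of per-process, per-array semaphore undo state."""

    def __init__(self):
        self.arrays = {}  # semid -> list of semaphore values
        self.undo = {}    # (pid, semid) -> list of signed adjustments

    def create_array(self, semid, nsems):
        self.arrays[semid] = [0] * nsems

    def semop(self, pid, semid, semnum, op, undo=False):
        """Apply one operation; return False if it would have to block."""
        values = self.arrays[semid]
        if values[semnum] + op < 0:
            return False
        values[semnum] += op
        if undo:
            # Allocated on demand and always as large as the array itself.
            adj = self.undo.setdefault((pid, semid), [0] * len(values))
            adj[semnum] -= op  # signed, so any operation can be reversed
        return True

    def exit(self, pid):
        """Apply and free every undo structure owned by the exiting process."""
        for key in [k for k in self.undo if k[0] == pid]:
            values = self.arrays[key[1]]
            for i, adj in enumerate(self.undo.pop(key)):
                values[i] = max(0, values[i] + adj)  # clamp: sketch-only choice
```

Because the structures live in a dictionary keyed by process and array, the model has no fixed table that can fill up and no fixed per-structure size, which mirrors the two failure modes the old implementation had.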




Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Edition)
ISBN: 0131482092
Year: 2004
Pages: 244
