4.7. POSIX IPC
The evolution of the POSIX standard and associated application programming interfaces (APIs) resulted in a set of industry-standard interfaces that provide the same types of facilities as the System V IPC set: shared memory, semaphores, and message queues. They are quite similar in form and function to their System V equivalents but very different in implementation. The POSIX implementation of all three IPC facilities is built in userland libraries on top of existing IPC facilities. It uses the notion of POSIX IPC names, which essentially look like file names but need not be actual files in a file system. This POSIX name convention provides the necessary abstraction, a file descriptor, to use the Solaris file memory mapping interface, mmap(2), on which all the POSIX IPC mechanisms are built. This is very different from the System V IPC functions, for which a key value was required to fetch the proper identifier of the desired IPC resource. In System V IPC, a common method used for generating key values was the ftok(3C) (file-to-key) function, whereby a key value was generated, based on the path name of a file. POSIX eliminates the use of the key, and processes acquire the desired resource by using a file name convention. No kernel tuneable parameters are required (or available) for the POSIX IPC code. The per-process limits of the number of open files and memory address space are the only potentially limiting factors in POSIX IPC. Table 4.8 lists the POSIX APIs for the three IPC facilities.
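The path from a POSIX IPC name to a file descriptor to a shared mapping can be sketched from the application side as follows. This is a minimal illustration, not the library's internal code: the name "/seg1_demo", the 4096-byte size, and the helper names map_segment() and shared_segment_demo() are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SEG_NAME "/seg1_demo"   /* hypothetical POSIX IPC name */
#define SEG_SIZE 4096           /* arbitrary segment size for the sketch */

/* Name -> file descriptor -> shared mapping. Returns NULL on failure. */
static char *map_segment(void)
{
    int fd = shm_open(SEG_NAME, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, SEG_SIZE) == -1) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    return p == MAP_FAILED ? NULL : (char *)p;
}

/* Returns 0 if data written by a child process is visible to the parent. */
int shared_segment_demo(void)
{
    static const char text[] = "hello from child";
    assert(sizeof text <= SEG_SIZE);

    shm_unlink(SEG_NAME);       /* start from a clean name; failure is fine */
    char *seg = map_segment();
    if (seg == NULL)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: acquire the same object by name and write through it. */
        char *view = map_segment();
        if (view == NULL)
            _exit(1);
        memcpy(view, text, sizeof text);
        _exit(0);
    }

    int status = 0;
    int ok = pid > 0 && waitpid(pid, &status, 0) == pid &&
             WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
             memcmp(seg, text, sizeof text) == 0;

    munmap(seg, SEG_SIZE);
    shm_unlink(SEG_NAME);       /* remove the name once it is no longer needed */
    return ok ? 0 : -1;
}
```

No key value appears anywhere in the sketch: both processes find the segment purely through the name passed to shm_open().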
All the POSIX IPC functions are either directly or indirectly based on memory mapped files. The message queue and semaphore functions make direct calls to mmap(2), creating a memory mapped file, based on the file descriptor returned from the xx_open(3R) call. Using POSIX shared memory requires the programmer to make the mmap(2) call explicitly from the application code. The details of mmap(2) and memory mapped files are covered in subsequent chapters, but, briefly, the mmap(2) system call maps a file or some other named object into a process's address space, as shown in Figure 4.3. The address space mapping created by mmap(2) can be private or shared. It is the shared mapping capability that the POSIX IPC implementation relies on.
Figure 4.3. Process Address Space with mmap(2)
4.7.1. POSIX Shared Memory
The POSIX shared memory interfaces provide an API for support of the POSIX IPC name abstraction. The interfaces shm_open(3R) and shm_unlink(3R) do not allocate or map memory into a calling process's address space. The programmer using POSIX shared memory must create the address space mapping with an explicit call to mmap(2). Different processes that must access the same shared segment can execute shm_open(3R) on the same object, for example, shm_open("seg1", ...), and then execute mmap(2) on the file descriptor returned from shm_open(3R). Any writes to the shared segment are directed to an underlying file and thus made visible to processes that run mmap(2) on the same file descriptor or, in this case, POSIX object name. Under the covers, the shm_open(3R) call invokes open(2) to open the named object (file). Similarly, shm_unlink(3R) uses the unlink(2) system call to remove the directory entry. That is, the file (object) is removed.
4.7.2. POSIX Semaphores
The POSIX specification provides for two types of semaphores that can be used for the same purposes as System V semaphores but that are implemented differently.
POSIX named semaphores follow the POSIX IPC name convention discussed earlier and are created with the sem_open(3R) call. POSIX also defines unnamed semaphores, which do not have a name in the file system space and are memory based. Additionally, a set of semaphore interfaces that are part of the Solaris threads library provides the same level of functionality as POSIX unnamed semaphores but uses a different API. Table 4.9 lists the different semaphore interfaces that currently ship with Solaris.
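The difference between the two POSIX types shows up mainly at creation and destruction time. The sketch below is an application-level illustration under assumptions: the name "/sem1_demo" and the helper names are hypothetical, and the initial count of 1 is arbitrary. It creates one semaphore of each type and runs the same operations on both.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/stat.h>

#define SEM_NAME "/sem1_demo"   /* hypothetical POSIX IPC name */

/* Operations that are identical for both kinds of POSIX semaphore.
 * Expects an initial count of 1 and returns the final count, or -1. */
static int exercise(sem_t *s)
{
    int value = -1;

    assert(s != NULL);
    if (sem_wait(s) == -1)              /* count: 1 -> 0 */
        return -1;
    if (sem_trywait(s) == 0)            /* must fail: count is already 0 */
        return -1;
    if (sem_post(s) == -1)              /* count: 0 -> 1 */
        return -1;
    if (sem_getvalue(s, &value) == -1)
        return -1;
    return value;
}

/* Unnamed: the caller supplies the storage; created by sem_init(). */
int unnamed_semaphore_demo(void)
{
    sem_t s;

    if (sem_init(&s, 0, 1) == -1)       /* pshared = 0: this process only */
        return -1;
    int v = exercise(&s);
    sem_destroy(&s);
    return v;
}

/* Named: located through a POSIX IPC name; created by sem_open(). */
int named_semaphore_demo(void)
{
    sem_unlink(SEM_NAME);               /* discard any stale object */
    sem_t *s = sem_open(SEM_NAME, O_CREAT | O_EXCL, 0600, 1);
    if (s == SEM_FAILED)
        return -1;
    int v = exercise(s);
    sem_close(s);
    sem_unlink(SEM_NAME);
    return v;
}
```

Both helpers return 1 on success, the count left behind after one wait and one post.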
Note the common functions for named and unnamed POSIX semaphores: The actual semaphore operations (sem_wait(3R), sem_trywait(3R), sem_post(3R), and sem_getvalue(3R)) are used for both types of semaphores. The creation and destruction interfaces are different. The Solaris implementations of the POSIX sem_init(3R), sem_destroy(3R), sem_wait(3R), sem_trywait(3R), and sem_post(3R) functions actually invoke the Solaris threads library functions of the same name through a jump-table mechanism in the Solaris POSIX library. The jump table is a data structure that contains function pointers to semaphore routines in the Solaris threads library, libthread.so.1. The use of POSIX named semaphores begins with a call to sem_open(3R), which returns a pointer to a sem_t object; the sem_t type is defined in the /usr/include/semaphore.h header file. The sem_t structure defines what a POSIX semaphore looks like, and subsequent semaphore operations reference the sem_t object. The fields in the sem_t structure include a count (sem_count), a semaphore type (sem_type), and a magic number (sem_magic). sem_count reflects the actual semaphore value. sem_type defines the scope or visibility of the semaphore, either USYNC_THREAD, which means the semaphore is visible only to other threads in the same process, or USYNC_PROCESS, which means the semaphore is visible to other processes running on the same system. sem_magic is simply a value that uniquely identifies the synchronization object type as a semaphore rather than a condition variable, mutex lock, or reader/writer lock (see /usr/include/synch.h). Semaphores within the same process are maintained by the POSIX library code on a linked list of semaddr structures. The structure fields and linkage are illustrated in Figure 4.4.
Figure 4.4. POSIX Named Semaphores
The linked list exists within the process's address space, not in the kernel.
semheadp points to the first semaddr structure on the list, and sad_next provides the pointer for support of a singly linked list. The character array sad_name[] holds the object name (file name), sad_addr points to the actual semaphore, and sad_inode contains the inode number of the file that was passed in the sem_open(3R) call. Here is the sequence of events.
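As a rough picture of the list just described, the following sketch declares a structure with the four fields named in the text and two helpers that maintain a singly linked list headed by semheadp. The field types, the name length limit, and the helper functions semaddr_add() and semaddr_find() are assumptions for illustration; in particular, searching the list by inode number is a guess at how such a list might be consulted, not a statement about the library's code.

```c
#include <assert.h>
#include <semaphore.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define SAD_NAME_MAX 64                 /* assumed limit, not from the source */

typedef struct semaddr {
    struct semaddr *sad_next;           /* next entry on the list */
    char            sad_name[SAD_NAME_MAX]; /* object (file) name */
    sem_t          *sad_addr;           /* the actual semaphore */
    ino_t           sad_inode;          /* inode number of the file */
} semaddr_t;

static semaddr_t *semheadp = NULL;      /* head of the per-process list */

/* Push a new entry on the front of the list. Returns NULL on failure. */
semaddr_t *semaddr_add(const char *name, sem_t *addr, ino_t inode)
{
    assert(name != NULL);
    if (strlen(name) >= SAD_NAME_MAX)
        return NULL;

    semaddr_t *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    strcpy(n->sad_name, name);
    n->sad_addr = addr;
    n->sad_inode = inode;
    n->sad_next = semheadp;
    semheadp = n;
    return n;
}

/* Hypothetical lookup: find the entry whose file has the given inode. */
semaddr_t *semaddr_find(ino_t inode)
{
    for (semaddr_t *p = semheadp; p != NULL; p = p->sad_next)
        if (p->sad_inode == inode)
            return p;
    return NULL;
}
```

Because the list lives in ordinary process memory, every process that opens named semaphores would keep its own copy of such a list.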
The POSIX semaphore code uses the /tmp file system for the creation and storage of the files that the code memory maps according to the name argument passed in the sem_open(3R) call. For each semaphore, a lock file and a data file are created in /tmp, with the file name prefix of .SEML for the lock file and .SEMD for the data file. The full file name is the prefix plus the string passed as an argument to sem_open(3R), without the leading slash character. For example, if a sem_open(3R) call was issued with "/sem1" as the first argument, the resulting file names in /tmp would be .SEMLsem1 and .SEMDsem1. This file name convention is used in the message queue code as well, as we'll see shortly. If a new semaphore is being created, the following events occur.
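The file name rule described above can be written as a small helper. This is an illustration of the naming convention only, not the library's code; the function name sem_file_name() and its buffer-based interface are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "/tmp/<prefix><name without its leading slash>" into buf.
 * Returns 0 on success, -1 if the result does not fit in len bytes. */
int sem_file_name(const char *prefix, const char *ipc_name,
                  char *buf, size_t len)
{
    assert(prefix != NULL && ipc_name != NULL && buf != NULL);

    if (ipc_name[0] == '/')
        ipc_name++;                     /* drop the leading slash */

    int n = snprintf(buf, len, "/tmp/%s%s", prefix, ipc_name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with the prefixes ".SEML" and ".SEMD" and the name "/sem1", the helper produces /tmp/.SEMLsem1 and /tmp/.SEMDsem1, matching the example in the text.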
The sem_t structure contains two additional fields not shown in the diagram. In semaphore.h, they are initialized as extra space in the structure (padding). The space stores a mutex lock and condition variable used by the library code to synchronize access to the semaphore and to manage blocking on a semaphore that's not available to a calling thread. The remaining semaphore operations follow the expected, documented behavior for using semaphores in code.
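To show how a mutex lock and a condition variable can cooperate to provide semaphore behavior, here is a minimal sketch built on POSIX threads. It is not the Solaris implementation: the type demo_sem_t and every function name are hypothetical, and the real library works on the sem_t layout described earlier.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    unsigned int    count;      /* current semaphore value */
    pthread_mutex_t lock;       /* protects count */
    pthread_cond_t  nonzero;    /* signaled when count becomes positive */
} demo_sem_t;

int demo_sem_init(demo_sem_t *s, unsigned int value)
{
    assert(s != NULL);
    s->count = value;
    if (pthread_mutex_init(&s->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&s->nonzero, NULL) != 0) {
        pthread_mutex_destroy(&s->lock);
        return -1;
    }
    return 0;
}

void demo_sem_wait(demo_sem_t *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)               /* sleep until a post arrives */
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Returns 0 if the count was decremented, -1 if it was already zero. */
int demo_sem_trywait(demo_sem_t *s)
{
    int rc = -1;

    pthread_mutex_lock(&s->lock);
    if (s->count > 0) {
        s->count--;
        rc = 0;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}

void demo_sem_post(demo_sem_t *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->nonzero);   /* wake one waiter, if any */
    pthread_mutex_unlock(&s->lock);
}

static void *poster(void *arg)
{
    demo_sem_post(arg);
    return NULL;
}

/* Returns 0 if a wait on a zero count completes after another thread posts. */
int demo_sem_handoff(void)
{
    demo_sem_t s;
    pthread_t t;

    if (demo_sem_init(&s, 0) != 0)
        return -1;
    if (pthread_create(&t, NULL, poster, &s) != 0)
        return -1;
    demo_sem_wait(&s);                  /* returns only after the post */
    pthread_join(t, NULL);
    int rc = demo_sem_trywait(&s);      /* count is 0 again: must fail */
    pthread_cond_destroy(&s.nonzero);
    pthread_mutex_destroy(&s.lock);
    return rc == -1 ? 0 : -1;
}
```

The while loop around pthread_cond_wait() matters: a woken thread rechecks the count under the lock before it decrements, so spurious wakeups are harmless.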
4.7.3. POSIX Message Queues
POSIX message queues are constructed on a linked list built by the internal libposix4 library code. Several data structures are defined in the implementation, as shown in Figure 4.5. We opted not to show every member of the message queue structure, in the interests of space and readability.
Figure 4.5. POSIX Message Queue Structures
The essential interfaces for using message queues are mq_open(3R), which opens, or creates and opens, a queue, making it available to the calling process, and mq_send(3R) and mq_receive(3R), for sending and receiving messages. Other interfaces (see Table 4.8) manage queues and set attributes, but our discussion focuses on the message queue infrastructure, built on the open, send, and receive functions. A POSIX message queue is described by a message queue header, a data structure created and initialized when the message queue is first created. The message queue header contains information on the queue, such as the total size in bytes (mq_totsize), maximum size of each message (mq_maxsz), maximum number of messages allowed on the queue (mq_maxmsg), current number of messages (mq_current), current number of threads waiting to receive messages (mq_waiters), and the current maximum message priority (mq_curmaxprio). Some attributes are tuneable with mq_setattr(3R). The library code sets default values of 128 for the maximum number of messages, 1024 for the maximum size of a single message, and 32 for the maximum number of message priorities. If necessary, you can increase the message size and number of messages when the queue is created, by populating an attributes structure and passing it on the mq_open(3R) call. Note that the POSIX specification has mq_setattr(3R) change only the queue flags; it ignores the message size and message count members of the attributes structure, so those limits are best set at creation time. The message pointers, mq_headpp and mq_tailpp, in the header do not point directly to the messages on the linked list. That is, they do not contain the address of the message headers.
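One way to picture links that are not addresses is to store each link as a byte offset from the start of the shared region. The sketch below is an illustration only: region_hdr, region_msg, and the helper functions are hypothetical and far simpler than the library's structures. It maps the same temporary file twice in one process to mimic two processes that see the region at different virtual addresses, appends messages through one view, and walks the list through the other.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096
#define TEXT_MAX    32

typedef struct {
    size_t head_off;    /* offset of the first message, 0 = empty list */
    size_t tail_off;    /* offset of the last message, 0 = empty list */
    size_t free_off;    /* offset of the next unused byte */
} region_hdr;

typedef struct {
    size_t next_off;    /* offset of the next message, 0 = end of list */
    char   text[TEXT_MAX];
} region_msg;

/* Convert an offset into a pointer that is valid for this mapping only. */
static region_msg *msg_at(void *base, size_t off)
{
    return off ? (region_msg *)((char *)base + off) : NULL;
}

/* Append a message using offsets only. Returns 0 on success. */
int region_append(void *base, const char *text)
{
    region_hdr *h = base;

    assert(base != NULL && text != NULL);
    if (h->free_off == 0)               /* zero-filled region: first use */
        h->free_off = sizeof *h;
    if (strlen(text) >= TEXT_MAX ||
        h->free_off + sizeof(region_msg) > REGION_SIZE)
        return -1;

    size_t off = h->free_off;
    region_msg *m = msg_at(base, off);
    m->next_off = 0;
    strcpy(m->text, text);
    if (h->tail_off != 0)
        msg_at(base, h->tail_off)->next_off = off;
    else
        h->head_off = off;
    h->tail_off = off;
    h->free_off += sizeof *m;
    return 0;
}

/* Returns 0 if a list built through one mapping reads back through another. */
int offsets_demo(void)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, REGION_SIZE) == -1) {
        fclose(f);
        return -1;
    }
    void *a = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *b = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int ok = 0;

    if (a != MAP_FAILED && b != MAP_FAILED &&
        region_append(a, "first") == 0 && region_append(a, "second") == 0) {
        region_msg *m = msg_at(b, ((region_hdr *)b)->head_off);
        ok = a != b && m != NULL && strcmp(m->text, "first") == 0;
        m = ok ? msg_at(b, m->next_off) : NULL;
        ok = ok && m != NULL && strcmp(m->text, "second") == 0 &&
             m->next_off == 0;
    }
    if (a != MAP_FAILED)
        munmap(a, REGION_SIZE);
    if (b != MAP_FAILED)
        munmap(b, REGION_SIZE);
    fclose(f);
    return ok ? 0 : -1;
}
```

Had the links been stored as raw pointers taken from the first mapping, following them through the second mapping would have referenced the wrong addresses.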
Because each process that references the message queue can see the shared mapping at a different virtual address within its own address space, mq_headpp and mq_tailpp are implemented as offsets into the shared region. A message descriptor maintains additional information about the queue, such as the file permission flags (read-only or read/write) and the magic number identifying the type of POSIX named object. A second structure (mq_dn) maintains per-process flags on the message queue, allowing different processes to specify either blocking or nonblocking behavior on the message queue files. This is analogous to regular file flags, for which a file descriptor for an open file is maintained at the process level, and different processes can have different flags set on the same file. (For example, one process could have the file opened for read/write and another process could have the same file opened read-only.) With the big picture in place (Figure 4.5), let's look at what happens when a message queue is created and opened.
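From the application side, creating a queue with explicit attributes and a per-process nonblocking flag might look like the following sketch. The queue name "/mq1_demo" and the helper mq_create_demo() are hypothetical, and the limits a system accepts for mq_maxmsg and mq_msgsize vary, so the values passed in are assumptions rather than recommendations.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <mqueue.h>
#include <stddef.h>
#include <sys/stat.h>

#define MQ_NAME "/mq1_demo"     /* hypothetical POSIX IPC name */

/* Create a queue with the requested limits, then report the attributes
 * the system actually recorded. Returns 0 on success, -1 on failure. */
int mq_create_demo(long maxmsg, long msgsize, struct mq_attr *out)
{
    struct mq_attr attr = {0};

    assert(out != NULL);
    attr.mq_maxmsg = maxmsg;    /* maximum number of messages on the queue */
    attr.mq_msgsize = msgsize;  /* maximum size of a single message */

    mq_unlink(MQ_NAME);         /* discard any stale queue; failure is fine */
    mqd_t q = mq_open(MQ_NAME, O_CREAT | O_EXCL | O_RDWR | O_NONBLOCK,
                      0600, &attr);
    if (q == (mqd_t)-1)
        return -1;

    int rc = mq_getattr(q, out);
    mq_close(q);
    mq_unlink(MQ_NAME);
    return rc;
}
```

The O_NONBLOCK flag belongs to this open of the queue, so another process could open the same name without it and get blocking behavior.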
POSIX message queues offer an interesting feature that is not available with System V message queues: automatic notification to a process or thread when a message has been added to a queue. The mq_notify(3R) interface can be called by a process that needs to be notified of the arrival of a message. To continue with the sequence for the next code segment:
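As an application-level illustration of the notification feature (not the library's internal sequence), the sketch below registers for a signal on an empty queue, sends one message, and waits for the signal. The queue name, the choice of SIGUSR1, the two-second timeout, and the helper name mq_notify_demo() are all assumptions; the sketch also assumes a single-threaded caller, because it uses sigprocmask(2).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <mqueue.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

#define NOTIFY_MQ_NAME "/mq_notify_demo"    /* hypothetical name */

/* Returns 0 if the arrival of a message on an empty queue raises SIGUSR1. */
int mq_notify_demo(void)
{
    struct mq_attr attr = {0};
    struct sigevent sev;
    sigset_t set, old;
    int result = -1;

    attr.mq_maxmsg = 4;
    attr.mq_msgsize = 64;

    /* Block the signal so it stays pending until we ask for it. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &set, &old) == -1)
        return -1;

    mq_unlink(NOTIFY_MQ_NAME);
    mqd_t q = mq_open(NOTIFY_MQ_NAME, O_CREAT | O_EXCL | O_RDWR, 0600, &attr);
    if (q != (mqd_t)-1) {
        memset(&sev, 0, sizeof sev);
        sev.sigev_notify = SIGEV_SIGNAL;    /* deliver a signal ... */
        sev.sigev_signo = SIGUSR1;          /* ... namely SIGUSR1 */

        if (mq_notify(q, &sev) == 0 && mq_send(q, "ping", 5, 0) == 0) {
            struct timespec timeout = { 2, 0 };
            if (sigtimedwait(&set, NULL, &timeout) == SIGUSR1)
                result = 0;
        }
        mq_close(q);
        mq_unlink(NOTIFY_MQ_NAME);
    }
    sigprocmask(SIG_SETMASK, &old, NULL);   /* restore the caller's mask */
    return result;
}
```

Registration is one-shot in POSIX: after the notification is delivered, the process must call mq_notify() again to be told about later arrivals.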
For receiving messages
Our description omits some subtle details, mostly around the priority mechanism available for POSIX message queues. A message priority is specified in the mq_send(3R) call and is returned to the receiver by the mq_receive(3R) call. Messages with higher priorities (larger numeric values) are inserted into the queue before messages of lower priority, so higher-priority messages are kept at the front of the queue and are removed first.
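The ordering rule can be observed from the application side with a short sketch. The queue name, the three sample messages, and the helper name mq_priority_demo() are assumptions for illustration; the sketch sends messages in mixed priority order and records the priorities in the order they come back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <mqueue.h>
#include <string.h>
#include <sys/stat.h>

#define PRIO_MQ_NAME "/mq_prio_demo"    /* hypothetical name */

/* Send three messages out of priority order and store, in received[],
 * the priority of each message as it is removed from the queue.
 * Returns 0 on success, -1 on failure. */
int mq_priority_demo(unsigned int received[3])
{
    static const struct {
        const char  *text;
        unsigned int prio;
    } msgs[3] = { { "low", 1 }, { "high", 9 }, { "mid", 5 } };
    struct mq_attr attr = {0};
    char buf[32];                       /* must hold mq_msgsize bytes */
    int rc = 0;

    assert(received != NULL);
    attr.mq_maxmsg = 4;
    attr.mq_msgsize = sizeof buf;

    mq_unlink(PRIO_MQ_NAME);
    mqd_t q = mq_open(PRIO_MQ_NAME, O_CREAT | O_EXCL | O_RDWR, 0600, &attr);
    if (q == (mqd_t)-1)
        return -1;

    for (int i = 0; i < 3 && rc == 0; i++)
        if (mq_send(q, msgs[i].text, strlen(msgs[i].text) + 1,
                    msgs[i].prio) == -1)
            rc = -1;

    for (int i = 0; i < 3 && rc == 0; i++)
        if (mq_receive(q, buf, sizeof buf, &received[i]) == -1)
            rc = -1;

    mq_close(q);
    mq_unlink(PRIO_MQ_NAME);
    return rc;
}
```

Although "low" was sent first, the priorities come back as 9, 5, 1, because the highest-priority message is always removed first.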