Section 2.2. Process Model Evolution | Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Edition)

2.2. Process Model Evolution

The multithreaded process model changed significantly in Solaris 10, but the evolution began in Solaris 8 with the introduction of an alternative threads library (/usr/lib/lwp/libthread.so). In Solaris 9, the new threads library became the default for multithreaded applications. Solaris 10 went further with the Process Model Unification project, which integrated the threads library (libthread.so) into the standard C library (libc.so), creating a single process model for all processes in Solaris. In Solaris 10, threaded and nonthreaded processes have the same process objects and components. These changes, while involving significant work at the library and kernel level, are not visible to users and developers. Source and binary compatibility were maintained; running, developing, compiling, and using threaded applications in Solaris 10 is consistent with previous releases. The only difference worth noting is that writing and maintaining threaded applications is simpler in two specific areas: signal management and concurrency management. Maintaining and debugging threaded applications is also easier, since the new model is inherently much less complex.

2.2.1. Thread Model Evolution

The thread model in Solaris was originally a multilevel MxN model, in which a user thread was something separate and distinct from an LWP. User threads (M) were multiplexed onto a potentially smaller number of LWPs (N) by a user thread scheduler implemented in the thread library (libthread.so). There was not a one-to-one relationship between user threads and LWPs unless the developer explicitly created bound threads, either by passing the THR_BOUND flag to thr_create() or, when pthread_create() was used to create threads, by setting the contention scope attribute to PTHREAD_SCOPE_SYSTEM with pthread_attr_setscope(). The original MxN model worked well for many years, but some inherent difficulties in the implementation were extremely complex to overcome.

Signal behavior. Delivery of asynchronous signals was problematic, since the thread targeted to receive the signal may not have been linked to an LWP when delivery was attempted.
Thread scheduling. The threads library implemented a scheduler (not to be confused with the kernel scheduler). The threads library scheduler managed the scheduling of unbound user threads onto LWPs and maintained a library-level priority scheme. In some applications and workloads, the latency induced by this level of scheduling resulted in performance issues. Also, the scheduling lock maintained by the library could become a point of contention, impacting scalability. The expected level of concurrency was not always met, because concurrency was governed by the number of available LWPs.
Maintaining the LWP pool. Keeping a sufficient number of LWPs available such that runnable user threads had the resources necessary to execute was a complex task performed by the threads library in the absence of specific hints from the application (the now obsolete thr_setconcurrency() API).

In addition to the issues listed above, the evolution of technology challenged some of the underlying assumptions that drove the original MxN implementation. First and foremost were processor performance and the cost of creating threads. The original model was intended to minimize the cost of creating threads by not requiring underlying kernel resources to be allocated, and to allow hundreds or thousands of threads to exist in a process without imposing undue overhead on the kernel. Processor performance has advanced to a point where this trade-off is no longer worth the complexity of maintaining the old model. Thread creation with the new model, while more costly in absolute terms, is still a relatively fast operation.

Technology advances and issues with the original model, coupled with a desire to enhance specific features for threaded applications, led to the implementation of a new threads library. The new library was architected as a 1:1 model, in which every user thread is created with an LWP and associated kernel thread. For all intents and purposes, a user thread is an LWP in Solaris 10.

Note that the original threads model is discussed in detail in the first edition of Solaris Internals. A technical white paper, titled Multithreading in the Solaris Operating Environment and written by Sun Engineer Phil Harman, discusses the new threads library for Solaris 8 and Solaris 9 in detail. This white paper is available online at:

http://www.sun.com/software/whitepapers/solaris9/multithread.pdf

2.2.2. Unified Process Model

Before Solaris 10, two process models existed: single-threaded processes (not linked with libthread.so) and multithreaded processes (linked with libthread.so). The existence of two process models led to some fundamental problems.

Libraries (like libnsl.so) that want to create helper threads have complex code to do one thing if the application to which libnsl.so is linked is single threaded and to do another if it is multithreaded.
Some libraries use multithreading and satisfy their need for libthread.so by linking with it when the library is built. Such libraries can still be targeted for a dlopen(3C) call from an application. If a single-threaded application does call dlopen(), it becomes a multithreaded application, a condition for which it was not built and which is not necessarily expected by the developer.
libc.so has become quite complex over time. It must operate in either process model and be prepared to switch to the multithreaded model whenever the application issues a dlopen(3C) on libthread.so. This induces enormous complexity into libc.so and can lead to performance issues.
New developments in thread local storage (TLS), requiring cooperation between the compilers and the Solaris libraries, can only be accomplished in a multithreaded process model. This means that the Solaris libraries themselves cannot take advantage of TLS, since they must be prepared to operate with both single-threaded and multithreaded applications.

The list above represents the more salient issues with maintaining two process models. It was time to remove the complexity and confusion and implement a single process model in Solaris by integrating the code in libthread into libc. In Solaris 10, all thread APIs that were previously in libthread and libpthread are now in libc. libthread and libpthread are still provided as stubs for binaries that require resolving to libthread or libpthread as a result of library specifications in the build process. The actual threads code is in libc and will ultimately be resolved from libc.

Unifying the process model required making another change in terms of the libraries shipped in Solaris: the removal of libc.a, an archive version of libc for creating statically linked binaries. The main problem with static libc and threading is that the static libc cannot contain multithreading interface functions. All multithreading interface functions require initialization before main() is called, and this initialization occurs through the init phase of dynamic linking and cannot occur with a static threading library. Thus, we cannot provide a static libc without special code in all the threads API source files to suppress their contents when being compiled statically.

Also, statically linked programs cannot assume they are running in a multithreaded environment. This puts a constraint on all library code (at least libraries that are compiled for both static and dynamic linking). There must be conditional statements to take care of the two different possible process models. No library code can take advantage of the newly provided compiler-supported thread local storage. Also, unsolvable problems arise with binaries that are partially statically linked and that statically link to libc.a. If a partially statically linked binary loads a shared object through a dlopen() call, the dynamic version of libc is loaded into the process. Much, but not all, of libc is already in the binary (as a result of the static linking), and the dynamic linker will resolve calls from the libc opened by dlopen() to those libc functions already in the binary. But the libc functions already in the binary were not compiled for multithreading. The process would suddenly become multithreaded, calls would be made from the dynamic libc into the application's copies of static components of libc, and chaos would ensue.

For these reasons, it was decided to stop shipping archive versions of bundled Solaris libraries. Note that for 64-bit Solaris, 64-bit versions of bundled archive libraries were not shipped, so 64-bit applications have been dynamically linking to libc and others for several years without incident.

No longer providing a libc for static linking required some changes in the root file system organization of Solaris, since several of the binaries that reside in /sbin were statically linked. With Solaris 10, many of the shared object libraries that historically shipped in /usr/lib are now part of the root file system, in /lib. All binaries in /sbin are now dynamically linked; moving the shared object libraries to /lib makes them available early in the boot process, after root has been mounted but before /usr is mounted, for the binaries in /sbin that depend on them.

We can summarize as follows:

The threads model has changed significantly in Solaris 10 and OpenSolaris. With the 1:1 model, all threads are LWPs and are immediately visible to the kernel scheduler.
The process image is now consistent for threaded and nonthreaded processes. libthread and libpthread have been integrated into libc; threaded applications no longer need to link to libthread.
The changes made with the new threads library and process model unification did not impact source or binary compatibility.
Archive versions of bundled Solaris libraries are no longer available. Applications must dynamically link to libraries in /lib and /usr/lib.