Chapter 2: Tru64 UNIX TruCluster Server Overview


Before diving into the world of designing and configuring a Tru64 UNIX cluster, a brief overview of the Tru64 UNIX operating system will help set the table. Consider this chapter the appetizer and the following chapters the main meal. To help you digest the main topics in this book, we will discuss several concepts and features of Tru64 UNIX and of the TruCluster Server software that is now part of the operating system.

2.1 Tru64 UNIX Overview

The name of this operating system speaks volumes. First, it is a UNIX-based operating system. It falls towards the middle of the UNIX family tree because it draws some of its characteristics from both the BSD[1] and System V[2] sides of the family. It also has a healthy dose of core code created by Compaq engineers.

Figure 2-1 depicts several of the common UNIX variants. Note Tru64 UNIX at the bottom of the diagram.

Figure 2-1: Tru64 UNIX History

Second, Tru64 UNIX is truly a 64-bit operating system. The virtual addresses used in the system are indeed 64 bits, providing a huge virtual address space and supporting large file offsets and sizes. So, is that it? Are those the distinguishing features of the system? There are actually many more features of the operating system, which we will visit in the first part of this chapter. If you are familiar with another UNIX system, you may want to take only a quick look at the first section of this chapter, but plan on slowing down and carefully reading the TruCluster Server Overview (section 2.7).

2.1.1 Operating System Features

Tru64 UNIX has rapidly expanded its capabilities to the point where it provides the ability to support a Single System Image (SSI) Cluster option (as discussed in the previous chapter). Tru64 UNIX has been an integral part of the computer mix at many sites for many years – even before full-bodied clustering (SSI) was available.

Which features attracted customers to Tru64 UNIX before the advent of clustering? As you will see, there are many. We'll point out some features in the next few sections and relate those features to TruCluster Server (the focus of this book).

2.1.2 Mach Kernel

Lurking at the very heart of Tru64 UNIX are elements of the Mach kernel. Mach is a system created at Carnegie Mellon University. It includes the notions of tasks and threads that figure prominently within the workings of Tru64 UNIX. A "task" represents a running program, while a "thread" is a schedulable entity within that program. Historically, programs were written with a single thread. Most of the advanced UNIX variants provide the ability to create multi-threaded programs so that they can take better advantage of systems with multiple CPUs.
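The relationship between a task and its threads can be pictured with a small sketch. It uses Python's standard threading module purely as an illustration (it is not the Mach or Tru64 UNIX interface), and the names worker and run_task are our own. Each thread is scheduled separately, yet all of them report the same process ID because they live inside one running program.

```python
import os
import threading


def worker(name, results):
    # Every thread runs inside the same process (the "task"), so each one
    # sees the same process ID and can write to the same shared dictionary.
    results[name] = os.getpid()


def run_task(thread_count=4):
    """Start several threads in this process and collect what they report."""
    results = {}
    threads = [
        threading.Thread(target=worker, args=(f"thread-{i}", results))
        for i in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish before reading results
    return results


if __name__ == "__main__":
    for name, pid in sorted(run_task().items()):
        print(name, "ran in process", pid)
```

The sketch shows only the containment relationship (one task, many threads); it makes no claim about how the threads are spread across CPUs.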

The Mach kernel is also touted as a "Microkernel" (a small, compartmentalized kernel supported by many kernel mode threads and user mode processes), despite the fact that earlier releases, such as the V2.5 upon which Tru64 UNIX was built, were actually monolithic in nature. While not central to the function of TruCluster Server, this notion is important to the ongoing development of TruCluster Server and Tru64 UNIX in general. Essentially, key alterations in the system kernel can be implemented much more rapidly with a microkernel (or even a pseudo-microkernel) since the subsystems are very well defined and somewhat isolated from one another. Note that Tru64 UNIX is not strictly using the microkernel strategy but borrows heavily from it (we'd love to say the subsystems are completely distinct from one another, but that's just not true). This provides us with a flexible software product (Tru64 UNIX and TruCluster Server) to which new features can be added relatively quickly. Indeed, TruCluster Server itself is an example of rapid adaptation of the operating system to include new features and subsystems.

Some cluster components are implemented as kernel threads. Others are implemented as process-based code consisting of one or more threads. Still others are subsystems within the kernel. These components ultimately rely on system functions partially derived from Mach. Many other cluster components are implemented as driver-level kernel code. The following sections will briefly develop and introduce many of the key system components and cluster components. All cluster components will be discussed in subsequent chapters of the book.

2.1.3 Virtual Memory

The system uses virtual addresses, which are translated into physical addresses to provide access to data and code in memory (or I/O space). The previous sentence could be used to describe just about any modern operating system. Tru64 UNIX has solved the problem of representing a virtual address space consisting of 2^64 bytes of potential addressability (most other UNIX variants are years behind Compaq, now HP, in developing 64-bit systems). It does this using a clever three-level page table scheme that we don't need to detail here. The point is that it is a key feature of the system and is used heavily by all components including the TruCluster Server components.
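For readers who want a feel for what a three-level scheme does, here is a rough sketch of how a virtual address could be carved into three table indexes plus a page offset. The 8 KB page size matches the vmstat output later in this chapter (128 MB in 16384 pages); the 8-byte page table entry, and therefore the 10-bit index per level, is our assumption for illustration and not a statement of the actual kernel layout. The function name split_virtual_address is hypothetical.

```python
PAGE_SIZE = 8 * 1024                         # 8 KB pages (128 MB / 16384 pages)
OFFSET_BITS = PAGE_SIZE.bit_length() - 1     # 13 bits select a byte in a page

PTE_SIZE = 8                                 # ASSUMPTION: 8-byte table entries
ENTRIES_PER_TABLE = PAGE_SIZE // PTE_SIZE    # one table fills one page: 1024
INDEX_BITS = ENTRIES_PER_TABLE.bit_length() - 1   # 10 bits per level

# Bits actually translated by three levels under these assumptions.
MAPPED_BITS = OFFSET_BITS + 3 * INDEX_BITS


def split_virtual_address(va):
    """Return (level1, level2, level3, offset) for a virtual address."""
    mask = ENTRIES_PER_TABLE - 1
    offset = va & (PAGE_SIZE - 1)
    page_number = va >> OFFSET_BITS
    level3 = page_number & mask
    level2 = (page_number >> INDEX_BITS) & mask
    level1 = (page_number >> (2 * INDEX_BITS)) & mask
    return level1, level2, level3, offset


def join_virtual_address(level1, level2, level3, offset):
    """Inverse of split_virtual_address, for addresses within MAPPED_BITS."""
    page_number = (((level1 << INDEX_BITS) | level2) << INDEX_BITS) | level3
    return (page_number << OFFSET_BITS) | offset


if __name__ == "__main__":
    print("bits covered by three levels:", MAPPED_BITS)
    print(split_virtual_address(0x12345678))
```

Under these assumptions the three levels translate 43 bits of each address, which is why such a scheme stays compact even though a pointer is 64 bits wide.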

2.1.4 Unified Buffer Cache

The Unified Buffer Cache (UBC) is an innovation through which Tru64 UNIX can tune itself, at least partially. The memory caching needs of the file systems tend to be in direct conflict with the memory needs of processes. If the system is experiencing a burst of I/O activity, the file system caching memory count (generally referred to as the UBC page count) will increase. If the virtual memory requests from processes become heavy, the pages are taken back from the UBC and used for process memory. And so the pendulum can swing back and forth throughout the life of your system without your lifting a finger. Pretty impressive, huh?

To be fair, Tru64 UNIX is not the only UNIX that uses this strategy.

As you will see, the UBC is used by several of the I/O components that make clustering possible.
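The give-and-take described in this section can be pictured with a toy model. This is only a sketch of the idea, with a class name (MemoryPool) and a policy that we made up for illustration; the real kernel's page reclamation is considerably more involved.

```python
class MemoryPool:
    """Toy model: a fixed pool of pages shared by the UBC and processes."""

    def __init__(self, total_pages):
        self.free = total_pages
        self.ubc = 0        # pages currently caching file data
        self.process = 0    # pages currently backing process memory

    def cache_file_data(self, pages):
        """An I/O burst: the UBC grows, but only into pages that are free."""
        granted = min(pages, self.free)
        self.free -= granted
        self.ubc += granted
        return granted

    def allocate_process_memory(self, pages):
        """Process demand: use free pages first, then take pages from the UBC."""
        from_free = min(pages, self.free)
        self.free -= from_free
        from_ubc = min(pages - from_free, self.ubc)
        self.ubc -= from_ubc
        self.process += from_free + from_ubc
        return from_free + from_ubc


if __name__ == "__main__":
    pool = MemoryPool(total_pages=100)
    pool.cache_file_data(80)            # heavy file I/O fills the cache
    pool.allocate_process_memory(50)    # processes then reclaim part of it
    print("free:", pool.free, "ubc:", pool.ubc, "process:", pool.process)
```

The point of the model is the direction of flow: cache pages are a loan, not a reservation, so process demand can pull them back without any administrator action.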

2.1.5 Shared Libraries

Shared libraries provide for the sharing of code at the function level. UNIX has always been good at sharing code at the process level, meaning that two users who both happen to be running the vi(1) editor at the same time, for example, will be sharing the single copy of the vi code that is brought into memory. But UNIX has traditionally been weak at sharing at the function level. So if one process were running the vi editor and the other were running emacs(1) (we'll assume that these two editors use many of the same functions), traditional UNIX would have brought two copies of the potentially shared functions into memory.

Shared libraries provide a mechanism whereby any program that uses shared library functions (think printf(3)) will reference the single copy of the function code that has been brought into memory. Note that the system is an "on demand" system, so none of the shared functions are in memory until the first request causes one to be brought in. Likewise, as soon as there are no users of the function, the memory that it occupies will be freed.
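To see a program reach into a shared library at run time, consider the following sketch. It uses Python's standard ctypes module as a stand-in for the dynamic loader and assumes a POSIX system where the C library can be located; the helper names load_c_library and shared_strlen are ours. The strlen(3) code it calls is the copy living in the shared C library, not a private copy bundled with the script.

```python
import ctypes
import ctypes.util


def load_c_library():
    # find_library returns a name such as "libc.so.6", or None if it cannot
    # locate the library. On POSIX systems CDLL(None) falls back to the
    # symbols already loaded into the running process, which include libc.
    return ctypes.CDLL(ctypes.util.find_library("c"))


def shared_strlen(text):
    """Call strlen(3) from the shared C library on an ASCII string."""
    libc = load_c_library()
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t
    return libc.strlen(text.encode("ascii"))


if __name__ == "__main__":
    print(shared_strlen("TruCluster"))
```

Explicit loading like this is a different path from link-time binding, but the function code that ends up being executed is shared in the same way.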

The process-level TruCluster Server code is linked against shared libraries. The following example shows that the Cluster Application Availability Daemon (caad(8)) is linked against shared libraries. We then document which shared libraries are referenced within the caad process.

 # file /usr/sbin/caad
 /usr/sbin/caad: COFF format alpha dynamically linked, demand paged executable or object module stripped - version 3.13-14

 # odump -Dl /usr/sbin/caad

                         ***LIBRARY LIST SECTION***
         Name             Time-Stamp           CheckSum    Flags Version
 /usr/sbin/caad:
         libpolicy.so     Jan 16 04:39:17 2002 0x7958cdb5    0   osf.1
         libevm.so        Jan 15 17:37:17 2002 0xde4a5d09    0   osf.1
         libclu.so        Jan 15 17:36:05 2002 0xd148a817    0   osf.1
         libm.so          Jan 15 17:20:50 2002 0x07757304    0   osf.1
         libpthread.so    Jan 15 17:26:48 2002 0x42a00c94    0   osf.1
         libcxx.so        Jan 15 17:29:14 2002 0x9060972e    0   cxx6.3
         libexc.so        Jan 15 17:20:58 2002 0xb0f9a902    0   osf.1
         libc.so          Jan 15 17:19:09 2002 0x1e4e245f    0   osf.1

The following command output shows where the shared library files reside in Tru64 UNIX and lists some of them.

 # ls /usr/shlib
 .mrg..so_locations      libarmui.so        libmsfs.so
 .new..so_locations      libaud.so          libmxr.so
 .proto..so_locations    libawt.so          libndb.so
 TCR_libclu.so           libawt_g.so        libnet.so
 X11                     libbkr.so          libnet_g.so
 _null                   libc.so            libnuma.so
 diagui__unix.uid        libc_r.so          libots.so
 ev6                     libcdrom.so        libots3.so
 generic                 libcfg.so          libpacl.so
 libDSNLinkAPI.so        libchf.so          libpolicy.so
 libDXm.so               libclu.so          libproplist.so
 libDXterm.so            libclua.so         libpset.so
 libDeCOR.so             libcmalib.so       libpthread.so
 libDtHelp.so            libcsa.so          libpthreaddebug.so
 libDtMail.so            libcurses.so       libpthreads.so
 ...

2.1.6 Memory Wiring

Most of the physical memory on Tru64 UNIX is pageable. This means that the contents of the memory pages may be paged out to swap space (on disk), or swapped out to swap space if the system's free page list becomes critically low. Certain applications may require that portions of their memory be treated as if they were non-pageable. This activity (referred to as "wiring down" a page) is limited to processes that are owned by root. The kernel may also wire down pageable pages to meet its dynamic memory requirements.

The following example concludes with a section displaying statistics on the wired pages within the system.

 # vmstat -P

 Total Physical Memory =    128.00 M
                       =    16384 pages

 Physical Memory Clusters:

  start_pfn    end_pfn     type     size_pages / size_bytes
          0        256      pal            256 /    2.00M
        256      16287       os          16031 /  125.24M
      16287      16384      pal             97 /  776.00k

 Physical Memory Use:

  start_pfn    end_pfn         type    size_pages / size_bytes
        256        288     scavenge            32 /    256.00k
        288       1036         text           748 /      5.84M
       1036       1180         data           144 /      1.12M
       1180       1400          bss           220 /      1.72M
       1400       1594       kdebug           194 /      1.52M
       1594       1600      cfgmgmt             6 /     48.00k
       1600       1601        locks             1 /      8.00k
       1601       1615         pmap            14 /    112.00k
       1615       1811    unixtable           196 /      1.53M
       1811       1814         logs             3 /     24.00k
       1814       2046     vmtables           232 /      1.81M
       2046      16287      managed         14241 /    111.26M
                            =================================
         Total Physical Memory Use:        16031 /    125.24M

 Managed Pages Break Down:

              free pages = 582
            active pages = 1901
          inactive pages = 4839
             wired pages = 3869
               ubc pages = 3082
             ==================
                 Total  = 14273

 WIRED Pages Break Down:

          vm wired pages = 705
         ubc wired pages = 0
         meta data pages = 1467
            malloc pages = 995
            contig pages = 88
           user ptepages = 585
         kernel ptepages = 21
           free ptepages = 8
           ==================
                 Total = 3869
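A few lines of arithmetic help in reading this output. The sketch below copies the page counts from the vmstat -P display above into Python dictionaries (the variable and function names are ours) and checks how they relate: the page size implied by the first two lines, the three physical memory clusters adding up to the total, and the wired breakdown adding up to the wired page count.

```python
TOTAL_MB = 128
TOTAL_PAGES = 16384
PAGE_SIZE_KB = TOTAL_MB * 1024 // TOTAL_PAGES   # page size implied by vmstat

# Page counts copied from the vmstat -P output above.
clusters = {"pal (low)": 256, "os": 16031, "pal (high)": 97}
managed = {"free": 582, "active": 1901, "inactive": 4839,
           "wired": 3869, "ubc": 3082}
wired = {"vm wired": 705, "ubc wired": 0, "meta data": 1467, "malloc": 995,
         "contig": 88, "user ptepages": 585, "kernel ptepages": 21,
         "free ptepages": 8}


def pages_to_mb(pages):
    """Convert a page count to megabytes using the implied page size."""
    return pages * PAGE_SIZE_KB / 1024


def wired_share():
    """Fraction of the managed-page total that is wired down."""
    return managed["wired"] / sum(managed.values())


if __name__ == "__main__":
    print("page size:", PAGE_SIZE_KB, "KB")
    print("os cluster:", round(pages_to_mb(clusters["os"]), 2), "MB")
    print("wired total:", sum(wired.values()), "pages")
    print("wired share of managed pages: {:.1%}".format(wired_share()))
```

The wired breakdown also shows who is doing the wiring: most of the wired pages in this sample belong to kernel uses such as metadata and malloc rather than to the "vm wired" category.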

2.1.7 Non-Uniform Memory Access (NUMA)

Another example of the rapidly changing nature of Tru64 UNIX is the inclusion of support for Non-Uniform Memory Access (NUMA) systems. Traditional Symmetric Multiprocessing (SMP) systems do not scale well as more processors are added. The Compaq (now HP) GS-series of computers (GS80, GS160, and GS320) can handle 8, 16, and 32 CPUs respectively (more in the future) in a manner yielding excellent scalability. This requires specialized hardware, but the operating system software is, once again, nothing more than good, old Tru64 UNIX. While some folks refer to the GS-series as a "cluster in a box," that is definitely not the intent of these machines (and certainly nothing that we recommend), although the hardware will support it.

The next sequence will take you through the conceptual developments in the world of computers that led to the notion of clusters. Along the way, several of the Tru64 UNIX features will be mentioned.

[1]Berkeley Software Distribution (BSD), developed at the University of California at Berkeley.

[2]UNIX System V, developed by the UNIX System Development Lab at AT&T.




TruCluster Server Handbook
TruCluster Server Handbook (HP Technologies)
ISBN: 1555582591
Year: 2005
Pages: 273
