Bryan Cantrill's foreword describes operating systems as "proprietary black boxes, welded shut to even the merely curious." Bryan paints a realistic picture of the not-too-distant past, when only a small part of the software stack was visible or observable. Anyone trying to understand why a system wasn't meeting its prescribed service-level and response-time goals faced considerable complexity. The problem was that the performance analyst had to work with only a small set of hardwired performance statistics, which, ironically, were chosen some decades ago by kernel developers as a means to debug the kernel's implementation. As a result, performance measurement and diagnosis became an art of inference and, in some cases, guesswork.
Today, Solaris has a rich set of observability facilities, aimed at the administrator, the application developer, and the operating systems developer. These facilities are built on a flexible observability framework and, as a result, are highly customizable. You can liken this to the TiVo[1] revolution that transformed television viewing: Rather than being locked into a fixed set of program schedules, viewers can now watch what they want, when they want; in other words, TiVo put the viewer in control instead of the program provider. In a similar way, the Solaris observability tools can be targeted at specific problems, converging on what's important so that each particular problem can be solved quickly and concisely.
In Part One we describe the methods we typically use for measuring system utilization and diagnosing performance problems. In Part Two we introduce the frameworks upon which these methods build. In Part Three we discuss the facilities for debugging within Solaris. This chapter previews the material explored in more detail in subsequent chapters.