Introduction | Storage Networking Fundamentals: An Introduction to Storage Devices, Subsystems, Applications, Management, and File Systems (Vol 1)

When I was a boy, my dad used to tell me, "A job is only worth doing if you do it right." I've since learned that for some projects "getting it right" involves many years, repeated attempts, and the ability to identify and throw away weak components that do not further the goals of the project.

For a couple of years, I'd been feeling that I wanted to write another book that would take on more difficult topics and present them in a way that allows an experienced networking professional to gain a deeper understanding of storage technologies and processes. So when Cisco Press approached me about working on a new storage book, I was interested. Fortunately, their concept aligned with mine, which was to focus almost entirely on the storage technologies and ignore most of the networking content. I've always preferred the murky Neanderthal swamp of storage to the technology reservoirs of network standards anyway.

One of the most difficult things in discussing storage is terminology. The problem is that the terms used in storage often have a historical context that no longer fits with new architectures and environments. For example, the terms virtual disk, disk volume, and LUN are often used to refer to the same type of storage entity, but they can also refer to distinctly different things within the same expanded I/O system. These types of overlapping definitions cause no end of confusion in the industry. I knew from the outset that explaining storage better this time was going to require some different techniques, terms, and analysis.

There is a lot of literature in computer science where storage is simply referred to as "a disk drive" because of old, outdated assumptions embedded in operating systems. Historically, there had never been a reason to make a distinction between the file system's view of a collection of SCSI block addresses and the thing that an HBA device and driver software actually communicate with. However, when the downstream storage is virtualized in any number of ways, as it often is in a SAN, it becomes very useful to clarify the pieces of storage that come into play and their respective roles.

The concept of a "storage address space" is introduced early in the book (in Chapter 2, "Establishing a Context for Understanding Storage Networks") to refer to the flat, contiguous string of addresses where data is stored by a file system or database. Now the words "storage address space" might not be a phrase that leaps off the tongue, but I can assure you that if you start using them to refer to an abstract, addressable "chunk" of storage, many difficulties interpreting storage become much, much easier. No caveats or asterisks are needed to augment the term storage address space, as there are with more familiar terms such as virtual disk, disk, and volume.

Throughout the book, a perspective on the relationships between host storage software, storage subsystems, and storage devices is included. This necessarily includes the relationship between the operating system kernel and the file system. A surprising thing about operating system/file system interactions is that very few people understand what goes on at any detailed level. This is not an environment based on object orientation or structured programming devices, but instead is an environment optimized for performance with cunning algorithms and shortcuts. Once things work well in an operating system, every attempt is made not to change them. Hence the problem of carrying old assumptions forward, well past their appropriate age.

Another key to understanding storage is knowing SCSI protocols and processes. Chapter 6, "SCSI Storage Fundamentals and SAN Adapters," will be a pivotal chapter for many readers, and I expect it will be referred to often by readers working on concepts in later chapters. I hope my explanation of SCSI architecture is at the appropriate level and works for you. As with so many things, it is not necessary to know the details of a technology if a good architectural understanding exists and provides predictable operating principles.

The concept of a "layout reference system" is used in the discussions of file systems. Again, a good general term has not existed for the logical entity that describes the location of data in a storage address space. There is simply no reason to believe that inodes or vnodes would necessarily have to be the technology used to track stored data in other, newer file systems. My apologies to file system developers. I am not trying to change what you do; I am only trying to explain it in an accessible, generic fashion, the same as I would do if I were trying to write a requirements document for a file system.

Trying to clarify old, existing constructs by inventing new generic terms carries some amount of personal risk, because it is not at all clear that people will accept them. But for this attempt it was clear that some new language and new analysis were needed to do the job "right." I didn't see any other way to make storage easier to think about and understand.

This book is the result of a great deal of work and a lot of help from many people who graciously pitched in. From my perspective, I can honestly say that no shortcuts were taken, no assumptions went unquestioned, and no conceptual coasting was done in writing this book. I look forward to hearing from readers in the years to come.