Chapter 10. I/O and Data Storage
Computers are good at computing. This tautology is more profound than it appears. If we only had to sit and chew up the CPU cycles and reference RAM as needed, life would be easy.
A computer that only sits and thinks to itself is of little use, however. Sooner or later we have to get information into it and out of it, and that is where life becomes more difficult.
Several things make I/O complicated. First, input and output are rather different things, but we naturally lump them together. Second, the varieties of I/O operations (and their usages) are as diverse as species of insects.
History has seen such devices as drums, paper tapes, magnetic tapes, punched cards, and teletypes. Some operated with a mechanical component; others were purely electromagnetic. Some were read-only; others were write-only or read-write. Some writable media were erasable, and others were not. Some devices were inherently sequential; others were random access. Some media were permanent; others were transient or volatile. Some devices depended on human intervention; others did not. Some were character oriented; others were block oriented. Some block devices were fixed length; others were variable length. Some devices were polled; others were interrupt-driven. Interrupts could be implemented in hardware or software or both. We have both buffered and non-buffered I/O. We have seen memory-mapped I/O and channel-oriented I/O, and with the advent of operating systems such as UNIX, we have seen I/O devices mapped to files in a filesystem. We have done I/O in machine language, in assembly language, and in high-level languages. Some languages have the I/O capabilities firmly hard-wired in place; others leave it out of the language specification completely. We have done I/O with and without suitable device drivers or layers of abstraction.
If this seems like a confusing mess, that is because it is. Part of the complexity is inherent in the concept of input/output, part of it is the result of design trade-offs, and part of it is the result of legacies or traditions in computer science and the quirks of various languages and operating systems.
Ruby's I/O is complex because I/O in general is complex. But we have tried here to make it understandable and to present a good overview of how and when to use various techniques.
The core of all Ruby I/O is the IO class, which defines behavior for every kind of input/output operation. Closely allied to IO (and inheriting from it) is the File class. There is a nested class within File called Stat, which encapsulates various details about a file that we might want to examine (such as its permissions and time stamps). The methods stat and lstat return objects of type File::Stat.
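As a minimal sketch of how these classes relate, the following creates a scratch file and inspects it through File::Stat. The file name and contents are made up for illustration, and the permission bits printed will depend on the platform and the current umask.

```ruby
require "tmpdir"
require "fileutils"

# Work in a throwaway directory so the example touches no existing files.
dir  = Dir.mktmpdir
path = File.join(dir, "sample.txt")   # hypothetical file name
File.binwrite(path, "hello\n")        # six bytes, written without newline translation

is_subclass = File.ancestors.include?(IO)   # File inherits from IO

info          = File.stat(path)             # returns a File::Stat object
stat_class    = info.class
size_in_bytes = info.size
permissions   = format("%o", info.mode & 0o777)  # permission bits as an octal string
modified_at   = info.mtime                  # a Time object

puts "File inherits from IO: #{is_subclass}"
puts "stat returned a #{stat_class}"
puts "size: #{size_in_bytes} bytes, permissions: #{permissions}"
puts "last modified: #{modified_at}"

FileUtils.remove_entry(dir)                 # clean up the scratch directory
```

File.lstat could be used in the same way; it differs from stat only when the path is a symbolic link, in which case it describes the link itself rather than its target.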
The module FileTest also has methods that allow us to test much the same set of properties. The File class offers most of these same tests as class methods of its own, and FileTest can also be used on its own.
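The overlap can be seen by asking the same questions three ways: through FileTest, through File, and through a File::Stat object. This is only a sketch; the scratch file and its name are assumptions made for the example.

```ruby
require "tmpdir"
require "fileutils"

dir  = Dir.mktmpdir
path = File.join(dir, "sample.txt")   # hypothetical file name
File.binwrite(path, "data")

# The same three questions, asked of three different receivers:
# is it a regular file, is it a directory, is it empty?
via_filetest = [FileTest.file?(path), FileTest.directory?(path), FileTest.zero?(path)]
via_file     = [File.file?(path),     File.directory?(path),     File.zero?(path)]

info     = File.stat(path)
via_stat = [info.file?, info.directory?, info.zero?]

puts "FileTest:   #{via_filetest.inspect}"
puts "File:       #{via_file.inspect}"
puts "File::Stat: #{via_stat.inspect}"

FileUtils.remove_entry(dir)
```

Which form to use is largely a matter of taste. A File::Stat object is convenient when several properties of the same file are needed, since the file is examined only once.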
Finally, there are I/O methods in the Kernel module, which is mixed into Object (the ancestor of all objects, including classes). These are the simple I/O routines that we have used all along without worrying about what their receiver was. These naturally default to standard input and standard output.
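A short sketch may make the implicit receiver clearer. Here $stdout is temporarily replaced with a StringIO object (a choice made purely so the output can be captured and examined) to show that the receiverless Kernel methods write to the same place as an explicit call on $stdout.

```ruby
require "stringio"

puts_owner = method(:puts).owner   # the module that defines the receiverless puts

original_stdout = $stdout
$stdout = StringIO.new             # capture output instead of printing it

puts "implicit receiver"           # Kernel#puts, which writes to $stdout
$stdout.puts "explicit receiver"   # the same operation with the receiver spelled out
print "no newline added"           # Kernel#print does not append a newline

captured = $stdout.string
$stdout  = original_stdout         # restore normal output

puts "puts is defined in #{puts_owner}"
puts "captured: #{captured.inspect}"
```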
The beginner may find these classes to be a confused jumble of overlapping functionality. The good news is that you need only use small pieces of this framework at any given time.
On a higher level, Ruby offers features to make object persistence possible. The Marshal module enables simple serialization of objects, and the more sophisticated PStore library is based on Marshal. We include the DBM library in the same section with these, although it is only string-based.
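As a rough preview of the first two of these, the sketch below round-trips a small object graph through Marshal and then stores the same data in a PStore file. The Point struct, the :config key, and the file name are all invented for the example, and it assumes the pstore library is available in the Ruby installation.

```ruby
require "pstore"
require "tmpdir"
require "fileutils"

Point = Struct.new(:x, :y)   # hypothetical class used only for this example
original = { "name" => "origin", "points" => [Point.new(0, 0), Point.new(3, 4)] }

# Marshal: serialize to a binary string, then rebuild an equal but distinct object.
bytes = Marshal.dump(original)
copy  = Marshal.load(bytes)

# PStore: a transactional, file-backed store keyed by arbitrary names.
dir   = Dir.mktmpdir
store = PStore.new(File.join(dir, "demo.pstore"))

store.transaction do
  store[:config] = original          # written to disk when the block ends
end

restored = store.transaction(true) { store[:config] }   # read-only transaction

puts "Marshal copy equal:  #{copy == original}"
puts "Same object:         #{copy.equal?(original)}"
puts "PStore copy equal:   #{restored == original}"

FileUtils.remove_entry(dir)
```

Note that Marshal.load should only be given data from a trusted source, since loading arbitrary serialized data can instantiate arbitrary objects.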
On the highest level of all, data access can be performed by interfacing to a separate database management system such as MySQL or Oracle. This topic is complex enough that one or more books could be devoted to it. We will provide only a brief overview to get the programmer started. In some cases, we provide only a pointer to an online archive.