The title of this book is The Ruby Way. This is a title that begs for a disclaimer.
It has been my aim to align this book with the philosophy of Ruby as well as I could. That has also been the aim of the other contributors. Credit for success must be shared with these others, but the blame for any mistakes must rest solely with me.
Of course, I can't presume to tell you with exactness what the spirit of Ruby is all about. That is primarily for Matz to say, and I think even he would have difficulty communicating all of it in words.
In short, The Ruby Way is only a book; but the Ruby Way is the province of the language creator and the community as a whole. This is something difficult to capture in a book.
Still, I have tried in this introduction to pin down a little of the ineffable spirit of Ruby. The wise student of Ruby will not take it as totally authoritative.
Be aware that this is a second edition. Many things have stayed the same, but many things have changed. Most of this introduction has been preserved, but you will want to visit the upcoming section, "About the Second Edition," where I summarize the changes and the new material.
About the Second Edition
Everything changes, and Ruby is no exception. As I write this introduction in August 2006, the first edition of this book is nearly five years old. It is certainly time for an update.
There are many changes and much new material in this edition. The old Chapter 4 ("Simple Data Tasks") is now split into six chapters, two of which ("Ranges and Symbols" and "Internationalization in Ruby") are totally new; the other four also have new examples and commentary added to them. The coverage of regular expressions is particularly expanded, covering not only "classic" regexes but the newer Oniguruma regex engine.
Chapters 8 and 9 were originally one chapter. This was split as material was added and the chapter grew too big.
In the same way, the current chapters 18, 19, and 20 grew out of the old chapter 9 as material was added. The appendices were deleted to make room for more material.
Other new chapters are:
In a larger sense, every single chapter in this book is "new." I have revised and updated every one of them, making hundreds of minor changes and dozens of major changes. I deleted items that were obsolete or of lesser importance; I changed material to fit changes in Ruby itself; and I added new examples and commentary to every chapter.
You may wonder what has been added to the old chapters. Some of the highlights are the Oniguruma coverage, which I already mentioned; coverage of math libraries and classes such as BigDecimal, mathn, and matrix; and new classes such as Set and DateTime.
In Chapter 10, "I/O and Data Storage," I added material on readpartial, on nonblocking I/O, and on the StringIO class. I also added material on CSV, YAML, and KirbyBase. In the database portion of that same chapter I added Oracle, SQLite, DBI, and a discussion of Object-Relational Mappers (ORMs).
Chapter 11, "OOP and Dynamic Features in Ruby," now includes more recent additions to Ruby such as initialize_copy, const_get, const_missing, and define_method. I also added coverage of delegation and forwarding techniques.
All of Chapter 12, "Graphical Interfaces for Ruby," had to be revised (especially the GTK and Fox sections). The QtRuby section is totally new.
Chapter 14, "Scripting and System Administration," now discusses the Windows one-click installer and a few similar packages. It also has several improvements in the example code.
Chapter 18, "Network Programming," now has a section on email attachments and another new section on interacting with an IMAP server. It also has coverage of the OpenURI library.
Chapter 19, "Ruby and Web Applications," now covers Ruby on Rails, Nitro, Wee, IOWA, and other web tools. It also has coverage of WEBrick and some coverage of Mongrel.
Chapter 20, "Distributed Ruby," has new material explaining Rinda, the Ruby tuplespace implementation. It also covers Ring, which is closely related.
Were all these additions necessary? I assure you they were.
For the record, The Ruby Way was the second Ruby book in the English language (following the famous "Pickaxe," or Programming Ruby, by Dave Thomas and Andy Hunt). It was carefully designed to be complementary to that book rather than overlapping it; this is a large part of the reason for its popularity.
When I began writing the first edition, there was no international Ruby conference. There was no RubyForge, no ruby-doc.org, and no rubygarden.org wiki. In essence there was little on the Web besides the main Ruby site. The Ruby Application Archive had a few hundred entries in it.
At that time, few publications (online or off) seemed to know of Ruby's existence. Any time an article was published about Ruby, it was cause for us to take notice; it was announced on the mailing list and discussed there.
Many common Ruby tools and libraries also did not exist. There was no RDoc; there was no REXML to parse XML; the math library was considerably less rich than it is now. Database support was spotty, and there was no ODBC. Tk was by far the most used GUI toolkit. The most common way of doing web development was the low-level CGI library.
There was no "one-click" Windows installer. Windows users typically used Cygwin or a mingw-based compile.
The RubyGems system did not exist even in primitive form. Finding and installing libraries and applications was typically a completely manual process involving tar and make commands.
No one had heard of Ruby on Rails. No one (so far as I recall) had yet used the term "duck typing." There was no YAML for Ruby, and there was no Rake.
We used Ruby 1.6.4 at that time, and we thought it was pretty cool. But Ruby 1.8.5 (the version I typically use now) is even cooler.
There have been a few changes in syntax but nothing to write home about. Mostly these were "edge cases" that now make a little more sense than before. Ruby has always been slightly quirky about when it considered parentheses to be optional; 98% of the time you won't notice the difference, and when you do, hopefully it is smoother and more consistent now than it was.
The semantics of some of the core methods have changed. Again, these are mostly minor changes. For example, Dir.chdir formerly did not take a block, but now it can.
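As a small illustration of the block form, here is a sketch that assumes a reasonably recent Ruby with the standard tmpdir library available. The variable names are mine, chosen only for the example. The point is that the working directory changes only for the duration of the block and is restored afterward.

```ruby
require "tmpdir"

original = Dir.pwd
inside = nil

# Dir.mktmpdir yields a scratch directory and removes it when its block ends.
Dir.mktmpdir do |scratch|
  # Block form of Dir.chdir: the change of directory lasts only for the block,
  # and the call returns the value of the block.
  inside = Dir.chdir(scratch) { Dir.pwd }
end

after = Dir.pwd
puts "inside the block: #{inside}"
puts "restored afterward: #{after == original}"
```

Without a block, the change of directory simply persists until the next call.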
Some core methods have been obsoleted or renamed. The class method has lost its alias type (because we don't usually talk about the types of objects in Ruby). The intern method is now the friendlier to_sym method; Array#indices is now Array#values_at. I could go on, but you get the idea.
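A brief sketch of the newer names in use (the sample data is invented for illustration):

```ruby
# to_sym converts a string to the corresponding symbol.
name = "title".to_sym

# values_at picks out the elements at the given indices.
letters = %w[a b c d e]
picked = letters.values_at(0, 2, 4)

# class reports the class of an object; there is no longer a "type" alias.
kind = "abc".class

puts name.inspect     # :title
puts picked.inspect   # ["a", "c", "e"]
puts kind             # String
```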
There are also new core methods such as Enumerable#inject, Enumerable#zip, and IO#readpartial. The old futils library is now fileutils, and it has its own module namespace FileUtils instead of adding methods into the File class.
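To make these concrete, here is a minimal sketch, again with made-up data and a temporary directory so that nothing is left behind. It assumes the standard fileutils and tmpdir libraries.

```ruby
require "fileutils"
require "tmpdir"

# inject folds a collection into a single value.
total = [1, 2, 3, 4].inject(0) { |sum, n| sum + n }

# zip pairs up corresponding elements of two collections.
pairs = [1, 2, 3].zip(%w[one two three])

# FileUtils methods live in their own module rather than in File.
created = nil
Dir.mktmpdir do |dir|
  nested = File.join(dir, "a", "b")
  FileUtils.mkdir_p(nested)            # create the whole path at once
  created = File.directory?(nested)
end

puts total           # 10
puts pairs.inspect   # [[1, "one"], [2, "two"], [3, "three"]]
puts created         # true
```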
There have been numerous other changes as well. It's important to realize, however, that these were made with great care and caution. Ruby is still Ruby. Much of the beauty of Ruby is derived from the fact that it changes slowly and deliberately, crafted by the wisdom of Matz and the other developers.
Today we have a proliferation of books on Ruby and more articles published than we can bother to notice. Web-based tutorials and documentation resources abound.
New tools and libraries have appeared. For whatever reasons, the most common of these seem to be web frameworks and tools, blogging tools, markup tools, and object-relational mappers (ORMs). But there are many others, of course: tools and libs for databases, GUIs, number-crunching, web services, image manipulation, source control, and more.
Ruby editor support is widespread and sophisticated. IDEs are available that are useful and mature (which partly overlap with the GUI builders).
It's also undeniable that the community has grown and changed. Ruby is by no means a niche language today; it is used at NASA, NOAA, Motorola, and many other large companies and institutions. It is used for graphics work, database work, numerical analysis, web development, and more. In short (and I mean this in the positive sense), Ruby has gone mainstream.
Updating this book has been a labor of love. I hope it is useful to you.
How This Book Works
You probably won't learn Ruby from this book. There is relatively little in the way of introductory or tutorial information. If you are totally new to Ruby, you might want to start with another book.
Having said that, programmers are a tenacious bunch, and I grant that it might be possible to learn Ruby from this book. Chapter 1, "Ruby in Review," does contain a brief introduction and some tutorial information.
Chapter 1 also contains a comprehensive "gotcha" list (which has been difficult to keep up-to-date). The usefulness of this list in Chapter 1 will vary widely from one reader to another because we cannot all agree on what is intuitive.
This book is largely intended to answer questions of the form "How do I...?" As such, you can expect to do a lot of skipping around. I'd be honored if everyone read every page from front to back, but I don't expect that. It's more my expectation that you will browse the table of contents in search of techniques you need or things you find interesting.
As it turns out, I have talked to many people since the first edition, and they did in fact read it cover to cover. What's more, I have had more than one person report to me that they did learn Ruby here. So anything is possible.
Some things this book covers may seem elementary. That is because people vary in background and experience; what is obvious to one person may not be to another. I have tried to err on the side of completeness. On the other hand, I have tried to keep the book at a reasonable size (obviously a competing goal).
This book can be viewed as a sort of "inverted reference." Rather than looking up the name of a method or a class, you will look things up by function or purpose. For example, the String class has several methods for manipulating case: capitalize, upcase, casecmp, downcase, and swapcase. In a reference work, these would quite properly be listed alphabetically, but in this book they are all listed together.
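Grouped by purpose, the case-related methods just mentioned might be shown side by side like this (the sample string is arbitrary):

```ruby
s = "hello World"

cap  = s.capitalize              # first letter upper, the rest lower
up   = s.upcase                  # all uppercase
down = s.downcase                # all lowercase
swap = s.swapcase                # each letter's case inverted
cmp  = s.casecmp("HELLO WORLD")  # case-insensitive comparison; 0 means equal

puts cap    # Hello world
puts up     # HELLO WORLD
puts down   # hello world
puts swap   # HELLO wORLD
puts cmp    # 0
```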
Of course, in striving for completeness, I have sometimes wandered onto the turf of the reference books. In many cases, I have tried to compensate for this by offering more unusual or diverse examples than you might find in a reference.
I have tried for a high code-to-commentary ratio. Overlooking the initial chapter, I think I've achieved this. Writers may grow chatty, but programmers always want to see the code. (If not, they should want to.)
The examples here are sometimes contrived, for which I must apologize. To illustrate a technique or principle in isolation from a real-world problem can be difficult. However, the more complex or high-level the task was, the more I attempted a real-world solution. Thus if the topic is concatenating strings, you may find an unimaginative code fragment involving "foo" and "bar", but when the topic is something like parsing XML, you will usually find a much more meaningful and realistic piece of code.
This book has two or three small quirks to which I'll confess up front. One is the avoidance of the "ugly" Perl-like global variables such as $_ and the others. These are present in Ruby, and they work fine; they are used daily by most or all Ruby programmers. But in nearly all cases their use can be avoided, and I have taken the liberty of omitting them in most of the examples.
Another quirk is that I avoid using standalone expressions when they don't have side effects. Ruby is expression-oriented, and that is a good thing; I have tried to take advantage of that feature. But in a code fragment, I prefer to not write expressions that merely return a value that is not usable. For example, the expression "abc" + "def" can illustrate string concatenation, but I would write something like str = "abc" + "def" instead. This may seem wordy to some, but it may seem more natural to you if you are a C programmer who really notices when functions are void or nonvoid (or an old-time Pascal programmer who thinks in procedures and functions).
My third quirk is that I don't like the "pound" notation to denote instance methods. Many Rubyists will think I am being verbose in saying "instance method crypt of class String" rather than saying String#crypt, but I think no one will be confused. (Actually, I am slowly being converted to this usage, as it is obvious the pound notation is not going away.)
I have tried to include "pointers" to outside resources whenever appropriate. Time and space did not allow putting everything into this book that I wanted, but I hope I have partially made up for that by telling you where to find related materials. The Ruby Application Archive on the Web is probably the foremost of these sources; you will see it referenced many times in this book.
Here at the front of the book there is usually a gratuitous reference to the type-faces used for code, and how to tell code fragments from ordinary text. But I won't insult your intelligence; you've read computer books before.
I want to point out that roughly 10 percent of this book was written by other people. That does not even include tech editing and copyediting. You should read the acknowledgements in this (and every) book. Most readers skip them. Go read them now. They're good for you, like vegetables.
About The Book's Source Code
Every significant code fragment has been collected into an archive for the reader to download. Look for this archive on the www.awprofessional.com site or at my own site (www.rubyhacker.com).
It is offered both as a .tgz file and as a .zip file. For the files in this archive, the general naming convention is that the actual code listings are named according to the listing number (for example, listing14-1.rb). The shorter code fragments are named according to page number and an optional letter (for example, p260a.rb and p260b.rb). Code fragments that are very short or can't be run "out of context" will usually not appear in the archive.
What Is the "Ruby Way"?
What do we mean by the Ruby Way? My belief is that there are two related aspects: One is the philosophy of the design of Ruby; the other is the philosophy of its usage. It is natural that design and use should be interrelated, whether in software or hardware; why else should there be such a field as ergonomics? If I build a device and put a handle on it, it is because I expect someone to grab that handle.
Ruby has a nameless quality that makes it what it is. We see that quality present in the design of the syntax and semantics of the language, but it is also present in the programs written for that interpreter. Yet as soon as we make this distinction, we blur it.
Clearly Ruby is not just a tool for creating software, but it is a piece of software in its own right. Why should the workings of Ruby programs follow laws different from those that guide the workings of the interpreter? After all, Ruby is highly dynamic and extensible. There might be reasons that the two levels should differ here and there, probably to accommodate the inconvenience of the real world. But in general, the thought processes can and should be the same. Ruby could be implemented in Ruby, in true Hofstadter-like fashion, though it is not at the time of this writing.
We don't often think of the etymology of the word way; but there are two important senses in which it is used. On the one hand, it means a method or technique, but it can also mean a road or path. Obviously these two meanings are interrelated, and I think when I say "the Ruby Way," I mean both of them.
So what we are talking about is a thought process, but it is also a path that we follow. Even the greatest software guru cannot claim to have reached perfection but only to follow the path. And there may be more than one path, but here I can only talk about one.
The conventional wisdom says that form follows function. And the conventional wisdom is, of course, conventionally correct. But Frank Lloyd Wright (speaking in his own field) once said: "Form follows function; that has been misunderstood. Form and function should be one, joined in a spiritual union."
What did Wright mean? I would say that this truth is not something you learn from a book, but from experience.
However, I would argue that Wright expressed this truth elsewhere in pieces easier to digest. He was a great proponent of simplicity, saying once, "An architect's most useful tools are an eraser at the drafting board and a wrecking bar at the site."
So one of Ruby's virtues is simplicity. Shall I quote other thinkers on the subject? According to Antoine de St. Exupery, "Perfection is achieved, not when there is nothing left to add, but when there is nothing left to take away."
But Ruby is a complex language. How can I say that it is simple?
If we understood the universe better, we might find a "law of conservation of complexity," a fact of reality that disturbs our lives like entropy so that we cannot avoid it but can only redistribute it.
And that is the key. We can't avoid complexity, but we can push it around. We can bury it out of sight. This is the old "black box" principle at work; a black box performs a complex task, but it possesses simplicity on the outside.
If you haven't already lost patience with my quotations, a word from Albert Einstein is appropriate here: "Everything should be as simple as possible, but no simpler."
So in Ruby we see simplicity embodied from the programmer's view (if not from the view of those maintaining the interpreter). Yet we also see the capacity for compromise. In the real world, we must bend a little. For example, every entity in a Ruby program should be a true object, but certain values such as integers are stored as immediate values. In a trade-off familiar to computer science students for decades, we have traded elegance of design for practicality of implementation. In effect, we have traded one kind of simplicity for another.
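The compromise can be glimpsed from inside a program. The following is only a sketch, and the identity behavior it shows is a detail of the standard interpreter rather than a promise of the language: an integer still behaves as an object, yet two occurrences of the same small integer are the very same immediate value, whereas two strings with the same content are separate objects.

```ruby
a = 42
b = 42

# Integers respond to methods like any other object...
is_object = a.is_a?(Object)

# ...but equal small integers are the identical immediate value.
same_integer = a.equal?(b)

# Strings, by contrast, are ordinary heap objects.
s1 = String.new("abc")
s2 = String.new("abc")
same_string  = s1.equal?(s2)   # identity: two distinct objects
same_content = (s1 == s2)      # equality of content

puts is_object      # true
puts same_integer   # true
puts same_string    # false
puts same_content   # true
```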
What Larry Wall said about Perl holds true: "When you say something in a small language, it comes out big. When you say something in a big language, it comes out small." The same is true for English. The reason that biologist Ernst Haeckel could say "Ontogeny recapitulates phylogeny" in only three words was that he had these powerful words with highly specific meanings at his disposal. We allow inner complexity of the language because it enables us to shift the complexity away from the individual utterance.
I would state this guideline this way: Don't write 200 lines of code when 10 will do.
I'm taking it for granted that brevity is generally a good thing. A short program fragment will take up less space in the programmer's brain; it will be easier to grasp as a single entity. As a happy side effect, fewer bugs will be injected while the code is being written.
Of course, we must remember Einstein's warning about simplicity. If we put brevity too high on our list of priorities, we will end up with code that is hopelessly obfuscated. Information theory teaches us that compressed data is statistically similar to random noise; if you have looked at C or APL or regular expression notation (especially badly written), you have experienced this truth firsthand. "Simple, but not too simple"; that is the key. Embrace brevity, but do not sacrifice readability.
It is a truism that both brevity and readability are good. But there is an underlying reason for this, one so fundamental that we sometimes forget it. The reason is that computers exist for humans, not humans for computers.
In the old days, it was almost the opposite. Computers cost millions of dollars and ate electricity at the rate of many kilowatts. People acted as though the computer were a minor deity and the programmers were humble supplicants. An hour of the computer's time was more expensive than an hour of a person's time.
When computers became smaller and cheaper, high-level languages also became more popular. These were inefficient from the computer's point of view but efficient from the human perspective. Ruby is simply a later development in this line of thought. Some, in fact, have called it a VHLL (Very High-Level Language); though this term is not well-defined, I think its use is justified here.
The computer is supposed to be the servant, not the master, and, as Matz has said, a smart servant should do a complex task with a few short commands. This has been true through all the history of computer science. We started with machine languages and progressed to assembly language and then to high-level languages.
What we are talking about here is a shift from a machine-centered paradigm to a human-centered one. In my opinion, Ruby is an excellent example of human-centric programming.
I'll shift gears a little. There was a wonderful little book from the 1980s called The Tao of Programming (by Geoffrey James). Nearly every line is quotable, but I'll repeat only this: "A program should follow the 'Law of Least Astonishment.' What is this law? It is simply that the program should always respond to the user in the way that astonishes him least." (Of course, in the case of a language interpreter, the user is the programmer.)
I don't know whether James coined this term, but his book was my first introduction to the phrase. This is a principle that is well known and often cited in the Ruby community, though it is usually called the Principle of Least Surprise or POLS. (I myself stubbornly prefer the acronym LOLA.)
Whatever you call it, this rule is a valid one, and it has been a guideline throughout the ongoing development of the Ruby language. It is also a useful guideline for those who develop libraries or user interfaces.
The only problem, of course, is that different people are surprised by different things; there is no universal agreement on how an object or method "ought" to behave. We can strive for consistency and strive to justify our design decisions, and each person can train his own intuition.
For the record, Matz has said that "least surprise" should refer to him as the designer. The more you think like him, the less Ruby will surprise you. And I assure you, imitating Matz is not a bad idea for most of us.
No matter how logically constructed a system may be, your intuition needs to be trained. Each programming language is a world unto itself, with its own set of assumptions, and human languages are the same. When I took German, I learned that all nouns were capitalized, but the word deutsch was not. I complained to my professor; after all, this was the name of the language, wasn't it? He smiled and said, "Don't fight it."
What he taught me was to let German be German. By extension, that is good advice for anyone coming to Ruby from some other language. Let Ruby be Ruby. Don't expect it to be Perl, because it isn't; don't expect it to be LISP or Smalltalk, either. On the other hand, Ruby has common elements with all three of these. Start by following your expectations, but when they are violated, don't fight it. (Unless Matz agrees it's a needed change.)
Every programmer today knows the orthogonality principle (which would better be termed the orthogonal completeness principle). Suppose we have an imaginary pair of axes with a set of comparable language entities on one and a set of attributes or capabilities on the other. When we talk of "orthogonality," we usually mean that the space defined by these axes is as "full" as we can logically make it.
Part of the Ruby Way is to strive for this orthogonality. An array is in some ways similar to a hash; so the operations on each of them should be similar. The limit is reached when we enter the areas where they are different.
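A short sketch may make this concrete. The data here is invented; the hash deliberately uses the array's indices as its keys so that the parallel is easy to see, and the last lines show one place where the two necessarily part ways.

```ruby
list  = ["a", "b", "c"]
table = { 0 => "a", 1 => "b", 2 => "c" }

# Lookup and size read the same way for both.
from_list  = list[1]
from_table = table[1]
sizes_match = (list.size == table.size)

# Both are Enumerable, so iteration methods are shared.
upper_list  = list.map { |v| v.upcase }
upper_table = table.map { |_key, v| v.upcase }

# Where they differ: include? tests values in an array but keys in a hash.
in_list  = list.include?("b")
in_table = table.include?("b")
key_in_table = table.include?(1)

puts from_list == from_table       # true
puts sizes_match                   # true
puts upper_list == upper_table     # true
puts in_list                       # true
puts in_table                      # false
puts key_in_table                  # true
```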
Matz has said that "naturalness" is to be valued over orthogonality. But to fully understand what is natural and what is not may take some thinking and some coding.
Ruby strives to be friendly to the programmer. For example, there are aliases or synonyms for many method names; size and length will both return the number of entries in an array. The variant spellings indexes and indices both refer to the same method. Some consider this sort of thing to be an annoyance or anti-feature, but I consider it a good design.
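The size and length pair is easy to demonstrate. This is a minimal sketch with arbitrary data; it shows only that the two names give the same answer, for strings and hashes as well as arrays.

```ruby
items = [10, 20, 30]

by_size   = items.size
by_length = items.length

# The same pair of names is available on strings and hashes.
word_ok = ("ruby".size == "ruby".length)
hash_ok = ({ a: 1 }.size == { a: 1 }.length)

puts by_size     # 3
puts by_length   # 3
puts word_ok     # true
puts hash_ok     # true
```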
Ruby strives for consistency and regularity. There is nothing mysterious about this; in every aspect of life, we yearn for things to be regular and parallel. What makes it a little more tricky is learning when to violate this principle.
For instance, Ruby has the habit of appending a question mark (?) to the name of a predicate-like method. This is well and good; it clarifies the code and makes the namespace a little more manageable. But what is more controversial is the similar use of the exclamation point in marking methods that are "destructive" or "dangerous" in the sense that they modify their receivers. The controversy arises because not all of the destructive methods are marked in this way. Shouldn't we be consistent?
No, in fact we should not. Some of the methods by their very nature change their receiver (such as the Array methods replace and concat). Some of them are "writer" methods allowing assignment to a class attribute; we should not append an exclamation point to the attribute name or the equal sign. Some methods arguably change the state of the receiver, such as read; this occurs too frequently to be marked in this way. If every destructive method name ended in a !, our programs soon would look like sales brochures for a multilevel marketing firm.
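The conventions just described can be sketched in a few lines. The values are arbitrary, and the string is duplicated first only so that it is certain to be modifiable.

```ruby
word = "ruby".dup          # dup guarantees an unfrozen, modifiable string

blank  = word.empty?       # predicate: the name ends in ?
shout  = word.upcase       # returns a new string; word is untouched
before = word.dup          # remember the value at this point
word.upcase!               # the ! form modifies the receiver in place

# Destructive by nature, yet unmarked: concat and replace.
items = [1, 2]
items.concat([3])          # items is now [1, 2, 3]
grown = items.dup
items.replace([9, 8])      # items is now [9, 8]

puts blank           # false
puts shout           # RUBY
puts before          # ruby
puts word            # RUBY
puts grown.inspect   # [1, 2, 3]
puts items.inspect   # [9, 8]
```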
Do you notice a kind of tension between opposing forces, a tendency for all rules to be violated? Let me state this as Fulton's Second Law: Every rule has an exception, except Fulton's Second Law. (Yes, there is a joke there, a very small one.)
What we see in Ruby is not a "foolish consistency" nor a rigid adherence to a set of simple rules. In fact, perhaps part of the Ruby Way is that it is not a rigid and inflexible approach. In language design, as Matz once said, you should "follow your heart."
Yet another aspect of the Ruby philosophy is: Do not fear change at runtime; do not fear what is dynamic. The world is dynamic; why should a programming language be static? Ruby is one of the most dynamic languages in existence.
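As one small taste of that dynamism, the following sketch adds methods to a class while the program is running. The class name Greeter and the method names are hypothetical, invented purely for this illustration.

```ruby
# An empty class; its methods will be added at runtime.
class Greeter
end

# Define one method per word, using define_method inside class_eval.
%w[hello goodbye].each do |word|
  Greeter.class_eval do
    define_method(word) { |name| "#{word}, #{name}" }
  end
end

g = Greeter.new
greeting = g.hello("Matz")
farewell = g.goodbye("Matz")
knows_it = g.respond_to?(:goodbye)

puts greeting   # hello, Matz
puts farewell   # goodbye, Matz
puts knows_it   # true
```

The class was closed and empty when the loop began; by the time the object is created, it has gained two methods that were never written out by hand.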
I would also argue that another aspect is: Do not be a slave to performance issues. When performance is unacceptable, the issue must be addressed, but it should normally not be the first thing you think about. Prefer elegance over efficiency where efficiency is less than critical. Then again, if you are writing a library that may be used in unforeseen ways, performance may be critical from the start.
When I look at Ruby, I perceive a balance between different design goals, a complex interaction reminiscent of the n-body problem in physics. I can imagine it might be modeled as an Alexander Calder mobile. It is perhaps this interaction itself, the harmony, that embodies Ruby's philosophy rather than just the individual parts. Programmers know that their craft is not just science and technology but art. I hesitate to say that there is a spiritual aspect to computer science, but just between you and me, there certainly is. (If you have not read Robert Pirsig's Zen and the Art of Motorcycle Maintenance, I recommend that you do so.)
Ruby arose from the human urge to create things that are useful and beautiful. Programs written in Ruby should spring from that same God-given source. That, to me, is the essence of the Ruby Way.