Section 10.5. Conclusion | The Ruby Way, Second Edition: Solutions and Techniques in Ruby Programming (2nd Edition)

10.4. Connecting to External Databases

Ruby can interface to various databases, thanks to the development work of many different people. These range from monolithic systems such as Oracle down to the more petite MySQL. We have included CSV here for some measure of completeness.

The level of functionality provided by these packages will change continually. Be sure to refer to an online reference for the latest information. The Ruby Application Archive is always a good starting point.

10.4.1. Interfacing to SQLite

SQLite is a popular database for those who appreciate zero configuration software. It is a small self-contained executable, written in C, which can handle a complete database in a single file. Although it is usually used for small databases, it can deal with data up into the terabyte range.

The Ruby bindings for SQLite are relatively straightforward. The C API is wrapped within the SQLite::API class; because the wrapping is basically one-to-one and not particularly object-oriented, you should use this API directly only when absolutely necessary.

In most situations, the SQLite::Database class will meet all your needs. Here is a brief piece of sample code:

require 'sqlite' db = SQLite::Database.new("library.db") db.execute("select title,author from books") do |row|   p row end db.close # Output: # ["The Case for Mars", "Robert Zubrin"] # ["Democracy in America", "Alexis de Tocqueville"] # ...

If a block is not specified, execute returns an instance of ResultSet (in essence a cursor that can be iterated over).

rs = db.execute("select title,author from books") rs.each {|row| p row }    # Same results as before rs.close

If a ResultSet is returned, the user's code is responsible for closing it (as shown in the preceding code example). If you need to traverse the list of records more than once, you can do a reset to start over. (This feature is experimental and may change.) Additionally, you can do generator-style iteration (external iteration) with next and eof? if you want.

rs = db.execute("select title,author from books") while ! rs.eof?   rec = rs.next   p rec                   # Same results as before end rs.close

An extensive set of exceptions may be raised by this library's methods. All are subclasses of SQLite::Exception so that it is easy to catch any or all of them.

It should be noted briefly that this library is written to interoperate well with Ara Howard's ArrayFields library (which we don't cover here). This is essentially a way to let array elements be indexed or accessed by name as well as by an integer value. If the arrayfields library is required before the sqlite library, a ResultSet object can be indexed by name as well as by number. (However, this is configurable to return a Hash instead.)

Although the sqlite library is fairly full-featured, it does not cover all the functionality you might want, simply because SQLite itself does not completely implement the SQL92 standard. For more information on SQLite or its Ruby binding, do a web search.

10.4.2. Interfacing to MySQL

Ruby's MySQL interface is among the most stable and fully functional of its database interfaces. It is an extension and must be installed after both Ruby and MySQL are installed and running.

There are three steps to using this module after you have it installed. First load the module in your script; then connect to the database; and finally work with your tables. Connecting requires the usual parameters for host, username, password, database, and so on.

require 'mysql' m = Mysql.new("localhost","ruby","secret","maillist") r = m.query("SELECT * FROM people ORDER BY name") r.each_hash do |f|   print "#{f['name']} - #{f['email']}" end # Output looks like: # John Doe - jdoe@rubynewbie.com # Fred Smith - smithf@rubyexpert.com

The class methods Mysql.new and MysqlRes.each_hash are useful, along with the instance method query.

The module is composed of four classes (Mysql, MysqlRes, MysqlField, and MysqlError), as described in the README. We summarize some useful methods here, but you can always find more information in the actual documentation.

The class method Mysql.new takes several string parameters, all defaulting to nil, and returns a connection object. The parameters are host, user, passwd, db, port, sock, and flag. Aliases for new are real_connect and connect.

The methods create_db, select_db, and drop_db all take a database name as a parameter; they are used as shown here. The method close closes the connection to the server.

m=Mysql.new("localhost","ruby","secret") m.create_db("rtest")    # Create a new database m.select_db("rtest2")   # Select a different database m.drop_db("rtest")      # Delete a database m.close                 # Close the connection

In recent versions, create_db and drop_db are considered obsolete. But you can make them work by defining them in this way:

class Mysql   def create_db(db)     query("CREATE DATABASE #{db}")   end   def drop_db(db)     query("DROP DATABASE #{db}")   end end

Method list_dbs returns a list of available database names in an array.

dbs = m.list_dbs       # ["people","places","things"]

The query takes a string parameter and returns a MysqlRes object by default. Depending on how query_with_result is set, it may return a Mysql object.

In the event of an error, the error number can be retrieved by errno; error, on the other hand, returns the actual error message.

begin   r=m.query("create table rtable    (      id int not null auto_increment,      name varchar(35) not null,      desc varchar(128) not null,      unique id(id)    )") # exception happens... rescue   puts m.error   # Prints: You have an error in your SQL syntax   # near 'desc varchar(128) not null ,   #   unique id(id)   # )' at line 5"   puts m.errno   # Prints 1064   # ('desc' is reserved for descending order) end

A few useful instance methods of MysqlRes are summarized here:

fetch_fields Returns an array of MysqlField objects from the next row
fetch_row Returns an array of field values from the next row
fetch_hash(with_table=false) Returns a hash containing the next row's field names and values
num_rows Returns number of rows in the resultset
each Iterator that sequentially returns array of field values
each_hash(with_table=false) Iterator that sequentially returns hash of {fieldname => fieldvalue} (Use x['field name'] to get the field value.)
These are some instance methods of MysqlField:
name Returns name of designated field
table Returns name of table to which designated field belongs
length Returns defined length of field
max_length Returns length of longest field from resultset
hash Returns hash with name and values for name, table, def, type, length, max_length, flags, and decimals

Any material here is always superseded by online documentation. For more information, see the MySQL website (http://www.mysql.com) and the Ruby Application Archive.

10.4.3. Interfacing to PostgreSQL

There is an extension available from the RAA that provides access to PostgreSQL also (it works with PostgreSQL 6.5/7.0).

Assuming you already have PostgreSQL installed and set up (and you have a table named testdb), you merely need to follow essentially the same steps as used with other database interfaces in Ruby: Load the module, connect to the database, and then do your work with the tables. You'll probably want a way of executing queries, getting the results of a select back, and working with transactions.

require 'postgres' conn = PGconn.connect("",5432, "", "", "testdb") conn.exec("create table rtest ( number integer default 0 );") conn.exec("insert into rtest values ( 99 )") res = conn.query("select * from rtest") # res id [["99"]]

The PGconn class contains the connect method, which takes the typical database connection parameters such as host, port, database, username, and login; but it also takes options and tty parameters in positions three and four. We have connected in our example to the UNIX socket via a privileged user, so we did not need a username or password, and the host, options, and tty were left empty. The port must be an integer, whereas the others are strings. An alias for this is the new method.

The next thing of interest is working with our tables; this requires some means to do queries. The instance methods PGconn#exec and PGconn#query are just what we need.

The exec method sends its string parameter as an SQL query request to PostgreSQL and returns a PGresult instance on success. On failure, it raises a PGError exception.

The query method also sends its string parameter as an SQL query request to PostgreSQL. However, it returns an array on success. The returned array is actually an array of tuples. On failure, it returns nil, and error details can be obtained by the error method call.

There is a special method for inserting values into a specific table, called insert_table. Despite the name, it actually means "insert into table." This method returns a PGconn object.

conn.insert_table("rtest",[[34]]) res = conn.query("select * from rtest") # res is [["99"], ["34"]]

This inserts one row of values into the table rtest. For our simple example, there is only one column to begin with. Notice that the PGresult object res shows updated results with two tuples. We will discuss PGresult methods shortly.

Other potentially useful methods from the PGconn class include the following:

db Returns the connected database name.
host Returns the connected server name.
user Returns the authenticated username.
error Returns the error message about connection.
finish,close Closes the backend connection.
loimport(file) Imports a file to a large object; returns the PGlarge instance on success. On failure, raise the PGError exception.
loexport(oid, file) Saves a large object of oid to a file.
locreate([mode]) Returns the PGlarge instance on success. On failure, raise the PGError exception.
loopen(oid, [mode]) Opens a large object of oid; returns the PGlarge instance on success. The mode argument specifies the mode for the opened large object, which is either "INV_READ" or "INV_WRITE" (if the mode is omitted, the default is "INV_READ").
lounlink(oid) Unlinks (deletes) the Postgres large object of oid.

Notice that the last five methods in the preceding list (loimport, loexport, locreate, loopen, and lounlink) involve objects of the PGlarge class. The PGlarge class has specific methods for accessing and changing its own objects. (The objects are created as a result of the PGconn instance methods loimport, locreate, and loopen, discussed in the preceding list.)

The following is a list of PGlarge methods:

open([mode]) Opens a large object. The mode argument specifies its mode as with the preceding PGconn#loopen).
close Closes a large object (also closed when they are garbage-collected).
read([length]) Attempts to read "length" bytes from large object. If no length is given, all data is read.
write(str) Writes the string to the large object and returns the number of bytes written.
tell Returns the current position of the pointer.
seek(offset, whence) Moves the pointer to offset. The possible values for whence are SEEK_SET, SEEK_CUR, and SEEK_END (or 0,1,2).
unlink Deletes large object.
oid Returns the large object oid.
size Returns the size of large object.
export(file) Saves a large object of oid to a file.

Of more interest to us are the instance methods (shown in the list below) of the PGresult class, which are created as the result of queries. (Use PGresult#clear when finished with these objects to improve memory performance.)

result Returns the query result tuple in the array.
each Iterator
[] Accessor
fields Returns the array of the fields of the query result.
num_tuples Returns the number of tuples of the query result.
fieldnum(name) Returns the index of the named field.
type(index) Returns an integer corresponding the type of the field.
size(index) Returns the size of the field in bytes. 1 indicates the field is variable length.
getvalue(tup_num, field_num) Returns the field value for the given parameters. tup_num is the same as row number.
getlength(tup_num, field_num) Returns the length of the field in bytes.
cmdstatus Returns the status string of the last query command.
clear Clears the PGresult object.

10.4.4. Interfacing to LDAP

There are at least three usable LDAP libraries for Ruby. The Ruby/LDAP library by Takaaki Tateishi is a fairly "thin" wrapper. If you are an LDAP expert, this might be sufficient for your needs; if not, you may find it a little complex. Here is an example:

conn = LDAP::Conn.new("rsads02.foo.com") conn.bind("CN=username,CN=Users,DC=foo,DC=com","password") do |bound|   bound.search("DC=foo,DC=com", LDAP::LDAP_SCOPE_SUBTREE,                "(&(name=*) (objectCategory=person))", ['name','ipPhone']) do |user|     puts "#{user['name']} #{user['ipPhone']}"   end end

ActiveLDAP is patterned, not surprisingly, after ActiveRecord. Here is a sample usage of it, taken from its home page:

require 'activeldap' require 'examples/objects/user' require 'password' # Instantiate Ruby/ActiveLDAP connection, etc ActiveLDAP::Base.connect(:password_block => Proc.new { Password.get('Password: ') },                               :allow_anonymous => false) # Load a user record (Class defined in the examples) wad = User.new('wad') # Print the common name p wad.cn # Change the common name wad.cn = "Will" # Save it back to LDAP wad.write

There is also a newer library by Francis Cianfrocca, which is preferred by some. Here is a sample of its usage:

require 'net/ldap' ldap = Net::LDAP.new :host => server_ip_address,      :port => 389,      :auth => {            :method => :simple,            :username => "cn=manager,dc=example,dc=com",            :password => "opensesame"      } filter = Net::LDAP::Filter.eq( "cn", "George*" ) treebase = "dc=example,dc=com" ldap.search( :base => treebase, :filter => filter ) do |entry|   puts "DN: #{entry.dn}"   entry.each do |attribute, values|     puts "   #{attribute}:"     values.each do |value|       puts "      --->#{value}"     end   end end p ldap.get_operation_result

It is a matter of opinion which of these libraries is better. I urge you to investigate all of them and form your own opinion.

10.4.5. Interfacing to Oracle

Oracle is one of the most powerful and popular database systems in the world. Naturally there have been multiple attempts to interface this database with Ruby. Currently the best library out there is OCI8, the work of Kubo Takehiro.

Despite the name, OCI8 works well with versions of Oracle later than 8. However, because it is not completely mature, it may not offer access to all the newest features of later versions.

The API is split into a "thin" wrapper (the low-level API that closely follows the Oracle Call Interface API) and a higher-level API. In most cases, the higher-level API is the only one you need concern yourself with. In the future, the low-level API may be undocumented.

The OCI8 module encompasses the Cursor and Blob classes. The OCIException class is the superclass of the three classes of exceptions that may be thrown during a database operation: OCIError, OCIBreak, and OCIInvalidHandle.

Connect with the database with OCI8.new, at minimum passing in a username and password. A handle is returned that you can then use to perform queries. Here is an example:

require 'oci8' session = OCI8.new('user', 'password') query = "SELECT TO_CHAR(SYSDATE, 'YYYY/MM/DD') FROM DUAL" cursor = session.exec(query) result = cursor.fetch         # Only one iteration in this case cursor.close session.logoff

The preceding example shows a cursor being manipulated, although we only do a single fetch against it before we close it. Of course, we can retrieve multiple rows also:

query = 'select * from some_table' cursor = session.exec(query) while row = cursor.fetch   puts row.join(",") end cursor.close # Or with a block: nrows = session.exec(query) do |row|   puts row.join(",") end

Bind variables "look like" symbols in the query. There are multiple ways to associate these bind variables with actual values.

session = OCI8.new("user","password") query = "select * from people where name = :name" # One way... session.exec(query,'John Smith') # Another... cursor = session.parse(query) cursor.exec('John Smith') # And another... cursor = session.parse(query) cursor.bind_param(':name','John Smith')  # bind by name cursor.exec # And another. cursor = session.parse(query) cursor.bind_param(1,'John Smith')        # bind by position cursor.exec

There is also a DBI adapter for those who prefer DBI. For more information, consult the OCI8 documentation.

10.4.6. Using the DBI Wrapper

In theory, DBI allows database-independent access to your database. That is, the code should work the same whether the underlying database is Oracle, MySQL, PostgreSQL, or something else. Normally only a single line of code should change, specifying which adapter to use. Sometimes DBI breaks down in the face of complex or database-specific operations, but for most day-to-day work, it is a convenient tool.

Let's assume we have an Oracle database, and we are using the driver or adapter that comes with OCI8. The connect method is given enough information to connect successfully with the database. After that, all is mostly intuitive.

require "dbi" db = DBI.connect("dbi:OCI8:mydb", "user", "password") query = "select * from people" stmt = db.prepare(query) stmt.execute while row = stmt.fetch do   puts row.join(",") end stmt.finish db.disconnect

In the preceding example, the prepare can be thought of as compiling or parsing the query, which is then executed. The fetch method retrieves a single row from the resultset, returning nil when there are no more rows (hence the while loop in the preceding code). The finish can be thought of as a close or deallocate operation.

For a full treatment of all the features of DBI, consult any reference. For a list of all database drivers, refer to sources such as RubyForge and the RAA.

10.4.7. Object-Relational Mappers (ORMs)

The traditional relational database is good at what it does. It handles queries in an efficient way without foreknowledge of the nature of those ad hoc queries. But this model is not very object-oriented.

The ubiquity of both these models (RDBMS and OOP) and the "impedance mismatch" between them has led many people to try to bridge this gap. The software bridge that accomplishes this is called an Object-Relational Mapper (ORM).

There are many ways of approaching this problem. All have their advantages and disadvantages. Here we'll take a short look at two popular ORMs, ActiveRecord and Og (the latter of which stands for object graph).

The ActiveRecord library for Ruby is named after Martin Fowler's "Active Record" design pattern. In essence, it ties database tables to classes so that the data can be manipulated intuitively without SQL. To be more specific, it "wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data" (see Patterns of Enterprise Application Architecture by Martin Fowler, Addison Wesley, 2003 [ISBN: 0-321-12742-0e]).

Each table is described by inheriting from ActiveRecord::Base and then customizing the class. As with DBI, we have to connect giving it enough information to identify and reach the database. Here is a short example of how it all works:

require 'active_record' ActiveRecord::Base.establish_connection(:adapter => "oci8",                                         :username => "username",                                         :password => "password",                                         :database => "mydb",                                         :host => "myhost") class SomeTable < ActiveRecord::Base   set_table_name "test_table"   set_primary_key "some_id" end SomeTable.find(:all).each do |rec|   # process rec as needed... end item = SomeTable.new item.id = 1001 item.some_column = "test" item.save

The API is rich and complex. I recommend that you study whatever tutorials you can find on the Web or in books. Because this library is an integral part of "Ruby on Rails," it will be covered in materials on that topic.

Og is different from ActiveRecord in the sense that the latter is more database-centric, and the former is more object-centric. Og can generate a database schema from Ruby class definitions (rather than vice versa).

Og is a different way of thinking, and it is less common than ActiveRecord. But I believe it has interesting qualities and should be considered a powerful, usable ORM, especially if you design your database after your objects.

When we define a class to be stored, we use the property method, which is like attr_accessor except that it has a type (class) associated with it.

class SomeClass   property :alpha, String   property :beta, String   property :gamma, String end

Other data types are Integer, Float, Time, Date, and others. It is also potentially possible to have a property that is an arbitrary Ruby object.

Connect to the database much as you would with ActiveRecord or DBI.

db = Og::Database.new(:destroy  => false,                       :name => 'mydb',                       :store  => :mysql,                       :user     => 'hal9000',                       :password => 'chandra')

Every object has a save method that does an actual insert of the object into the database:

obj = SomeClass.new obj.alpha  = "Poole" obj.beta   = "Whitehead" obj.gamma  = "Kaminski" obj.save

There are also methods that describe object relationships in classical database terms:

class Dog   has_one :house   belongs_to :owner   has_many :fleas end

These methods, along with others such as many_to_many and refers_to, can assist in creating complex object and table relationships.

Og is also too large to document here. Refer to other resources online (such as those at http://oxyliquit.de) for more information.