The Ruby Way, Second Edition: Solutions and Techniques in Ruby Programming

15.3. Manipulating Image Data with RMagick

In the last fifteen years, we have been bombarded with more and more graphical information. Computers have now surpassed television sets as the primary supplier of "eye candy" in all its forms. This means that programmers need ways of manipulating all kinds of image data in multiple conflicting formats. In Ruby, the best way to do this is with RMagick, a library created by Tim Hunter.

RMagick is a Ruby binding for the ImageMagick library (or its branch, GraphicsMagick). RMagick is installable as a gem, but to use RMagick, you must have one of these libraries (IM or GM) installed first. If you are on Linux, you probably already have one or the other; if not, you can go to http://imagemagick.org (or http://graphicsmagick.org).

RMagick is just a binding, of course; asking what image formats it supports is the same as asking what image formats are supported by the underlying library. Those, by the way, are all the common ones such as JPG, GIF, PNG, and TIFF, but also dozens of others.

The same is true for the operations RMagick can do. It is essentially limited only by the capabilities of the library it uses; the full API is implemented in RMagick. The API, by the way, is not only rich and complete but overall is a good example of a "Ruby-like" API; it uses symbols, blocks, and method prefixes in a normal way, and most Ruby programmers will find it has an intuitive feel.

The API is also very large. Neither this chapter nor this entire book could cover it in detail. The upcoming sections will give you a good background in RMagick, however, and you can find out anything else you may need from the project website (http://rmagick.rubyforge.org).

15.3.1. Common Graphics Tasks

One of the easiest and most common tasks you might want to perform on an image file is simply to determine its characteristics (width and height in pixels, and so on). Let's look at retrieving a few of these pieces of metadata.

Figure 15.1 shows a pair of simple images that we'll use for this code example (and later examples in the next section). The first one (smallpic.jpg) is a simple abstract picture created with a drawing program; it features a few different shades of gray, a few straight lines, and a few curves. The second is a photograph I took in 2002 of a battered automobile in rural Mexico. Both images were converted to grayscale for printing purposes. Listing 15.6 shows how to read these images and extract a few pieces of information.

Figure 15.1. Two sample image files.

Listing 15.6. Retrieving Information from an Image

require 'rmagick'   # older RMagick releases used: require 'RMagick'

def show_info(fname)
  img = Magick::Image::read(fname).first
  fmt = img.format
  w,h = img.columns, img.rows
  dep = img.depth
  nc  = img.number_colors
  nb  = img.filesize
  xr  = img.x_resolution
  yr  = img.y_resolution
  # Compare against the image's own units; the bare constant is always truthy
  res = img.units == Magick::PixelsPerInchResolution ? "inch" : "cm"
  puts <<-EOF
  File:       #{fname}
  Format:     #{fmt}
  Dimensions: #{w}x#{h} pixels
  Colors:     #{nc}
  Image size: #{nb} bytes
  Resolution: #{xr}/#{yr} pixels per #{res}
  EOF
  puts
end

show_info("smallpic.jpg")
show_info("vw.jpg")

Here is the output of the Listing 15.6 code:

File:       smallpic.jpg
Format:     JPEG
Dimensions: 257x264 pixels
Colors:     248
Image size: 19116 bytes
Resolution: 72.0/72.0 pixels per inch

File:       vw.jpg
Format:     JPEG
Dimensions: 640x480 pixels
Colors:     256
Image size: 55892 bytes
Resolution: 72.0/72.0 pixels per inch

Now let's examine the details of how the code in Listing 15.6 gave us that output. Notice how we retrieve all the contents of a file with Magick::Image::read; because a file (such as an animated GIF) can contain more than one image, this operation actually returns an array of images (and we look at the first one by calling first). We can also use Magick::ImageList.new to read an image file.

The image object has a number of readers such as format (the name of the image format), filesize, depth, and others that are intuitive. It may be less intuitive that the width and height of the object are retrieved by columns and rows, respectively (because we are supposed to think of an image as a grid of pixels). It may not be intuitive to you that the resolution is actually stored as two numbers, but it is (because apparently it can differ horizontally and vertically).
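To see why the two resolution numbers are worth having, here is a minimal sketch in plain Ruby (no RMagick required) that turns a pixel grid and a resolution into a physical size. The helper name print_size is hypothetical, and the sketch assumes the resolution is expressed in pixels per inch, as in the output shown earlier.

```ruby
# Physical size of an image, given its pixel grid and its resolution.
# The arguments mirror the readers used in Listing 15.6 (columns, rows,
# x_resolution, y_resolution); print_size itself is a hypothetical helper.
def print_size(columns, rows, x_res, y_res)
  [columns / x_res.to_f, rows / y_res.to_f]
end

w, h = print_size(640, 480, 72.0, 72.0)   # the values reported for vw.jpg
puts format("%.2f x %.2f inches", w, h)   # prints "8.89 x 6.67 inches"
```

If the horizontal and vertical resolutions differed, the two quotients would simply use different divisors, which is the case the pair of numbers exists to handle.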

There are other properties and pieces of metadata you can retrieve from an image. Refer to the online documentation for RMagick for more details.

One common task we often perform is to convert an image from one format to another. The easy way to do this in RMagick is to read an image in any supported format and then write it to another file. The file extension is used to determine the new format. Needless to say, it does a lot of conversion behind the scenes. Here is a simple example:

img = Magick::Image.read("smallpic.jpg").first   # read returns an array of images
img.write("smallpic.gif")                        # Convert to a GIF
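Because the output format follows from the file extension, converting a batch of files mostly comes down to computing new filenames. The following is a small sketch in plain Ruby; converted_name is a hypothetical helper, not part of RMagick.

```ruby
# Build the output filename for a conversion by swapping the extension.
# converted_name is a hypothetical helper, not an RMagick method.
def converted_name(path, new_ext)
  dir  = File.dirname(path)
  base = File.basename(path, ".*")      # strip the old extension
  name = "#{base}.#{new_ext}"
  dir == "." ? name : File.join(dir, name)
end

puts converted_name("smallpic.jpg", "gif")    # prints "smallpic.gif"
puts converted_name("photos/vw.jpg", "png")   # prints "photos/vw.png"
```

The resulting string is what you would pass to write on the image you have read.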

Frequently we want to change the size of an image (smaller or larger). The four most common methods for this are thumbnail, resize, sample, and scale. All four of these can take either a floating point number (representing a scaling factor) or a pair of numbers (representing the actual new dimensions in pixels). Other differences are summarized in Listing 15.7 and its comments. Where speed is an issue, I urge you to do your own tests on your own machine with your own data.

Listing 15.7. Four Ways to Resize an Image

require 'rmagick'   # older RMagick releases used: require 'RMagick'

img = Magick::ImageList.new("vw.jpg")

# Each of these methods can take a single "factor" parameter
# or a width,height pair

# thumbnail is fastest, especially when reducing to a very small size
pic1 = img.thumbnail(0.2)      # Reduce to 20%
pic2 = img.thumbnail(64,48)    # Reduce to 64x48 pixels

# resize is of medium speed. If a 3rd and/or 4th parameter are
# specified, they are the filter and blur, respectively. The
# filter defaults to LanczosFilter and blur to 1.0
pic3 = img.resize(0.40)        # Reduce to 40%
pic4 = img.resize(320,240)     # Reduce to 320x240
pic5 = img.resize(300,200,Magick::LanczosFilter,0.92)

# sample is also of medium speed (and doesn't do color interpolation)
pic6 = img.sample(0.35)        # Reduce to 35%
pic7 = img.sample(320,240)     # Reduce to 320x240

# scale is slowest in my tests
pic8 = img.scale(0.60)         # Reduce to 60%
pic9 = img.scale(400,300)      # Reduce to 400x300
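One thing to watch with the width,height form: the image is made exactly that size, so a pair that does not match the original proportions will stretch the picture. A minimal sketch in plain Ruby of computing a pair that preserves the aspect ratio follows; fit_within is a hypothetical helper. (RMagick has a resize_to_fit method for the same purpose; check the documentation for your version.)

```ruby
# Largest width,height that fits inside max_w x max_h while keeping the
# original aspect ratio. fit_within is a hypothetical helper.
def fit_within(width, height, max_w, max_h)
  factor = [max_w / width.to_f, max_h / height.to_f].min
  [(width * factor).round, (height * factor).round]
end

p fit_within(640, 480, 100, 100)   # prints [100, 75]
p fit_within(640, 480, 320, 320)   # prints [320, 240]
```

The resulting pair can be splatted into any of the four methods, for example img.thumbnail(*fit_within(img.columns, img.rows, 100, 100)).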

Many other transformations can be performed on an image. Some of these are simple and easy to understand, whereas others are complex. We'll explore a few interesting transformations and special effects in the next section.

15.3.2. Special Effects and Transformations

Some operations we might want to do on an image are to flip it, reverse it, rotate it, distort it, alter its colors, and so on. RMagick provides literally dozens of methods to perform such operations, and many of these are highly "tunable" by their parameters.

Listing 15.8 demonstrates 12 different effects. To make the code a little more concise, the method example simply takes a filename, a symbol corresponding to a method, and a new filename; it basically does a read, a method call, and a write. The individual methods (such as do_rotate) are simple for the most part; these are where the image passed in gets an actual instance method called (and then the resulting image is the return value).

Listing 15.8. Twelve Special Effects and Transformations

require 'rmagick'   # older RMagick releases used: require 'RMagick'

def do_flip(img)
  img.flip
end

def do_rotate(img)
  img.rotate(45)
end

def do_implode(img)
  img.implode(0.65)
end

def do_resize(img)
  img.resize(120,240)
end

def do_text(img)
  text = Magick::Draw.new
  # Taking the Draw object as a block parameter works whether the
  # library instance_evals the block (older releases) or yields to it
  text.annotate(img, 0, 0, 0, 100, "HELLO") do |opts|
    opts.gravity = Magick::SouthGravity
    opts.pointsize = 72
    opts.stroke = 'black'
    opts.fill = '#FAFAFA'
    opts.font_weight = Magick::BoldWeight
    opts.font_stretch = Magick::UltraCondensedStretch
  end
  img
end

def do_emboss(img)
  img.emboss
end

def do_spread(img)
  img.spread(10)
end

def do_motion(img)
  img.motion_blur(0,30,170)
end

def do_oil(img)
  img.oil_paint(10)
end

def do_charcoal(img)
  img.charcoal
end

def do_vignette(img)
  img.vignette
end

def do_affine(img)
  spin_xform = Magick::AffineMatrix.new(1, Math::PI/6, Math::PI/6, 1, 0, 0)
  img.affine_transform(spin_xform)              # Apply the transform
end

###

def example(old_file, meth, new_file)
  img = Magick::ImageList.new(old_file)
  new_img = send(meth,img)
  new_img.write(new_file)
end

example("smallpic.jpg", :do_flip,    "flipped.jpg")
example("smallpic.jpg", :do_rotate,  "rotated.jpg")
example("smallpic.jpg", :do_resize,  "resized.jpg")
example("smallpic.jpg", :do_implode, "imploded.jpg")
example("smallpic.jpg", :do_text,    "withtext.jpg")
example("smallpic.jpg", :do_emboss,  "embossed.jpg")

example("vw.jpg", :do_spread,   "vw_spread.jpg")
example("vw.jpg", :do_motion,   "vw_motion.jpg")
example("vw.jpg", :do_oil,      "vw_oil.jpg")
example("vw.jpg", :do_charcoal, "vw_char.jpg")
example("vw.jpg", :do_vignette, "vw_vig.jpg")
example("vw.jpg", :do_affine,   "vw_spin.jpg")

The methods used here are flip, rotate, implode, resize, annotate, and others. The results are shown in Figure 15.2 in a montage.

Figure 15.2. Twelve special effects and transformations.
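Of the twelve, the affine transform is the least self-explanatory, so here is a sketch in plain Ruby of what the six numbers passed to Magick::AffineMatrix in do_affine mean. The Affine struct and the apply and rotation helpers are hypothetical stand-ins so that the arithmetic runs without RMagick; the formula is the convention documented for ImageMagick's affine matrix, and it is worth confirming against the documentation for your version.

```ruby
# Stand-in for Magick::AffineMatrix with the same six fields, so the
# arithmetic can be run without RMagick. Affine, apply, and rotation
# are hypothetical names.
Affine = Struct.new(:sx, :rx, :ry, :sy, :tx, :ty)

# Map one point through the matrix, using the convention documented for
# ImageMagick:  x' = sx*x + ry*y + tx,  y' = rx*x + sy*y + ty
def apply(m, x, y)
  [m.sx * x + m.ry * y + m.tx,
   m.rx * x + m.sy * y + m.ty]
end

# A pure rotation by theta uses cosines and sines in the four slots.
def rotation(theta)
  Affine.new(Math.cos(theta), Math.sin(theta),
             -Math.sin(theta), Math.cos(theta), 0, 0)
end

# The matrix from do_affine: scale factors of 1 plus equal cross terms,
# which skews the image along both axes.
skew = Affine.new(1, Math::PI / 6, Math::PI / 6, 1, 0, 0)

p apply(skew, 1, 0)                        # x is unchanged, y picks up PI/6
p apply(rotation(Math::PI / 2), 1, 0).map { |v| v.round(6) }
```

Under that convention, the matrix in the listing behaves as a skew in both directions rather than a pure rotation, which is consistent with the slanted look of the last panel in the montage.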

Many other transformations can be performed on an image. Consult the online documentation at http://rmagick.rubyforge.org.

15.3.3. The Drawing API

RMagick has an extensive drawing API for drawing lines, polygons, and curves of various kinds. It deals with filling, opacity, colors, text fonts, rotating/skewing, and other issues.

A full treatment of the API is beyond the scope of this book. Let's look at a simple example, however, to understand a few concepts.

Listing 15.9 shows a program that draws a simple grid on the background and draws a few filled shapes on that grid. The image is converted to grayscale, resulting in the image shown in Figure 15.3.

Listing 15.9. A Simple Drawing

require 'rmagick'   # older RMagick releases used: require 'RMagick'

img = Magick::ImageList.new
img.new_image(500, 500)

purplish = "#ff55ff"
yuck = "#5fff62"
bleah = "#3333ff"

line = Magick::Draw.new
50.step(450,50) do |n|
  line.line(n,50, n,450)  # vert line
  line.draw(img)
  line.line(50,n, 450,n)  # horiz line
  line.draw(img)
end

# Draw a circle
cir = Magick::Draw.new
cir.fill(purplish)
cir.stroke('black').stroke_width(1)
cir.circle(250,200, 250,310)
cir.draw(img)

# Draw a rectangle
rect = Magick::Draw.new
rect.stroke('black').stroke_width(1)
rect.fill(yuck)
rect.rectangle(340,380,237,110)
rect.draw(img)

# Draw a triangle
tri = Magick::Draw.new
tri.stroke('black').stroke_width(1)
tri.fill(bleah)
tri.polygon(90,320,160,370,390,120)
tri.draw(img)

# Convert to grayscale and save
img = img.quantize(256,Magick::GRAYColorspace)
img.write("drawing.gif")

Figure 15.3. A simple drawing.

Let's examine Listing 15.9 in detail. We start by creating an "empty" image with ImageList.new and then calling new_image on the result. Think of this as giving us a "blank canvas" of the specified size (500 by 500 pixels).

For convenience, let's define a few colors (with creative names such as purplish and yuck). These are strings that specify colors just as we would in HTML. The underlying xMagick library is also capable of understanding many color names such as "red" and "black"; when in doubt, experiment or specify the color in hex.
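If the hex notation is unfamiliar, each pair of digits is one color channel (red, green, blue) from 00 to ff. A minimal sketch in plain Ruby that decodes the three colors from Listing 15.9 follows; hex_to_rgb is a hypothetical helper for illustration only.

```ruby
# Split an HTML-style "#rrggbb" string into red, green, blue (0-255).
# hex_to_rgb is a hypothetical helper for illustration only.
def hex_to_rgb(color)
  digits = color.delete_prefix("#")
  unless digits =~ /\A\h{6}\z/
    raise ArgumentError, "expected #rrggbb, got #{color.inspect}"
  end
  digits.scan(/\h\h/).map { |pair| pair.to_i(16) }
end

p hex_to_rgb("#ff55ff")   # purplish: prints [255, 85, 255]
p hex_to_rgb("#5fff62")   # yuck:     prints [95, 255, 98]
p hex_to_rgb("#3333ff")   # bleah:    prints [51, 51, 255]
```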

We then create a drawing object called line; this is the Ruby object corresponding to the graphical object that we will see on the screen. The variable is sometimes named gc or something similar (probably standing for graphics context), but a more descriptive name seems natural here.

We then call the method line on our drawing object (which admittedly gets a little confusing). In fact, we call it repeatedly, twice in each iteration of a loop. If you spend a moment studying the coordinates, you'll see that each iteration of the loop draws a horizontal line and a vertical one.

After each line call, we call draw on the same drawing object and pass in the image reference. This is an essential step because it is when the graphical object actually gets added to the canvas.
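If you would rather check the loop's coordinates than study them, this sketch in plain Ruby collects the endpoints that would be passed to line.line without drawing anything. The helper grid_segments is hypothetical.

```ruby
# Collect the endpoints the loop in Listing 15.9 passes to line.line,
# without drawing anything. grid_segments is a hypothetical helper.
def grid_segments(first, last, step)
  segments = []
  first.step(last, step) do |n|
    segments << [n, first, n, last]   # vertical line at x = n
    segments << [first, n, last, n]   # horizontal line at y = n
  end
  segments
end

lines = grid_segments(50, 450, 50)
puts lines.length     # prints 18 (nine vertical, nine horizontal)
p lines.first(2)      # prints [[50, 50, 50, 450], [50, 50, 450, 50]]
```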

If you are like me, a call such as shape.draw(image) may be a little confusing. In general, my method calls look like:

big_thing.operation(little_thing)
# For example: dog.wag(tail)

But this call feels to me more like:

little_thing.operation(big_thing)
# Continuing the analogy: tail.wag(dog)

But this idiom is actually common, especially in the realm of drawing programs and GUI frameworks. And it makes perfect sense in a classic OOP way: A shape should know how to draw itself, implying it should have a draw method. It needs to know where to draw itself, so it needs the canvas (or whatever) passed in.

But if you're not like me, you were never bothered by the question of which object should be the receiver. That puts you at a tiny advantage.
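Either way, the idiom is easy to see in miniature. Here is a toy sketch in plain Ruby in which a shape draws itself onto a canvas it is handed. TextCanvas and HorizontalLine are hypothetical classes that only imitate the shape of the RMagick calls.

```ruby
# A toy canvas and shape showing the little_thing.operation(big_thing) idiom.
# Both classes are hypothetical; they only imitate the shape of the RMagick calls.
class TextCanvas
  def initialize(width, height)
    @cells = Array.new(height) { Array.new(width, ".") }
  end

  def plot(x, y, mark = "#")
    @cells[y][x] = mark
  end

  def to_s
    @cells.map(&:join).join("\n")
  end
end

class HorizontalLine
  def initialize(x1, x2, y)
    @x1, @x2, @y = x1, x2, y
  end

  # The shape knows its own geometry; the canvas is just where it lands.
  def draw(canvas)
    (@x1..@x2).each { |x| canvas.plot(x, @y) }
    canvas
  end
end

canvas = TextCanvas.new(6, 3)
HorizontalLine.new(1, 4, 1).draw(canvas)
puts canvas
```

The canvas needs only a plot method and knows nothing about lines; adding a new shape means writing a new class with its own draw, with no changes to the canvas.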

So after we draw the grid of lines, we then draw a few shapes. The circle method takes the center of the circle and a point on the circle as parameters. (Notice we don't draw by specifying the radius!) The rectangle method is even simpler; we draw it by specifying the upper left-hand corner (lower-numbered coordinates) and the lower right-hand corner (higher-numbered coordinates). (Listing 15.9 happens to pass the lower right-hand corner first.) Finally, we draw a triangle, which is just a special case of a polygon; we specify each point in order, and the final line (from end point to start point) is added automatically.
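If you think in terms of a center and a radius, the conversion to what circle expects is short. Here is a sketch in plain Ruby; circle_args and radius_of are hypothetical helpers, not part of the drawing API.

```ruby
# Convert center-plus-radius into the four numbers that circle expects
# (center, then any point on the perimeter). Both helpers are hypothetical.
def circle_args(cx, cy, radius)
  [cx, cy, cx + radius, cy]
end

# Recover the radius from a center and a perimeter point.
def radius_of(cx, cy, px, py)
  Math.hypot(px - cx, py - cy)
end

p circle_args(250, 200, 110)          # prints [250, 200, 360, 200]
puts radius_of(250, 200, 250, 310)    # prints 110.0 (the circle in Listing 15.9)
```

With a Draw object in hand, the result can be splatted straight into the call, as in cir.circle(*circle_args(250, 200, 110)).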

We also call a few methods on each of these graphical objects that we haven't looked at yet. Look at this "chained" call:

shape.stroke('black').stroke_width(1)

This gives us a "pen" of sorts; it draws in black ink with a width of 1 pixel. The color of the stroke actually does matter in many cases, especially when we are trying to fill a shape with a color.

That of course is the other method we call on our three shapes. We call fill to specify what color it should have. (There are other more complex kinds of filling involving hatching, shading, and so on.) The fill method replaces the interior color of the shape with the specified color, knowing that the stroke color serves as a boundary between "inside" and "outside" the shape.
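Calls such as stroke and stroke_width can be chained because each one returns the object it was called on. Here is a toy sketch of that pattern in plain Ruby; Pen is a hypothetical class, not part of RMagick, and it records settings rather than drawing anything.

```ruby
# A toy pen whose setters return self, which is what makes chaining possible.
# Pen is a hypothetical class, not part of RMagick.
class Pen
  attr_reader :settings

  def initialize
    @settings = {}
  end

  def stroke(color)
    @settings[:stroke] = color
    self            # returning self lets the next call chain on
  end

  def stroke_width(pixels)
    @settings[:stroke_width] = pixels
    self
  end

  def fill(color)
    @settings[:fill] = color
    self
  end
end

pen = Pen.new.stroke("black").stroke_width(1).fill("#ff55ff")
p pen.settings
```

Whether you chain the calls or write them on separate lines, as Listing 15.9 does with fill, is purely a matter of taste.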

Numerous other methods in the drawing API deal with opacity, spatial transformations, and many more things. There are methods that analyze, draw, and manipulate graphical text strings. There is even a special RVG API (Ruby Vector Graphics) that is conformant to the W3C recommendation on SVG (Scalable Vector Graphics).

There is no room here to document these many features. For more information, go to http://rmagick.rubyforge.org as usual.