Running External Programs

	Ruby Way By Hal Fulton

A language can't be a glue language unless it can run external programs. Ruby offers more than one way to do this.

We can't resist mentioning here that if you are going to run an external program, you should be certain you know what that program is doing. We're thinking about viruses and other potentially destructive programs here. Don't just run any old command string, especially if it came from a source outside the program. This is true regardless of whether the application is Web-based.

Using system and exec

The system method (in Kernel) is equivalent to the C call of the same name. It will execute the given command in a subshell.

 system("/usr/games/fortune") # Output goes to stdout as usual...

Note that the second and later parameters, if present, are used as a list of arguments; in most cases, the arguments can also be specified as part of the command string with the same effect. The difference is that a single command string is handed to the shell, so filename expansion and other shell processing apply to it, whereas arguments passed separately reach the program untouched.

 system("rm", "/tmp/file1")
 system("rm /tmp/file2")
 # Both the above work fine.
 # However, below, there's a difference...
 system("echo *")     # Print list of all files
 system("echo", "*")  # Print an asterisk (no filename
                      # expansion done)
 # More complex command lines also work.
 system("ls -l | head -n 1")
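The examples above ignore what system returns. As a minimal sketch (the child commands and the program name definitely-not-a-real-command-12345 are our own illustrations, not part of the original examples), we can inspect the return value and the special variable $? to find out how the command fared:

```ruby
require "rbconfig"

# Use the running Ruby interpreter as a portable external program.
ruby = RbConfig.ruby

ok     = system(ruby, "-e", "exit 0")   # true: command ran and exited with 0
failed = system(ruby, "-e", "exit 3")   # false: command ran but exited nonzero
code   = $?.exitstatus                  # exit code of the most recent child

# nil means the command could not be executed at all.
missing = system("definitely-not-a-real-command-12345")
```

Checking the result this way is usually preferable to assuming that the command worked.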

Let's look at how this works on the Windows family of operating systems. For a simple executable, the behavior should be the same. Depending on your exact variant of Ruby, invoking a shell builtin might require a reference to cmd.exe, the Windows command processor (which might be command.com on some versions). Both cases, executable and builtin, are shown here:

 system("notepad.exe", "myfile.txt")  # No problem...
 system("cmd /c dir", "somefile")     # 'dir' is a builtin!

Another solution to this is to use the Win32API library and define your own version of the system method.

 require "Win32API"
 def system(cmd)
   sys = Win32API.new("crtdll", "system", ['P'], 'L')
   sys.Call(cmd)
 end
 system("dir")  # cmd /c not needed!

So the behavior of system can be made relatively OS-independent. But, getting back to the big picture, if you want to capture the output (for example, in a variable), system of course isn't the right way (see the next section).

We'll also mention exec here. The exec method behaves much the same as system, except that the new process actually overlays or replaces the current one. Thus any code following the exec won't be executed.

 puts "Here's a directory listing:"
 exec("ls", "-l")
 puts "This line is never reached!"

Command Output Substitution

The simplest way to capture command output is to use the backtick (also called backquote or grave accent) to delimit the command. Here are a couple of examples:

 listing = `ls -l`  # Multiple lines in one string
 now = `date`       # "Mon Mar 12 16:50:11 CST 2001"

The generalized delimiter %x calls the backquote operator (which is really a Kernel method). It works essentially the same way:

 listing = %x(ls -l)
 now = %x(date)

The %x form is often useful when the string to be executed contains characters such as single and double quotes.
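As a small illustration of that point (the echoed text is our own made-up example, and a Unix-like shell is assumed), here is a command containing both kinds of quotes. We also look at $?, which is set after a backquote or %x call just as it is after system:

```ruby
# The command contains double quotes and an apostrophe; with %x
# there is nothing to escape on the Ruby side.
greeting = %x(echo "it's here")

# $? holds the status of the command that just ran.
succeeded = $?.success?
```

The captured string keeps its trailing newline, so chomp is often applied to the result.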

Because the backquote method really is a method (in some sense), it is possible to override it. Here we change the functionality so that we return an array of lines rather than a single string. Of course, we have to save an alias to the old method so that we can call it.

 alias old_execute `
 def `(cmd)
   out = old_execute(cmd)  # Call the old backtick method
   out.split("\n")         # Return an array of strings!
 end
 entries = `ls -l /tmp`
 num = entries.size                    # 95
 first3lines = %x(ls -l | head -n 3)
 how_many = first3lines.size           # 3

Note that, as we show here, the functionality of %x is affected when we perform this redefinition.

Here is another example. Here we append a "shellism" to the end of the command to ensure that standard error is mixed with standard output:

 alias old_execute `
 def `(cmd)
   old_execute(cmd + " 2>&1")
 end
 entries = `ls -l /tmp/foobar`
 # "/tmp/foobar: No such file or directory\n"

There are many other ways we could change the default behavior of the backquote.

Manipulating Processes

We mention process manipulation in this section even though a new process might or might not involve calling an external program. The principal way to create a new process is with the fork method. This takes its name from Unix tradition, from the idea of a fork in the path of execution, like a fork in the road.

The fork method in Kernel (also found in the Process module) shouldn't be confused with the Thread class method of the same name.

There are two ways of invoking the fork method. The first is the more Unix-like way; we simply call it and test its return value. If that value is nil, we are in the child process; otherwise we execute the parent code. The value returned to the parent is actually the process ID (or pid) of the child.

 pid = fork
 if pid.nil?
   puts "Ah, I must be the child."
   puts "I guess I'll speak as a child."
 else
   puts "I'm the parent."
   puts "Time to put away childish things."
 end

In this unrealistic example, the output might be interleaved or the parent's output might appear first. For purposes of this example, it's irrelevant.

We should also note that the child process might outlive the parent. We've seen that this isn't the case with Ruby threads, but system-level processes are entirely different.

The second form of fork takes a block. The code in the block comprises the child process. Our previous example could thus be rewritten in this simpler way:

 fork do
   puts "Ah, I must be the child."
   puts "I guess I'll speak as a child."
 end
 puts "I'm the parent."
 puts "Time to put away childish things."

The pid is still returned, of course. We just don't show it here.
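Since exec replaces the process that calls it, one common arrangement is to call exec inside a forked child so that the parent carries on. The following is only a sketch on a Unix-like system; the child command (the current Ruby interpreter exiting with code 7) is our own stand-in for a real external program:

```ruby
require "rbconfig"

# The child is replaced by the external program; the parent keeps the pid.
pid = fork do
  exec(RbConfig.ruby, "-e", "exit 7")
end

# Back in the parent: wait for that child and read its exit code.
Process.wait(pid)
code = $?.exitstatus
```

This is roughly what system does for us behind the scenes, but doing it by hand lets the parent do other work before it waits.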

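When we want to wait for a process to finish, we can call the wait method in the Process module. It waits for any child to exit and returns the process ID of that child. The wait2 method behaves similarly except that it returns a two-value array consisting of the pid and the exit status. In the Ruby version this text was written for, that status is a left-shifted integer (768 for an exit code of 3); current versions return a Process::Status object instead, whose exitstatus method gives the exit code.

 pid1 = fork { sleep 5; exit 3 }
 pid2 = fork { sleep 2; exit 3 }
 Process.wait    # Returns pid2
 Process.wait2   # Returns [pid1,768] (or [pid1, status] in current Ruby)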
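In current Ruby versions the second element returned by wait2 is a Process::Status object rather than a raw integer. A minimal sketch, assuming a Unix-like system and passing the pid explicitly so that we know which child we are collecting:

```ruby
pid = fork { exit 3 }

# wait2 returns the pid together with a Process::Status object.
waited_pid, status = Process.wait2(pid)

exit_code = status.exitstatus   # The code passed to exit in the child
normal    = status.exited?      # true when the child ended via exit
```

Working with exitstatus spares us from shifting and masking the raw status value ourselves.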

To wait for a specific child, use waitpid and waitpid2, respectively.

 pid3 = fork { sleep 5; exit 3 }
 pid4 = fork { sleep 2; exit 3 }
 Process.waitpid(pid4, Process::WNOHANG)   # Returns pid4 once it has exited, nil before that
 Process.waitpid2(pid3, Process::WNOHANG)  # Returns [pid3,768] once it has exited, nil before that

If the second parameter is omitted, the call blocks until the child exits. With Process::WNOHANG, the call returns immediately, giving nil if the child has not yet exited. The flag can be ORed with Process::WUNTRACED to catch child processes that have been stopped. This second parameter is rather OS sensitive; experiment before relying on its behavior.
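Because a non-blocking waitpid returns nil while the child is still running, it lends itself to polling. Here is a rough sketch on a Unix-like system; the sleep intervals are arbitrary values chosen for illustration:

```ruby
pid = fork { sleep 0.2; exit 0 }

polls  = 0
reaped = nil
# WNOHANG makes waitpid return nil right away while the child is alive.
until (reaped = Process.waitpid(pid, Process::WNOHANG))
  polls += 1     # The parent is free to do other work here
  sleep 0.05
end

exit_code = $?.exitstatus
```

In a real program, the body of the loop would do something more useful than counting.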

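The exit! method will exit immediately from a process (bypassing any exit handlers). The integer value, if specified, will be returned as a return code. In the Ruby version described here, -1 (not 0) is the default; current versions default to a generic failure status, so pass the code explicitly if it matters.

 pid1 = fork { exit! }      # Return the default exit code (-1 here)
 pid2 = fork { exit! 0 }    # Return 0 exit code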
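To see the difference that bypassing exit handlers makes, we can register an at_exit handler in a child and watch whether it runs. This is only a sketch for a Unix-like system; the helper name child_output and the message text are hypothetical, and a pipe carries the handler's output back to the parent:

```ruby
# Run a child that registers an exit handler, then leaves either
# through exit! (use_bang true) or through plain exit.
def child_output(use_bang)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    at_exit { writer.puts "handler ran" }
    use_bang ? exit!(0) : exit(0)
  end
  writer.close          # Parent keeps only the read end
  Process.wait(pid)
  output = reader.read  # Everything the child wrote before exiting
  reader.close
  output
end

with_exit = child_output(false)   # Handler runs and writes its message
with_bang = child_output(true)    # Handler is skipped; nothing is written
```

The same applies to handlers inherited from the parent, which is one reason exit! is often used in forked children.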

The pid and ppid methods will return the process ID of the current process and the parent process, respectively.

 proc1 = Process.pid
 fork do
   if Process.ppid == proc1
     puts "proc1 is my parent"  # Prints this message
   else
     puts "What's going on?"
   end
 end

The kill method can be used to send a Unix-style signal to a process. The first parameter can be an integer, a POSIX signal name including the SIG prefix, or a non-prefixed signal name. The second parameter represents a pid; if it is zero, the signal goes to the process group of the current process, which includes the current process itself.

 Process.kill(1,pid1)         # Send signal 1 to process pid1
 Process.kill("HUP",pid2)     # Send SIGHUP to pid2
 Process.kill("SIGHUP",pid2)  # Send SIGHUP to pid2 (prefixed name)
 Process.kill("SIGHUP",0)     # Send SIGHUP to self (and our process group)

The Kernel.trap method can be used to handle such signals. It typically takes a signal number or name and a block to be executed.

 trap(1) { puts "Caught signal 1" }
 sleep 2
 Process.kill(1,0)  # Send to self
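Signal names work with trap just as they do with kill. In the following sketch (Unix-like systems only), the choice of USR1 and the array used to record the event are our own; we send the signal to Process.pid so that only this process receives it, and we restore the previous handler afterward:

```ruby
caught = []

# trap returns the previous handler so that we can restore it later.
previous = trap("USR1") { caught << :usr1 }

Process.kill("USR1", Process.pid)   # Signal only this process

# Give the interpreter a moment to run the handler.
20.times do
  break unless caught.empty?
  sleep 0.05
end

trap("USR1", previous)              # Put the old handler back
```

Keeping the handler short, as here, is wise because it can interrupt the program at almost any point.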

For advanced uses of trap, consult Ruby and Unix references.

The Process module also has methods for examining and setting such attributes as user ID, effective user ID, priority, and others. Consult any Ruby reference for details.

Manipulating Standard Input/Output

We've shown how IO.popen and IO.pipe work in Chapter 4. But there is a library we haven't mentioned that can prove handy at times.

The open3 library contains a method popen3, which returns an array of IO objects corresponding to the standard input, standard output, and standard error of the process kicked off by the popen3 call. (Current Ruby versions append a fourth element, a thread that can be used to wait for the process.) Here's an example:

 require "open3"
 filenames = %w[ file1 file2 this that another one_more ]
 inp, out, err = Open3.popen3("xargs", "ls", "-l")
 filenames.each { |f| inp.puts f }   # Write to the process's stdin
 inp.close                           # Close is necessary!
 output = out.readlines              # Read from its stdout
 errout = err.readlines              # Also read from its stderr
 puts "Sent #{filenames.size} lines of input."
 puts "Got back #{output.size} lines from stdout"
 puts "and #{errout.size} lines from stderr."

This contrived little example does an ls -l on each of the specified filenames and captures the standard output and standard error separately. Note that the close is needed so that the subprocess will be aware that end of file has been reached.
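In current Ruby versions, popen3 also accepts a block and passes it a fourth value, a thread whose value is the exit status of the process. The sketch below assumes such a version; the child program (a one-line Ruby script that upcases its input and writes a note to standard error) is made up for illustration:

```ruby
require "open3"
require "rbconfig"

# A tiny child program: upcase stdin to stdout, write a note to stderr.
child_code = 'STDOUT.puts STDIN.read.upcase; STDERR.puts "done"'

stdout_text = stderr_text = status = nil
Open3.popen3(RbConfig.ruby, "-e", child_code) do |inp, out, err, wait_thr|
  inp.puts "hello"
  inp.close                   # Signal end of file to the child
  stdout_text = out.read
  stderr_text = err.read
  status = wait_thr.value     # Process::Status of the finished child
end
```

With the block form, the streams are closed for us when the block ends. For large amounts of output, reading the two streams one after the other can stall, so this pattern is best kept to small exchanges.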

For additional information refer to the section "The Shell Library."
