What Else Does the Shell Do? | Solaris Operating Environment Boot Camp


Team-Fly

	Solaris™ Operating Environment Boot Camp By David Rhodes, Dominic Butler

	Chapter 5. Shells

When you type a command at the shell prompt, the shell analyzes the line you have typed and makes changes to it before executing it. In fact, the shell reads each line a number of times before deciding it is happy with it.

Before we go through the main actions undertaken by the shell, it would make sense to ensure that you are familiar with how UNIX handles reading from and writing to files. This is one of the strong features of UNIX and has contributed to its success and longevity.

File Redirection

For a process to read from or write to a file, the file must first be opened. This is done by the operating system on the program's behalf, so you won't need to do it yourself, but you do need to be aware of it. When the file is opened, Solaris gives it a file descriptor, which is a number. The numbers start at 0, and the lowest unused number will always be assigned. Once a file is open, it is accessed by the file descriptor rather than the name. The sequence of events is that the program says to the operating system (or the Solaris kernel, to be more precise): "Please open this file for me and, when you've done it, give me the file descriptor you have assigned to it." From that moment on, every time the program wants to read from or write to the file it does so via the file descriptor. When the program has finished using the file it should ask the operating system to close it, which frees the file descriptor so that it can be reused when another file is opened. If the program forgets to tidy up after itself by closing the files it had open, they will all be closed automatically when the program stops running. File descriptors are allocated per process rather than across all running processes, so even though many processes will be using the same file descriptor number, that does not mean they have the same file open.
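The open, use, and close cycle described above can be sketched from the shell itself with the exec built-in. This is only an illustration: the scratch directory and the choice of descriptor 3 are assumptions, and the shell lets you pick the number yourself, whereas a program asking the kernel directly would simply be handed the lowest free one.

```shell
# Hypothetical scratch file, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'first line\nsecond line\n' >"$tmpdir/testfile"

# Ask for the file to be opened for reading on file descriptor 3.
exec 3<"$tmpdir/testfile"

# From here on the file is read through its descriptor, not its name.
# Each read carries on from where the previous one stopped.
read -r line1 <&3
read -r line2 <&3

# Close descriptor 3 so the number is free to be used again.
exec 3<&-

echo "$line1"
echo "$line2"
rm -rf "$tmpdir"
```

Note that once the file is open, the name testfile is never mentioned again; everything happens through the number.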

When a user first logs in and the shell starts up, one of the first things the shell does is open three files. These special files are called standard input or stdin (which is actually attached to your keyboard), standard output or stdout (attached to your screen), and standard error or stderr (also attached to your screen). Because they are always opened in this order, stdin gets file descriptor 0, stdout gets 1, and stderr gets 2. This is the same with all shells, so all UNIX commands that expect to read from the keyboard have, in fact, been written to read from file descriptor 0. Likewise, if they write output to the screen they use file descriptor 1 or 2, depending on whether it is normal output or an error message.

In general, each UNIX command is written to perform one specific task. In doing so, it will generally read some data from somewhere, process this data in some way, and output the result to somewhere. The place where the data comes from is usually a file, as is the place the data is sent.

A simple example of this is the cat command, which is commonly used to display the contents of a file:

 $ cat testfile
 This is the first line of the test file
 This is the second line of the test file
 This is the third line of the test file
 $

Here the command reads its input from the file testfile, does absolutely nothing to change it, and then writes the unchanged data to a file. It just happens that the file it writes the output to is the one with file descriptor 1 (your screen).

Because our screen and keyboard can be treated like files, it is a fairly simple task for the shell to enable us to redirect output that was intended to go to our screen into a file and in the same way redirect input that would normally come from our keyboard to come from a file (and vice versa). The general rule is that if a command reads its input from the keyboard, you can also make it read that input from a file. Likewise, if it sends output to the screen you can make this output go to a file instead. Like wildcards, this is something you may think is handled by the commands themselves, but it is actually handled by the shell. It is achieved by manipulating the file descriptors mentioned above.

Output Redirection

UNIX commands that send their output to the screen have all been written to write this output to file descriptor 1 (stdout); as we saw above, when the shell starts up, file descriptor 1 is always attached to your screen so everything works as expected. If we would like the output of a command to appear in a file instead, all we need to do is assign file descriptor 1 to that file and the output from the command will appear there instead of the screen. Because Solaris always gives a file the lowest available file descriptor when it is opened, all we need to do is close the screen (which will free up file descriptor 1) and then open any other file we choose. That file will be guaranteed to get file descriptor 1; as the command writes its output to file descriptor 1, it will end up in that file. We saw earlier that the lowest file descriptor is 0, but that cannot be assigned when we redirect the standard output because it is already assigned to the standard input.

To tell the shell that we want to redirect the output from a command, we use the file descriptor we wish to redirect followed by the "greater than" symbol (>).

 $ who 1>fred
 $ cat fred
 sysadmin   pts/0       Feb 18 14:12    (helium)
 jgreen     pts/1       Feb 18 14:15    (carbon)
 $

In the above example, we ran the who command and redirected the output to a file called fred. The result is that we don't see any output at all on the screen, and when we look inside fred we see output that the who command would normally have displayed on the screen.

The way this works is that the shell sees that we want to redirect standard output (because it sees the "1>" characters), so before executing the who command it closes the file that had file descriptor 1 and then opens the file fred. Because 1 is now the lowest available file descriptor, it is assigned to fred. When the command finishes running, fred is closed and file descriptor 1 is once more associated with the screen.

If the file fred did not exist when the above command was typed, it would be created. If it already existed, then it would be emptied before the command ran. If you wish a command's output to be appended to a file, then you should use ">>" rather than ">":
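A short sketch of the three cases just described (create, empty, append) might look like the following. The scratch directory is an assumption made so the example does not disturb any real file called fred.

```shell
# Hypothetical scratch directory for the illustration.
tmpdir=$(mktemp -d)

echo "first"  1>"$tmpdir/fred"    # fred does not exist yet, so it is created
echo "second" 1>"$tmpdir/fred"    # fred exists, so it is emptied before the write
echo "third"  1>>"$tmpdir/fred"   # >> appends instead of emptying

first_line=$(sed -n 1p "$tmpdir/fred")
line_count=$(wc -l <"$tmpdir/fred" | tr -d ' ')

cat "$tmpdir/fred"
rm -rf "$tmpdir"
```

The word "first" is lost because the second command emptied the file, while "third" is added after "second".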

 $ date 1>>fred
 $ cat fred
 sysadmin   pts/0       Feb 18 14:12    (helium)
 jgreen     pts/1       Feb 18 14:15    (carbon)
 Sunday February 18 17:11:47 GMT 2001
 $

If any of the above commands had produced an error message it would not have appeared in the file we were redirecting to, but would have gone to the screen as usual. This is because all UNIX commands write their error messages to file descriptor 2, not file descriptor 1. (Remember, both these are normally attached to your screen, so you would normally see both standard output and error output on your screen.)
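To see the two streams kept apart, here is a minimal sketch. The function report is hypothetical; it stands in for any command that produces both normal output and an error message. It uses the "1>&2" notation to write to file descriptor 2.

```shell
tmpdir=$(mktemp -d)

# report is a made-up stand-in for a real command.
report() {
    echo "normal output"         # written to file descriptor 1
    echo "error message" 1>&2    # written to file descriptor 2
}

# Send each descriptor to its own file so we can check where each line went.
report 1>"$tmpdir/out" 2>"$tmpdir/err"

out=$(cat "$tmpdir/out")
err=$(cat "$tmpdir/err")
echo "file descriptor 1 received: $out"
echo "file descriptor 2 received: $err"
rm -rf "$tmpdir"
```

Had we redirected only descriptor 1, the error message would have appeared on the screen and the file would still contain just the normal output.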

If you want to redirect the error messages to a file, I'm sure you've already worked out that you follow the same procedure as above but replace the "1" with a "2":

 $ find / -name passwd -print 2>find.errors
 /usr/bin/passwd
 /var/adm/passwd
 /etc/default/passwd
 /etc/passwd
 $

The find command has found four files named passwd and no error messages have gone to the screen. To see if any have gone to our file we can simply view its contents with cat:

 $ cat find.errors
 find: cannot read dir /lost+found: Permission denied
 find: cannot read dir /usr/lost+found: Permission denied
 find: cannot read dir /var/lost+found: Permission denied
 find: cannot read dir /var/spool/lp/tmp: Permission denied
 find: cannot read dir /var/spool/mqueue: Permission denied
 find: cannot read dir /var/crash/skansen: Permission denied
 find: cannot read dir /var/dt/sdtlogin: Permission denied
 find: cannot read dir /opt/lost+found: Permission denied
 <lines removed for clarity>
 $

The error file contains many lines telling us all the places we do not have permission to look in. If we hadn't redirected these messages, then it would have been very hard to see the output we wanted since it would have been mixed up with all the errors. In fact, whenever a non-root user performs a find from the root directory, there will always be several pages of error messages. We now know how to stop them from appearing on our screen, but we don't really want them sitting in files on the hard disk. Consequently, it is common practice to redirect unwanted output to a special file called /dev/null:

 $ find / -name passwd -print 2>/dev/null
 /usr/bin/passwd
 /var/adm/passwd
 /etc/default/passwd
 /etc/passwd
 $

This file is special, as it always remains empty regardless of how much data is written to it.
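You can convince yourself of this with a quick check. The sketch below writes to /dev/null and then counts how many bytes can be read back from it (the "<" used here is input redirection).

```shell
# Anything written to /dev/null is simply discarded.
echo "this text is thrown away" >/dev/null

# Count the bytes that can be read back; tr strips any padding wc adds.
bytes=$(wc -c </dev/null | tr -d ' ')
echo "$bytes"
```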

If you do want to keep the error messages and you wish them to be redirected to the same file as the standard output, then the common method of doing this is as follows:

 $ find / -name passwd -print 1>find.output 2>&1
 $

In terms of redirection, the ampersand can be thought of as meaning "the same as." So, in the above we are saying "send file descriptor 1 output to the file find.output and send the file descriptor 2 output to the same file as file descriptor 1."
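As a small sketch of "the same as" in action, the hypothetical function report below writes one line to each stream, and both lines end up in a single file. The function and the scratch directory are assumptions made for the illustration.

```shell
tmpdir=$(mktemp -d)

# report is a made-up stand-in for a command that writes to both streams.
report() {
    echo "normal output"
    echo "error message" 1>&2
}

# Descriptor 1 is pointed at the file, then descriptor 2 is made
# "the same as" descriptor 1, so both lines land in the file.
report 1>"$tmpdir/both" 2>&1

both_lines=$(wc -l <"$tmpdir/both" | tr -d ' ')
cat "$tmpdir/both"
rm -rf "$tmpdir"
```

The order matters because the shell works through the redirections from left to right. If "2>&1" came first, descriptor 2 would be made the same as descriptor 1 while that was still the screen, and only the normal output would reach the file.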

If you have used redirection before, you may be slightly puzzled since you are unlikely to have ever used 1>. This is because the default output stream to redirect is standard output, so if you don't specify the number it defaults to 1. Thus, all the above examples would be valid with all the 1s removed; this is how it normally is in the real world (why type something you don't need to?).
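A quick way to check that the two spellings really are equivalent is to write the same text both ways and compare the results. The file names here are made up for the illustration.

```shell
tmpdir=$(mktemp -d)

echo "same text" 1>"$tmpdir/with-number"
echo "same text"  >"$tmpdir/without-number"

# cmp -s is silent and succeeds only if the two files are identical.
if cmp -s "$tmpdir/with-number" "$tmpdir/without-number"; then
    same=yes
else
    same=no
fi
echo "$same"
rm -rf "$tmpdir"
```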

Input Redirection

Input redirection is often a harder concept to grasp than output redirection, but it is basically the same principle.

We have seen that any UNIX command that sends its output to the screen can have it redirected to a file, but it is also true that any command that reads its input from the keyboard can be told to read that input from a file instead. A command that is suitable for input redirection is the mail command:

 $ mail jgreen <memo.text

This will send a mail message to the user jgreen, but instead of mail expecting you to type the mail message at the keyboard it reads the full contents of the file memo.text instead.

The command could have been written mail jgreen 0<memo.text to show that it is file descriptor 0 that is being redirected, but this is the default so it is not required. The procedure is much the same as with output redirection. The shell closes the standard input file, thus freeing up file descriptor 0, and then opens the file following the "less than" symbol (<). This file is then given the lowest free file descriptor (0), and since the mail command reads its input from file descriptor 0, it is none the wiser that anything different has happened. Commands such as who, date, and ls do not read their input from standard input, so they are not suitable for input redirection. The following is a partial list of commands that do read standard input. In some cases, the command will read from a file if one is supplied as an argument and from standard input if not (e.g., cat):

mail
cat
more
pg
cpio
lp
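Taking cat from the list above, the following sketch shows it producing the same result whether the file is named as an argument or attached to standard input by the shell. The scratch directory and the contents of memo.text are invented for the example.

```shell
tmpdir=$(mktemp -d)
printf 'line one\nline two\n' >"$tmpdir/memo.text"

# cat opens the file itself because it is given as an argument.
from_arg=$(cat "$tmpdir/memo.text")

# No argument: cat reads file descriptor 0, which the shell has
# attached to the file before cat starts.
from_stdin=$(cat <"$tmpdir/memo.text")

# The same again with the descriptor number written out.
from_fd0=$(cat 0<"$tmpdir/memo.text")

echo "$from_stdin"
rm -rf "$tmpdir"
```

In the second and third cases cat never sees the file name at all; it just reads whatever arrives on descriptor 0.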

Pipelines

We have seen that you can send the output of a command to a file and you can also make a command read its input from a file. Sometimes the file that one command writes is the very file you want the next command to read:

 $ find / -name "a*" -print >a-files 2>/dev/null
 $

The output file is likely to be very large so we may choose to view it with the pg command:

 $ pg a-files
 <output not shown for clarity>
 $

We could actually achieve the same result in a single line and cut out the middle man:

 $ find / -name "a*" -print 2>/dev/null | pg
 <output not shown for clarity>
 $

Here we have used a pipe symbol (|) to show that we want the standard output from one command to become the standard input to the next command following the pipe. We still don't want to see the error messages so we still redirect standard error (stderr) to /dev/null.

This is again handled by the shell so the commands themselves do not need to know what a pipe is. The command on the left of the pipe sends its output into one end of the pipe and the command to the right of the pipe reads the data going down the pipe as its own input.

The rules for using pipes are that any command on the left of a pipe must be one that writes its output to standard output, and any command to the right of a pipe must read its input from standard input. As long as you follow this rule, you can have as many commands connected this way as you like.
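The sketch below compares the two approaches on a small made-up directory rather than the whole system, and then chains three commands together. The directory, the file names, and the use of wc, sort, and head in place of pg are all assumptions chosen so the example can run unattended.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/data"
touch "$tmpdir/data/apple" "$tmpdir/data/avocado" "$tmpdir/data/banana"

# Two steps: write the list to a file, then read the file back.
find "$tmpdir/data" -name "a*" -print >"$tmpdir/list" 2>/dev/null
via_file=$(wc -l <"$tmpdir/list" | tr -d ' ')

# One step: the pipe joins find's standard output to wc's standard input.
via_pipe=$(find "$tmpdir/data" -name "a*" -print 2>/dev/null | wc -l | tr -d ' ')

# Pipes can be chained: find writes, sort reads and writes, head reads.
first=$(find "$tmpdir/data" -name "a*" -print 2>/dev/null | sort | head -n 1)

echo "$via_file $via_pipe"
echo "$first"
rm -rf "$tmpdir"
```

Both counts agree, which is the point: the pipe does the job of the intermediate file without leaving anything behind on disk.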

And the Rest

We have seen how the shell enables redirection. Now we'll see a few more things that the shell does with your command lines.

Each time you hit <return> while typing into the shell, the shell will scan the line you typed several times before it actually goes ahead and executes what you typed. In fact, the shell is reading data from the file with descriptor 0, processing it, and producing output to the screen.

The scans include the following checks and actions:

First the shell will check if the first word is contained on its list of aliases. If so, it is replaced with the value it is aliased to.
Next the shell will look for any unquoted words that begin with a tilde (~). If any are found, the text following the "~" is expected to be a user name, and the tilde and name together will be replaced by the full path of that user's home directory.
The shell will now look to see if the string "$(" exists. If it does, and there is a matching ")", then a subshell is spawned and the contents of the $() construct are executed within it. The output of this command is then substituted into the line in place of the entire $() construct.
After dealing with $() the shell looks to see if you have used $(()) in your line. If it is found, the contents are expected to be an arithmetic expression. The expression will be evaluated and the result put in place of the $(()) construct.
Then it looks for an equal sign within the first word. If it finds one, a variable is created and the word is removed.
Now it looks for words that begin with a dollar sign. These are assumed to be variables and are replaced by the value assigned to that variable (or null if the variable does not exist).
At this stage the shell will look for any wildcards on the line (*?[]). If any are found, they will be replaced by the files they match.
Next the shell looks for redirection symbols (<, >, and 2>). For each one found, the appropriate file is closed (stdin, stdout, or stderr) and the filename to the right of the symbol is opened, thus producing the input or receiving the output.
One of the final steps the shell performs is checking for completeness. If you have failed to close any quotes, or have started a built-in shell construct (such as while) without finishing it, the shell will assume you want to carry on and will display the secondary prompt ($PS2) and expect you to type the rest of your command. This will, of course, also be parsed in the same way from the first step above. When the shell feels the command is complete it will be executed.
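Several of the substitutions in the list above can be watched at work by letting the shell rewrite a few lines and then looking at the result. This is only a sketch: the variable names and the two .txt files are invented for the illustration, and aliases and tilde expansion are left out.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/notes.txt" "$tmpdir/todo.txt"

# The first word contains an equal sign, so a variable is created.
count=3

# $( ) is replaced by the output of the command run in a subshell.
sub=$(echo "from a subshell")

# $(( )) is replaced by the result of the arithmetic expression.
sum=$((count + 4))

# A word beginning with a dollar sign is replaced by the variable's value.
message="count is $count"

# The wildcard is replaced by the file names it matches.
matches=$(cd "$tmpdir" && echo *.txt)

echo "$sub"
echo "$sum"
echo "$message"
echo "$matches"
rm -rf "$tmpdir"
```

In every case the echo command receives only the finished text; it has no idea that a variable, an expression, or a wildcard was ever involved.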

During the above process any special characters preceded by a backslash (\) will be left untouched, as will any within single quotes (' '). If the shell finds double quotes it will ignore some special characters (such as wildcards) but will still process shell variables.
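The difference between the three forms of quoting can be sketched as follows. The variable name is made up for the example.

```shell
name=world

# Single quotes: everything inside is left untouched.
single='hello $name'

# Double quotes: the variable is still processed.
double="hello $name"

# Backslash: the character that follows is left untouched.
escaped=hello\ \$name

# Double quotes: the wildcard is not replaced by file names.
star=$(echo "*")

echo "$single"
echo "$double"
echo "$escaped"
echo "$star"
```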

Once the shell has completed messing around with your command line, it will often bear little resemblance to what you actually typed in. This way of operating does, however, mean that when you write a UNIX command (or shell script) you do not need to worry about how redirection or wildcards work as the shell handles it all for you. The shell will now execute your command and, when complete, respond with the shell prompt ($PS1) ready for you to type in the next command.

The above list shows the main steps taken when the Korn Shell parses your commands. If you want to know the complete list of steps, this is documented in great detail in the ksh man page, which is a light read at only 62 pages long!

If you run a Korn Shell with the "-x" option, the shell will display each line you have typed (or each line of a shell script) after the shell has finished parsing it. This is very useful for debugging shell scripts.
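Here is a minimal sketch of the option in use. It runs sh rather than ksh, on the assumption that ksh may not be installed everywhere; the "-x" option behaves in the same way in both. The trace is written to standard error, so the example captures that stream and discards the normal output.

```shell
# Inside $( ), "2>&1" points descriptor 2 at the capture first, and
# ">/dev/null" then discards descriptor 1, leaving only the trace.
trace=$(sh -x -c 'name=world; echo "hello $name"' 2>&1 >/dev/null)

# Each traced line is prefixed with "+ " by default and shows the
# command after variable substitution has been carried out.
echo "$trace"
```

The exact way the traced words are quoted varies a little from shell to shell, but in each case you see the value of the variable rather than its name.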

