So far, this chapter has explored techniques that facilitate working with Bash. This section explores the syntax of the command line, including details of what makes Bash such an expressive interface to the operating system. This section also contains several building block concepts that you will use when you program using the shell in the last section.
Often, you want to deal with not one but several files at a time; for example, perhaps you want to copy all the files in a directory to another directory, or you want to copy just a group of files with a certain extension into another directory. To do this, you need a way to address files (and directories) as a group, based on certain characteristics of their names. This is where file globbing comes in. File globbing is a loose term that describes the technique of grouping files together using simple wildcard or metacharacter expressions.
Three metacharacters are available:
The asterisk (*) matches zero or more characters of any kind.
The question mark (?) matches exactly one character.
The square brackets ([ ]) match any one of the characters between them. The characters to match may also be specified as a hyphen-separated range.
You can also use the exclamation mark (!) as the first character inside square brackets. It negates the bracket expression, so the pattern matches any single character that is not listed between the brackets.
Perhaps the easiest way to understand the type of things that file globbing allows you to do is to look at an example. So, in this example you create a number of files and then use different globbing expressions to select different subsets of the files for listing.
Create a temporary folder called numberfiles, and then set it to be the current working directory:
$ mkdir /home/<username>/numberfiles
$ cd /home/<username>/numberfiles
Now create ten files, named after the Italian words for the numbers 1 to 10. Use the touch command to do this:
$ touch uno due tre quattro cinque sei sette otto nove dieci
Use the ls command just to list them all (by default, it lists them by name in alphabetical order):
$ ls
cinque dieci due nove otto quattro sei sette tre uno
Now let's group them in a number of different ways, using the metacharacters. First, list all the files that start with the letter s:
$ ls s*
sei sette
Next, use the ? metacharacter to select all files with three-character filenames:
$ ls ???
due sei tre uno
Next, select all the files that have vowels starting their filename:
$ ls [aeiou]*
otto uno
Next, select all the files that have any character in the range a to f beginning their filename:
$ ls [a-f]*
cinque dieci due
Finally, select all the files that do not have a vowel as the start of their filename. The exclamation mark must be the first character inside the square brackets.
$ ls [!aeiou]*
cinque dieci due nove quattro sei sette tre
How it works
We've used the ls command here to demonstrate file globbing because the output from ls shows the effects of the globbing very clearly. However, you should note that you can use file globbing with any command that expects filename or directory name arguments. Let's look at each of the globbing expressions here.
We used the expression s* to match all files that begin with the letter s :
$ ls s*
This expression matches the filenames sei and sette, and would even match a file called s if there were one, because the * matches any string of any length (including the zero-length string).
To match filenames with exactly three characters, you use a ? to represent each character:
$ ls ???
We used the expression [aeiou]* to pick up all filenames starting with a vowel. The * works in the same way as in the s* example, matching any string of any length, so files matching this expression begin with one of the characters a, e, i, o, or u, followed by any other sequence of characters:
$ ls [aeiou]*
A similar approach applies for the expression [a-f]*, except that you use a hyphen (-) within the square brackets to express any one of the characters in a range:
$ ls [a-f]*
Using a range implies that the characters have an assumed order. In the traditional C (ASCII) ordering, digits (0-9) precede uppercase letters (A-Z), which precede lowercase letters (a-z). Hence the expression [0-z]* would match all filenames that start with either a digit or a letter; in ASCII order it would also match the few punctuation characters that lie between those groups, such as : and _. Other locales may order characters differently.
Finally, you use the exclamation mark (!) within the square brackets to negate the result of the vowel-matching expression, thereby arriving at all filenames that start with any character other than a lowercase vowel (here, a consonant):
$ ls [!aeiou]*
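The globbing expressions above can be collected into one small script. This is only a sketch: it works in a scratch directory created with mktemp (an assumption, so that no real files are touched), pins the locale to C so that the [a-f] range behaves as described, and uses echo rather than ls to show exactly what each pattern expands to. The backup directory name is hypothetical.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Use the traditional C ordering so ranges such as [a-f] are predictable
# (an assumption of this sketch; other locales may order characters differently).
LC_ALL=C
export LC_ALL

touch uno due tre quattro cinque sei sette otto nove dieci

# The shell expands each pattern before the command runs, so echo prints
# exactly the argument list that ls would have received.
starts_with_s=$(echo s*)
three_chars=$(echo ???)
vowel_first=$(echo [aeiou]*)
range_a_to_f=$(echo [a-f]*)
consonant_first=$(echo [!aeiou]*)

echo "s*        -> $starts_with_s"
echo "???       -> $three_chars"
echo "[aeiou]*  -> $vowel_first"
echo "[a-f]*    -> $range_a_to_f"
echo "[!aeiou]* -> $consonant_first"

# Globbing works with any command that takes filenames, for example cp.
mkdir backup
cp s* backup/
copied=$(cd backup && echo *)
echo "copied    -> $copied"
```

Because the expansion is done by the shell and not by the command, the same patterns behave identically whether they are handed to ls, cp, rm, or any other program.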
Over the course of the chapter, you've seen that the shell uses several special characters. Sometimes, you need to use these special characters literally. In this situation, you use quotation characters to protect these special characters from being interpreted by the shell or the shell script.
You often use single quote ( ' ) characters to protect a string:
$ touch 'foo*bar'
This creates a file called foo*bar on the disk. Without the single quotes, the * character would have been interpreted as a wildcard metacharacter.
You use double quote characters when referencing variables. Within double quotes, most characters, including ', are interpreted literally. The main exception is the dollar sign ($), which is used to refer to the value of a variable; the backquote also keeps its special meaning, and a backslash is special only when it precedes $, `, ", \, or a newline:
$ foo="foo/'"
$ bar="'\bar"
$ echo "$foo$bar"
foo/''\bar
The echo command prints its arguments to the standard output. The echo command is discussed again in some detail in the System-Defined Variables and User-Defined Variables section of this chapter.
The double quotes protected the single quotes and the slashes (both forward and backslashes) when the strings were assigned to variables foo and bar . As expected, when $foo and $bar are enclosed in double quotes in the last command, the $ is interpreted, and the two variables expanded to their values.
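The difference between the two kinds of quotes is easiest to see side by side. The following is a minimal sketch; the variable names and the greeting text are made up for illustration, and the file is created in a scratch directory from mktemp.

```shell
name=world

single='Hello, $name'     # single quotes: $name stays literal
double="Hello, $name"     # double quotes: $name is replaced by its value
escaped="Hello, \$name"   # a backslash protects the $ inside double quotes

printf '%s\n' "$single" "$double" "$escaped"

# Quoting also keeps glob characters literal: this creates a single file
# whose name really is foo*bar.
workdir=$(mktemp -d)
touch "$workdir/foo*bar"
created=$(ls "$workdir")
echo "created: $created"
```

Double quotes work just as well as single quotes for protecting the * here, because only $, the backquote, and the backslash keep a special meaning inside them.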
The backquote ( ` ) is used to execute commands. The backquote is convenient when the output of a certain command needs to be assigned to a variable:
$ datevar=`date`
$ echo $datevar
You end up with the following:
Tue Jan 14 23:03:43 2003
In the first command, the datevar variable is assigned the output of the date command because date is executed within backquotes. The echo command then prints the value of the datevar variable.
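Bash also accepts the $( ) form of command substitution, which does the same job as backquotes but is easier to read and to nest. A short sketch follows; the path /home/user/numberfiles is purely illustrative and does not need to exist, since dirname and basename only manipulate the string.

```shell
# Backquote form, as used in the text.
year_old=`date +%Y`

# $( ) form: same result.
year_new=$(date +%Y)

# Nesting: the inner substitution runs first, and its output becomes an
# argument to the outer command.
parent_name=$(basename "$(dirname /home/user/numberfiles)")

echo "$year_old $year_new $parent_name"
```

Nesting with backquotes is possible too, but the inner backquotes have to be escaped with backslashes, which quickly becomes hard to follow.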
Aliases are your first step toward customizing Bash. In its simplest form, an alias functions as an abbreviation for a commonly used command. In more complex cases, aliases can define completely new functionality. An alias is defined using the notation alias <alias_name>=<alias_value>. When you need it, you invoke it using <alias_name>, and the shell substitutes <alias_value> for <alias_name>.
In fact, the standard Fedora Core 2 shell already has several aliases defined. You can list the existing aliases using the alias command:
$ alias
alias l.='ls -d .* --color=tty'
alias ll='ls -l --color=tty'
alias ls='ls --color=tty'
alias vi='vim'
alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'
Some of the common aliases include aliases for the ls command, to include your favorite options. If you use the ls command without any options, it simply prints the list of files and subdirectories under the current working directory. However, in this case, the ls command is aliased to itself, with the --color option, which allows ls to indicate different file types with different colors.
Aliases may be defined for the lifetime of a shell by specifying the alias mapping at the command line or in a startup file (discussed in the System-Defined Variables and User-Defined Variables section) so that the aliases are available every time the shell starts up.
Just as it is possible to set an alias as a synonym to a command with options and arguments, it is possible to disassociate the synonym. This is achieved using the unalias command:
$ alias
alias vi='vim'
alias l.='ls -d .* --color=tty'
alias ll='ls -l --color=tty'
alias ls='ls --color=tty'
$ unalias vi
$ alias
alias l.='ls -d .* --color=tty'
alias ll='ls -l --color=tty'
alias ls='ls --color=tty'
Notice that on executing the unalias command, vi is no longer associated with vim.
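The define, inspect, and remove cycle can be sketched in a few lines. The alias name ll and its value are chosen only for illustration. Note that the sketch only defines and queries the alias; it does not invoke it, because Bash expands aliases in interactive shells but not, by default, in scripts.

```shell
alias ll='ls -l'          # define (or redefine) an alias

# Querying a single name prints its definition and succeeds if it exists.
before=$(alias ll)

unalias ll                # remove the alias again

# After unalias the lookup fails, so the else branch runs.
if alias ll >/dev/null 2>&1; then
    after="still defined"
else
    after="not defined"
fi

echo "before: $before"
echo "after:  $after"
```

At an interactive prompt, typing ll between the alias and unalias commands would run ls -l.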
Like aliases, environment variables are name-value pairs that are defined either on the shell prompt or in startup files. A process may also set its own environment variables programmatically (that is, from within the program, rather than declared in a file or as arguments).
Environment variables are most often used either by the shell or by other programs to communicate settings. Some programs communicate information through environment variables to programs that they spawn. There are several environment variables set for you in advance. To list all of them that are currently set, you can use the env command, which should display an output similar to the following:
$ env
HOSTNAME=localhost.localdomain
SHELL=/bin/bash
TERM=xterm
HISTSIZE=1000
USER=deepakt
MAIL=/var/spool/mail/deepakt
PATH=/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/deepakt/bin
...
As you can see, the PATH variable is one of the environment variables listed here. As described earlier in this chapter, Bash uses the value of the PATH variable to search for commands. The MAIL variable, also listed here, is used by mail-reading software to determine the location of a user's mailbox.
You may set your own environment variables or modify existing ones:
$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/deepakt/bin
$ export MYHOME=/home/deepakt
$ export PATH=$PATH:$MYHOME/mybin
$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/deepakt/bin:/home/deepakt/mybin
Whereas user-defined variables (also known as local variables) can be set as MYHOME=/home/deepakt , these variables will not be available to any of the commands spawned by this shell. For local variables to be available to child processes spawned by a process (the shell in this case), you need to use the export command. However, to achieve persistence for variables even after you log out and log back in, you must save these settings in startup files.
Environment variables are defined either interactively or in a startup file such as .bashrc . These variables are automatically made available to a new shell. Examples of environment variables are PATH , PRINTER , and DISPLAY .
However, local variables do not get automatically propagated to a new shell when it is created. The MYHOME variable is an example of a local variable.
The echo command, followed by the name of a variable prefixed with a dollar ( $ ) symbol, prints the value of the environment variable.
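The difference between a local variable and an exported one shows up as soon as a child process is started. The sketch below uses sh -c as a stand-in for any spawned command; GREETING is a hypothetical variable introduced only for the last step.

```shell
# Start from a clean slate in case these names are already set.
unset MYHOME GREETING

MYHOME=/home/deepakt      # local (shell) variable: not exported

# A child shell does not see it, so the fallback text "unset" is printed.
seen_before=$(sh -c 'echo "${MYHOME:-unset}"')

export MYHOME             # mark the variable for export to child processes
seen_after=$(sh -c 'echo "${MYHOME:-unset}"')

# A variable can also be placed in the environment of one command only.
one_off=$(GREETING=ciao sh -c 'echo "$GREETING"')

echo "before export: $seen_before"
echo "after export:  $seen_after"
echo "one-off:       $one_off"
```

The one-off form leaves the current shell untouched: GREETING exists only in the environment of that single sh invocation.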
Earlier in the chapter, you saw how you can use filesystem commands to manipulate files and directories and process management commands such as ps to manage processes. The shell provides you with a powerful set of operators that allow you to manage input, output, and errors while working with files and processes.
If a process needs to perform any I/O operation, it has to happen through an abstraction known as an I/O stream . The process has three streams associated with it: standard input, standard output, and standard error. The process may read input from its standard input, write its output to standard output, and write error messages to its standard error stream.
By default, the standard input is associated with the keyboard; output and error are associated with the terminal, in our case, mostly an xterm. Sometimes, you may not want processes to write to or read from a terminal; you may want the process to write to another location, such as a file. In this case, you need to associate the process's standard output (and possibly the standard error) with the file in question. The process is oblivious to this, and continues to read from the standard input and write to the standard output, which in this case happens to be the files you specify. The I/O redirection operators of the shell make this redirection of the streams from the terminal to files extremely simple.
The < operator allows programs that read from the standard input to read input from a file. For example, consider the wc (word count) program, which reads input from the keyboard (until a Ctrl+D is encountered) and then prints the number of lines, words, and characters that were input:
$ wc -l
12345
67890
12345
^D
3
Note that the -l option is used here, which has wc print the number of lines only.
Now consider a case in which you have the input to wc available in a file, called 3linefile.txt . In this case, the following command will produce the same result:
$ wc -l < 3linefile.txt
3
Here, the standard input is redirected from the keyboard to the file.
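One visible consequence of redirecting standard input is that wc never learns the name of the file. The sketch below builds 3linefile.txt in a scratch directory (using printf and an output redirection purely as setup) and compares the two ways of feeding it to wc.

```shell
workdir=$(mktemp -d)
printf '12345\n67890\n12345\n' > "$workdir/3linefile.txt"

# Given a filename argument, wc opens the file itself and reports the name.
with_argument=$(wc -l "$workdir/3linefile.txt")

# With <, the shell opens the file and wc simply reads its standard input,
# so no filename appears in the output.
with_redirect=$(wc -l < "$workdir/3linefile.txt")

echo "argument: $with_argument"
echo "redirect: $with_redirect"
```

The redirected form is handy in scripts whenever only the number is wanted.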
The > operator is similar to the < operator. Its purpose is to redirect the standard output from the terminal to a file. Consider the following example:
$ date > date.txt
The date command writes its output to the standard output, which is usually the terminal. Here, the > operator indicates to the shell that the output should instead be redirected to a file.
When you write the file out to the terminal (using the cat command), you can see the output of the date command displayed:
$ cat date.txt
Tue Jan 14 23:03:43 2003
Based on what you have learned so far, create a file with some contents in it:
$ cat > test.txt
The quick brown fox jumped over the rubber chicken
^D
$ cat test.txt
The quick brown fox jumped over the rubber chicken
Using cat to create a file in this way is similar to using the Microsoft DOS command COPY CON TEST.TXT.
How it works
The cat command, used without any options, is supposed to echo back to the standard output anything that it reads from the standard input. In this case, the > operator redirects the standard output of the cat command to the file test.txt . Whatever was typed on the keyboard (standard input) ends up in the file test.txt (standard output redirected by the shell).
The >> operator is essentially the same as the > operator. The only difference is that it does not overwrite an existing file; instead, it appends to it.
$ cat >> test.txt
Since rubber chicken makes bad nuggets
^D
$ cat test.txt
The quick brown fox jumped over the rubber chicken
Since rubber chicken makes bad nuggets
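The contrast between > and >> can be checked without typing at the keyboard. This is a small sketch; notes.txt is a hypothetical filename inside a scratch directory.

```shell
workdir=$(mktemp -d)
log="$workdir/notes.txt"

echo "first"  >  "$log"       # creates the file
echo "second" >> "$log"       # appends a second line
lines_after_append=$(wc -l < "$log")

echo "third"  >  "$log"       # > truncates: the earlier lines are gone
contents_after_overwrite=$(cat "$log")

echo "lines after append:       $lines_after_append"
echo "contents after overwrite: $contents_after_overwrite"
```

For log files and anything else that accumulates over time, >> is usually the safer choice.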
The | operator (the pipe) is used to feed the output of one command to the input of another command.
$ cat test.txt | wc -l
2
$ wc -l test.txt
2 test.txt
The output of the cat command, that is, the contents of the file test.txt, is fed by the shell to the wc command. It is the equivalent of running the wc -l command against the test.txt file, except that wc also prints the filename when it is given one as an argument. It is also possible to chain multiple commands this way (for example, command1 | command2 | command3).
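Longer chains follow the same pattern, with each command reading what the previous one wrote. The sketch below uses sort, uniq, and head, standard utilities that are not otherwise covered in this section, on a made-up word list.

```shell
# Count how many distinct words appear in a small sample.
distinct=$(printf 'uno\ndue\nuno\ntre\ndue\nuno\n' | sort | uniq | wc -l)

# Find the most frequent word: sort groups duplicates together, uniq -c
# counts each group, sort -rn puts the largest count first, and head keeps
# only that first line.
most_common=$(printf 'uno\ndue\nuno\ntre\ndue\nuno\n' |
    sort | uniq -c | sort -rn | head -n 1)

echo "distinct words: $distinct"
echo "most common:    $most_common"
```

Each stage does one small job, and no temporary files are needed to pass data from one stage to the next.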
Often, you may want a program to output both its standard output and error to one file. Typically, this is a requirement when a program has to run unattended, as is the case with programs automatically executed on a schedule, such as a backup program (see the section Scheduling Tasks later in this chapter to learn more about scheduling programs to run unattended). You would want to capture the output and error statements of the backup program in one file.
$ mybackup -r allnight >/tmp/mybackup.out 2>&1
Here, the standard output of the fictitious program mybackup is redirected to the file /tmp/mybackup.out by the > operator. The 2>&1 notation indicates that file descriptor 2 (the standard error) is duplicated from file descriptor 1 (the standard output), so both streams are written to the same file. The order matters: 2>&1 must come after the > redirection.
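The effect of the ordering can be demonstrated with a stand-in for the backup program. In this sketch, noisy is a hypothetical shell function that writes one line to each stream, and the output files live in a scratch directory.

```shell
workdir=$(mktemp -d)

# A stand-in for a real program: one line to stdout, one to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout is pointed at the file first, then stderr is duplicated from it,
# so both lines land in the file.
noisy > "$workdir/both.out" 2>&1
both_lines=$(wc -l < "$workdir/both.out")

# Reversed order: stderr is duplicated from the *current* stdout (here the
# command substitution), and only afterwards is stdout sent to the file.
leaked=$(noisy 2>&1 > "$workdir/stdout_only.out")
stdout_only=$(cat "$workdir/stdout_only.out")

echo "lines in both.out: $both_lines"
echo "leaked:            $leaked"
echo "stdout_only.out:   $stdout_only"
```

Redirections are processed from left to right, which is why the second form leaves the error message out of the file.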
The xargs command has no life of its own, but it acts as a mechanism to dynamically complete arguments for another command. Look at the following example, which uses the command sequences to find all files with the string needle in a hierarchy of directories.
$ ls -R | grep needle
Thimble and needle in hand, undaunted they
The space needle dominated the bleary landscape
Unfortunately, you don't know if both of these lines are from the same file. Even worse, you don't know which files they came from. The xargs command to the rescue:
$ ls -R | xargs grep needle
sewing101: Thimble and needle in hand, undaunted they
seattledaily: The space needle dominated the bleary landscape
Here the xargs command acts as an argument completion mechanism for the grep command. Each line of output from the ls command (that is, individual filenames) is concatenated by xargs into a set of arguments to the grep command. The grep command, on seeing multiple files to search for, prints the filename along with the occurrence of a matching string in a file.
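Because xargs splits its input on whitespace by default, filenames that contain spaces can be broken into several arguments. A commonly used variation, sketched below, pairs find with xargs and separates names with NUL bytes instead. This is a substitute technique and not the method shown above: the -print0 and -0 options are extensions provided by the GNU (and BSD) versions of find and xargs, and the directory layout and file contents are invented for illustration.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/books" "$workdir/news"
echo "Thimble and needle in hand" > "$workdir/books/sewing101"
echo "The space needle dominated" > "$workdir/news/seattledaily"
echo "nothing to see here"        > "$workdir/news/other file"

# find prints full paths separated by NUL bytes; xargs -0 splits on those
# bytes, so the space in "other file" does not break the argument list.
# grep -l lists only the names of the files that contain a match.
matches=$(find "$workdir" -type f -print0 | xargs -0 grep -l needle | sort)

echo "$matches"
```

Using find instead of ls -R also means that grep receives complete paths, so files in subdirectories are found no matter what the current directory is.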