Practical Extraction and Reporting Language (Perl) is one of the most versatile programming languages. It is used for a variety of reasons including the following: UNIX and Windows system administration scripts, Common Gateway Interface (CGI), database access, file processing, and many others.
Perl combines some of the best features of shell programming, sed, and awk. In addition, Perl scripts are usually faster than shell programs and more portable.
Perl runs on a variety of platforms including all UNIX variants, Linux (the platform on which the examples in this chapter were run), Macintosh, Windows, and many others. I have been running Perl on my Windows-based notebook computer for some time. I write the programs on my notebook and run them on UNIX systems. Perl is highly portable in this respect, which is one of its most desirable qualities. There is a wealth of material on the Internet related to all aspects of Perl. Two of the most useful Web sites are www.tpj.com (The Perl Journal) and www.perl.com (general Perl information and downloading of Perl).
Perl can be downloaded free of charge from a number of Web sites. You may want to buy many of the extensions of Perl that are available; however, Perl itself is free and is easily downloaded onto almost any platform.
Perl is an interpreted language, meaning that Perl programs are not compiled but rather interpreted, as are shell programs. Perl programs are, however, fully parsed by the interpreter before they are run. This means that syntax errors in the program will be reported before the program is run, unlike a shell program, which can fail with a syntax error at any point during execution.
We are only going to scratch the surface of Perl functionality in this chapter. An entire Perl community exists that includes many Web sites, books, training, and other material. I could never cover all of the vast functionality that is available with Perl in one chapter. This chapter will provide enough of the basics to get started with Perl by writing some simple Perl scripts and will give you an appreciation for it as a powerful programming language.
Before reading this Perl chapter, you may want to take a quick look at the regular expressions covered in Chapter 6. A regular expression usually defines the pattern for which you are searching, using wildcards. Since a regular expression defines a pattern you are searching for, the terms "regular expression" and "pattern matching" are often used interchangeably, although strictly speaking they are not the same thing: the regular expression is the pattern, and pattern matching is the act of searching with it. The following bullets help clarify regular expressions:
- Regular expressions are different from the file-matching patterns used by the shell. Regular expressions are used by both the shell and many other programs, including Perl. The file matching done by the shell and programs such as find is different from the regular expressions used in this chapter.
- Use single quotes around regular expressions. The meta-characters used in this chapter must be quoted so that the shell passes them unchanged to the program as an argument. You will, therefore, see most regular expressions in this chapter quoted.
When using many programs such as perl, sed, awk, vi, and grep you provide a regular expression that the program evaluates. The command will search for the pattern you supply. The pattern could be as simple as a string or it could be wildcards. The wildcards used by many programs are called meta-characters. The Perl that was included in my distribution of Linux has a man page called perlre that contains a thorough discussion of regular expressions. There were many other Perl-related man pages that came with my Perl. For instance, a "bag of tricks" for Perl manual page is perlbot, the frequently asked questions are in perlfaq, and so on.
There are many options associated with Perl that you'll want to review before you run a Perl script. For instance, you can run a one-line Perl program at the command line or write a longer program that you run. You can also check the syntax of the program without running it. The example below shows the options to Perl on a Linux system, as produced by running perl -h (the -h stands for help).

    # perl -h
    Usage: perl [switches] [--] [programfile] [arguments]
      -0[octal]       specify record separator (\0, if no argument)
      -a              autosplit mode with -n or -p (splits $_ into @F)
      -c              check syntax only (runs BEGIN and END blocks)
      -d[:debugger]   run scripts under debugger
      -D[number/list] set debugging flags (argument is a bit mask or flags)
      -e 'command'    one line of script. Several -e's allowed. Omit [programfile].
      -F/pattern/     split() pattern for autosplit (-a). The //'s are optional.
      -i[extension]   edit <> files in place (make backup if extension supplied)
      -Idirectory     specify @INC/#include directory (may be used more than once)
      -l[octal]       enable line ending processing, specifies line terminator
      -[mM][-]module.. executes 'use/no module...' before executing your script.
      -n              assume 'while (<>) { ... }' loop around your script
      -p              assume loop like -n but print line also like sed
      -P              run script through C preprocessor
    #
As you embark on writing Perl programs you will undoubtedly end up using some of these options. As your programs become more complex, you will probably use the -c option to check the syntax of the program.
In the next section we'll begin writing some simple Perl programs.
Perl is so versatile that it can be used for almost any programming purpose. One of the most common uses of Perl is to read the contents of a file and manipulate it in some way. We'll do just that in our upcoming example.
We'll begin our example by opening the /etc/passwd file on a system and printing its contents, line by line, to standard output (STDOUT). The following is a listing of our example script called test.pl:

    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    @line = <passwd_file>;
    print (@line);
This simple program contains many important concepts in Perl. The first line contains the path of Perl. This varies depending on the UNIX variant that you are using and the distribution of Perl which you are using. In most cases, however, you'll find Perl at /usr/bin/perl. The second line is a message printed to the screen when the program is run. There is one newline (\n) printed before the message is printed and two newlines printed after the message is printed. Next we open the file /etc/passwd and give it a handle of passwd_file. We also die if we are unable to open the file and print the message passwd file unavailable. The die function prints the message you specify and terminates the program, whereas the warn function, used in a program later in this chapter, prints the message and lets the program continue. Next, we use an array called @line to store all of the lines in /etc/passwd. Finally, we print all of the elements in the array @line to STDOUT. The @ preceding the variable name indicates that this is an array.
The result of running test.pl is to send the contents of /etc/passwd to STDOUT, which in this case is our terminal window, as shown in the following example:
    # ./test.pl
    Work with /etc/passwd file on Linux system
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:
    daemon:x:2:2:daemon:/sbin:
    adm:x:3:4:adm:/var/adm:
    lp:x:4:7:lp:/var/spool/lpd:
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:
    news:x:9:13:news:/var/spool/news:
    uucp:x:10:14:uucp:/var/spool/uucp:
    operator:x:11:0:operator:/root:
    games:x:12:100:games:/usr/games:
    gopher:x:13:30:gopher:/usr/lib/gopher-data:
    ftp:x:14:50:FTP User:/home/ftp:
    nobody:x:99:99:Nobody:/:
    xfs:x:100:102:X Font Server:/etc/X11/fs:/bin/false
    gdm:x:42:42::/home/gdm:/bin/bash
    postgres:x:101:233:PostgreSQL Server:/var/lib/pgsql:/bin/bash
    squid:x:102:234::/var/spool/squid:/dev/null
    martyp::500:500::/home/martyp:/bin/bash
    hp:x:501:501::/home/hp:/bin/bash
    test:x:502:502::/home/test:/bin/bash
    #

Notice that we have run this program from a directory that is not in our current path because the "./" was required preceding the name of test.pl. Alternatively, we could have added the current directory to our path. We'll use ./ when we run programs throughout this chapter.
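As an aside, many newer Perl programs open files with a three-argument open, a lexical filehandle, and an error message that includes $!. The following is only a minimal sketch of that alternative style, not the form used in the rest of this chapter. It reads a small sample file created with the File::Temp module so that it does not depend on /etc/passwd; the names $sample_path and $passwd_fh are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small sample file so the sketch does not depend on /etc/passwd.
my ($tmp_fh, $sample_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "root:x:0:0:root:/root:/bin/bash\n";
print {$tmp_fh} "bin:x:1:1:bin:/bin:\n";
close($tmp_fh) or die "could not close $sample_path: $!\n";

# Three-argument open with a lexical filehandle; $! holds the system
# error text if the open fails.
open(my $passwd_fh, '<', $sample_path)
    or die "cannot open $sample_path: $!\n";
my @line = <$passwd_fh>;    # list context reads every line at once
close($passwd_fh);

print @line;
```

The '<' mode is given separately from the file name, so a file name that happens to begin with > or | cannot change how the file is opened.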
We can also redirect the output of our program to a file rather than to STDOUT. The following listing shows our program test.pl modified to redirect STDOUT to the file passwd.out in our current directory:
    # cat test.pl
    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    open (STDOUT, ">passwd.out") ||
        die ("couldn't open passwd.out \n");
    @line = <passwd_file>;
    print (@line);
    # ./test.pl
    Work with /etc/passwd file on Linux system
    # cat passwd.out
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:
    daemon:x:2:2:daemon:/sbin:
    adm:x:3:4:adm:/var/adm:
    lp:x:4:7:lp:/var/spool/lpd:
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:
    news:x:9:13:news:/var/spool/news:
    uucp:x:10:14:uucp:/var/spool/uucp:
    operator:x:11:0:operator:/root:
    games:x:12:100:games:/usr/games:
    gopher:x:13:30:gopher:/usr/lib/gopher-data:
    ftp:x:14:50:FTP User:/home/ftp:
    nobody:x:99:99:Nobody:/:
    xfs:x:100:102:X Font Server:/etc/X11/fs:/bin/false
    gdm:x:42:42::/home/gdm:/bin/bash
    postgres:x:101:233:PostgreSQL Server:/var/lib/pgsql:/bin/bash
    squid:x:102:234::/var/spool/squid:/dev/null
    martyp::500:500::/home/martyp:/bin/bash
    hp:x:501:501::/home/hp:/bin/bash
    test:x:502:502::/home/test:/bin/bash
    #

The first part of the output shows the contents of our program test.pl. The next part of the listing shows running test.pl. Note that the message "Work with /etc/passwd file on Linux system" still goes to the terminal, because it is printed before STDOUT is reopened onto passwd.out. The final part of the listing shows the contents of the file passwd.out produced as a result of having run test.pl.
Early in this program, we test whether the file /etc/passwd can be opened. We attempt to open the file and, if we are unable to do so, the die is executed. You can perform many tests on files, such as checking to see if a file exists (-e), as shown in the following lines:

    if (-e "/etc/passwd"){
        print "/etc/passwd exists \n";
    } else{
        print "/etc/passwd does not exist \n";
    }
The -e used in association with /etc/passwd will evaluate as true if /etc/passwd exists and false if it does not exist. The appropriate print will then be executed (we'll get to the if and else in an upcoming section). There are many file test operators in addition to -e. Table 26-1 shows some of the most commonly used file test operators:
File Test Operator | Description - The following are true if the item being evaluated ... |
---|---|
-b | is a block device. |
-c | is a character device. |
-d | is a directory. |
-e | exists. |
-f | is an ordinary file. |
-g | has the setgid bit set. |
-k | has the sticky bit set. |
-l | is a symbolic link. |
-o | is owned by the effective UID. |
-p | is a named pipe (FIFO). |
-r | is readable by effective UID and GID. |
-s | contains any information, size is non-zero. |
-t | represents a terminal. |
-u | has the setuid bit set. |
-w | is writable by effective UID and GID. |
-x | is executable by effective UID and GID. |
-z | is empty. |
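-A | was last accessed N days before the script started (returns N rather than true or false). |
-B | is a binary file. |
-C | had its inode last changed N days before the script started (returns N rather than true or false). |
-M | was last modified N days before the script started (returns N rather than true or false). |
-O | is owned by the real UID. |
-R | is readable by the real UID and GID. |
-S | is a socket. |
-T | is a text file. |
-W | is writable by the real UID and GID. |
-X | is executable by the real UID and GID. |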
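Several of the file test operators can be combined to classify a path. The following is a minimal sketch under some assumptions: describe_path is a hypothetical helper written for this illustration, and the directory and file it examines are temporary ones created with the File::Temp module rather than real system files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Temporary directory and empty file to test against (removed on exit).
my $dir = tempdir(CLEANUP => 1);
my ($fh, $empty_file) = tempfile(DIR => $dir);
close($fh);

# describe_path is a hypothetical helper, not a Perl built-in.
sub describe_path {
    my ($path) = @_;
    return "does not exist" unless -e $path;        # -e: exists
    return "directory"      if -d $path;            # -d: is a directory
    return "empty file"     if -f $path && -z $path; # -f: ordinary file, -z: empty
    return "non-empty file" if -f $path && -s $path; # -s: size is non-zero
    return "something else";
}

print "$dir: ", describe_path($dir), "\n";
print "$empty_file: ", describe_path($empty_file), "\n";
print "$dir/no_such_file: ", describe_path("$dir/no_such_file"), "\n";
```

Testing -e first keeps the later tests from being applied to a path that is not there.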
There are also lines in our program that print a newline (\n). This is a special character combination known as an escape sequence. Table 26-2 shows many commonly used escape sequences in Perl (these may vary somewhat depending on your Perl distribution).
Escape Sequence | Description |
---|---|
\a | Alert bell or beep |
\b | Backspace |
\cx | Control x |
\e | Escape |
\f | Form feed |
\l | Convert following character to lowercase |
\L | Convert all following characters to lowercase |
\n | Newline |
\r | Carriage return |
\t | Tab |
\u | Convert following character to uppercase |
\U | Convert all following characters to uppercase |
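\v | Vertical tab (Perl does not treat this as an escape in double-quoted strings; use \013 or \x0B instead) |
\0nn or \xnn | Octal or hex value (e.g., \033 octal, \x1B hex) |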
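The escape sequences are only interpreted inside double-quoted strings. The short sketch below shows a few of them at work and contrasts double quotes with single quotes. The variable names are made up for the illustration, and \E, which does not appear in the table, ends the effect of \U or \L.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name   = "perl";
my $first  = "\u$name";            # uppercase the next character only
my $upper  = "\U$name\E rocks";    # uppercase everything up to \E
my $lower  = "\LSHOUT\E";          # lowercase everything up to \E
my $tabbed = "a\tb\n";             # double quotes: a, tab, b, newline
my $single = 'a\tb\n';             # single quotes: backslashes stay literal

# Escape, written three ways: by name, in octal, and in hex.
my $esc_same = ("\e" eq "\033" && "\033" eq "\x1B") ? "same" : "different";

print "$first | $upper | $lower\n";
print "double-quoted length: ", length($tabbed), "\n";
print "single-quoted length: ", length($single), "\n";
print "escape spellings are the $esc_same character\n";
```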
Now that we've successfully opened a file for reading and writing and covered file test operators and escape sequences, let's move on to variables. We have only touched on variables so far, when we assigned the contents of a file to the array @line.
Now that we can perform the fundamental task of reading and writing files, we'll start to manipulate data related to the files.
We've already done some work with an array when we opened the /etc/passwd file and assigned its contents to an array. Let's now take a step back and evaluate variables more carefully.
To begin with, let's take a closer look at the array to which we've assigned the contents of the /etc/passwd file. We'll assign a scalar variable called $length to contain the number of elements in the array. A scalar variable holds a single value. In this case, the single value of $length is the number of elements in the array. The following shows the contents of the modified test.pl program that includes the line $length = @line and a line to print the value of $length:

    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    open (STDOUT, ">passwd.out") ||
        die ("couldn't open passwd.out \n");
    @line = <passwd_file>;
    $length = @line;
    print "The length of the array containing the contents of /etc/passwd is $length \n";
    #print (@line);    # this line commented out.

Now we'll view the contents of passwd.out, which shows the output of test.pl. passwd.out contains the number of lines in the /etc/passwd file. We'll also run the wc command to see the number of lines that are present in the /etc/passwd file:

    # cat passwd.out
    The length of the array containing the contents of /etc/passwd is 23
    # cat /etc/passwd | wc
         23      27     810
    #
This output shows that the length of the array ($length) produced from test.pl has a value of 23. This corresponds to the number of lines in the /etc/passwd file, as we were able to confirm by running wc against /etc/passwd. The first of the three fields from the output of wc is the number of lines in the file. This confirms that our scalar variable $length does indeed contain the number of entries in /etc/passwd.
Scalar variables can be of many types; however, you never have to specify the type of a variable in Perl. Some examples of scalar variables are:

    $string = "output";
    $number = 50;
    $number = 50.77;
    $length = @line;    # this was used in our earlier example
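The assignment $length = @line works because an array evaluated where a scalar is expected yields its number of elements. The following is a small sketch of that behavior, using a three-element stand-in for the contents of /etc/passwd; the variable names other than @line and $length are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the lines read from /etc/passwd.
my @line = ("root:x:0:0\n", "bin:x:1:1\n", "daemon:x:2:2\n");

my $length     = @line;            # scalar assignment: number of elements
my ($first)    = @line;            # list assignment: first element instead
my $last_index = $#line;           # index of the last element
my $message    = "There are " . scalar(@line) . " entries";  # force scalar context

print "$message\n";
print "length is $length, last index is $last_index\n";
print "first entry is $first";
```

The parentheses around $first make the difference: without them the array is counted, with them its elements are copied.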
Since we have 23 entries in the /etc/passwd file and an array element corresponding to each entry, we can select a specific element in the array or multiple elements of the array. We'll add a statement to print $line[3], which is the fourth element in the array (the array starts at $line[0]; therefore, 3 is the fourth entry) and so holds the fourth line in the /etc/passwd file. The following shows the listing of test.pl and the output file passwd.out:

    # cat test.pl
    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    open (STDOUT, ">passwd.out") ||
        die ("couldn't open passwd.out \n");
    @line = <passwd_file>;
    $length = @line;
    print "The length of the array containing the contents of /etc/passwd is $length \n";
    print $line[3];
    #print (@line);    # this line commented out.
    # ./test.pl
    Work with /etc/passwd file on Linux system
    # cat passwd.out
    The length of the array containing the contents of /etc/passwd is 23
    adm:x:3:4:adm:/var/adm:
    #

When we view passwd.out, the statement with the length of the array appears and is followed by the fourth line in /etc/passwd, which is contained in $line[3].
Note the difference between naming an array and naming a scalar variable. Our array is @line and the scalar is $length. The @ indicates an array and the $ indicates a scalar. A single element of an array is itself a scalar, which is why it is written with a $, as in $line[3]. We could even give the scalar ($) and the array (@) the same name, but they would not be related. In the example below, the scalar and the array are both called line but are not related:

    $line = "scalar";
    @line = ("line","of","text");
    print $line;       # contents of scalar variable line
    print $line[2];    # third element of array line
    print @line;       # all elements of array line
We have both a scalar variable and an array with name line, but the two are unrelated.
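The sigil in front of the name tells Perl what is being asked for: $ for one value and @ for several. The following sketch, with variable names invented for the illustration, shows a single element, a slice of two elements, and the last element of the array alongside the unrelated scalar of the same name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "scalar";                    # the scalar named line
my @line = ("line", "of", "text");      # the unrelated array named line

my $third  = $line[2];       # one element of the array: a scalar, so $
my @pair   = @line[0, 1];    # a slice of two elements: a list, so @
my $last   = $line[-1];      # negative indexes count from the end
my $joined = "@line";        # an array in double quotes is joined with spaces

print "scalar: $line\n";
print "third element: $third\n";
print "slice: @pair\n";
print "last element: $last\n";
print "whole array: $joined\n";
```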
Let's now expand our program to loop through more than one entry in the array. We'll expand test.pl to loop through entries zero through five and will print these to our output file. The following shows the updated program to accomplish this:
    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    open (STDOUT, ">passwd.out") ||
        die ("couldn't open passwd.out \n");
    @line = <passwd_file>;
    $length = @line;
    print "The length of the array containing the contents of /etc/passwd is $length \n \n";
    $lineno = 0;                # set the line number to first array entry
    while ($lineno < 6){        # while loop to print first six entries (0-5)
        print "This is line number $lineno in /etc/passwd \n";
        print $line[$lineno];   # prints the line from /etc/passwd
        $lineno++;              # increment counter
    }
    #print (@line);    # this line commented out

We have used a while loop in the program to run the block that prints the /etc/passwd entry a total of six times. The contents of passwd.out are shown below:

    The length of the array containing the contents of /etc/passwd is 23
    This is line number 0 in /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    This is line number 1 in /etc/passwd
    bin:x:1:1:bin:/bin:
    This is line number 2 in /etc/passwd
    daemon:x:2:2:daemon:/sbin:
    This is line number 3 in /etc/passwd
    adm:x:3:4:adm:/var/adm:
    This is line number 4 in /etc/passwd
    lp:x:4:7:lp:/var/spool/lpd:
    This is line number 5 in /etc/passwd
    sync:x:5:0:sync:/sbin:/bin/sync

passwd.out shows line numbers 0-5 (which should really be line numbers 1-6) of /etc/passwd.
In addition to the while loop employed in our example, there are many other conditional statements you can use in Perl programs. The following are examples of templates for using conditional statements:
if-then-else

    if (expression) {
        'then' statement block
    } else {
        'else' statement block
    }

unless-else

    unless (expression) {
        'unless' statement block
    } else {
        'else' statement block
    }

if-then-elsif-else

    if (expression) {
        'then' statement block
    } elsif (expression) {
        'elsif' statement block
    } else {
        'else' statement block
    }

while

    while (expression) {
        'while' statement block
    }

until

    until (expression) {
        'until' statement block
    }

do-while-until

    do {
        'do' statement block
    } while (expression);

for

    for (statement; conditional expression; iterator statement) {
        'for' statement block
    }

foreach

    foreach $scalar_variable (@array_variable) {
        'foreach' statement block
    }

next (skips the rest of the block and starts the next iteration)

    next;

last (terminates the loop)

    last;

redo (repeats the current iteration of the loop)

    redo;
These are standard programming conditional statements that you can employ in your Perl programs.
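To see several of these statements working together, consider the following minimal sketch. The list @entries and its name:UID format are made up for the illustration; the loop skips comment entries with next, stops at a marker entry with last, and classifies the rest with if, elsif, and else. An until loop is included as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample data in name:UID form.
my @entries = ("root:0", "bin:1", "# comment", "daemon:2", "STOP", "adm:3");
my @seen;

foreach my $entry (@entries) {
    next if $entry =~ /^#/;        # skip comment entries
    last if $entry eq "STOP";      # leave the loop at the marker entry
    my ($name, $uid) = split(/:/, $entry);
    if ($uid == 0) {
        push(@seen, "$name is the superuser");
    } elsif ($uid < 2) {
        push(@seen, "$name has a low UID");
    } else {
        push(@seen, "$name is an ordinary entry");
    }
}

# until runs its block as long as the expression is false.
my $countdown = 3;
my $passes    = 0;
until ($countdown == 0) {
    $countdown--;
    $passes++;
}

print "$_\n" foreach @seen;
print "the until loop ran $passes times\n";
```

Because of the last, the adm entry that follows STOP is never examined.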
We also autoincremented the $lineno variable in our last update of test.pl. You can autoincrement and autodecrement in a variety of ways. Using our earlier variable name of $lineno, the following are some of the most common methods of performing autoincrement and autodecrement:

    $lineno = 5;     # initially assign $lineno to five
    $lineno++;       # increment $lineno by one
    $lineno--;       # decrement $lineno by one
    $lineno += 2;    # increment $lineno by two
    $lineno -= 2;    # decrement $lineno by two
There are many operators in addition to the autoincrement that we used in our example. Table 26-3 covers the most commonly used operators.
Operator | Description - The following perform ... |
---|---|
+ | addition. |
- | subtraction. |
* | multiplication. |
/ | division. |
< | less than (for numbers). |
> | greater than (for numbers). |
== | equal to (for numbers, these are two = signs). |
<= | less than or equal to (for numbers). |
>= | greater than or equal to (for numbers). |
!= | not equal to (for numbers). |
<=> | comparison (result is 1 if greater than, -1 if less than, 0 if equal to (for numbers)). |
lt | less than (for strings). |
gt | greater than (for strings). |
le | less than or equal to (for strings). |
ge | greater than or equal to (for strings). |
eq | equal to (for strings). |
ne | not equal to (for strings). |
cmp | comparison (for strings). |
= | assignment of variable such as $a = $a + 5. |
+= | addition increment such as $a += 5. |
++ | increment by one, such as $a++. |
-= | subtraction increment, such as $a -= 5. |
-- | decrement by one, such as $a--. |
/= | division increment, such as $a /= 5. |
*= | multiplication increment, such as $a *= 5. |
**= | exponential increment, such as $a **= 5. |
lc or \L | conversion of string to lowercase. |
lcfirst or \l | conversion of first letter of string to lowercase. |
uc or \U | conversion of string to uppercase. |
ucfirst or \u | conversion of first letter of string to uppercase. |
\|\| | logical or operation. |
&& | logical and operation. |
! | logical not operation. |
& | bitwise and operation. |
\| | bitwise or operation. |
^ | bitwise exclusive or operation. |
~ | bitwise not operation. |
<< >> | left shift and right shift operations. |
not | logical not. |
and | logical and. |
or | logical or. |
xor | logical xor. |
There are additional operators with some distributions of Perl, but this table gives you an idea of the vast number of operations you can perform.
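The distinction between the numeric and the string operators in the table matters in practice, because Perl decides how to compare two values from the operator you choose and not from the values themselves. The following short sketch, with variable names made up for the illustration, sorts the same three values both ways and also exercises a few of the bitwise and assignment operators.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @values = (10, 9, 100);

my @numeric = sort { $a <=> $b } @values;   # compared as numbers
my @string  = sort { $a cmp $b } @values;   # compared as strings

my $num_equal = ("1.0" == 1)   ? "yes" : "no";   # numerically equal
my $str_equal = ("1.0" eq "1") ? "yes" : "no";   # different strings

my $bit_and = 6 & 3;     # binary 110 and 011
my $bit_or  = 6 | 3;
my $bit_xor = 6 ^ 3;
my $shifted = 1 << 4;    # one shifted left four places

my $power = 2;
$power **= 5;            # same as $power = $power ** 5

print "numeric sort: @numeric\n";
print "string sort:  @string\n";
print "== says $num_equal, eq says $str_equal\n";
print "and $bit_and, or $bit_or, xor $bit_xor, shift $shifted, power $power\n";
```

In the string sort, 9 comes last because the character 9 sorts after the character 1.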
You can pass arguments to Perl programs using a special array called @ARGV (arguments to subroutines arrive in a similar special array called @_). When you type the name of the Perl program, you type along with it the arguments to be passed to the program. We'll include a loop in our test.pl program that looks like the following:

    foreach (@ARGV) { print "$_\n" }

Inside the foreach loop, each element of @ARGV is assigned in turn to the default variable $_, so when we print $_ in our Perl program each of the arguments is sent to STDOUT. The following example shows our test.pl Perl program modified to include a loop that will print all arguments. Note that the lines related to redirecting STDOUT and the lines related to printing the first six entries in /etc/passwd are now commented out. There is also an example run of the program following the listing:

    # cat test.pl
    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    #open (STDOUT, ">passwd.out") ||
    #    die ("couldn't open passwd.out \n");
    @line = <passwd_file>;
    $length = @line;
    print "The length of the array containing the contents of /etc/passwd is $length \n \n";
    foreach (@ARGV) { print "$_\n" }
    #$lineno = 0;               # set the line number to first array entry
    #while ($lineno < 6){       # while loop to print first six entries (0-5)
    #    print "This is line number $lineno in /etc/passwd \n";
    #    print $line[$lineno];  # prints the line from /etc/passwd
    #    $lineno++              # increment counter
    #}
    #print (@line);    # this line commented
    # ./test.pl a b c
    Work with /etc/passwd file on Linux system
    The length of the array containing the contents of /etc/passwd is 23
    a
    b
    c
    #
Notice that when we run test.pl, we typed a b c as arguments. The a b c are sent to STDOUT as a result of our foreach loop.
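Rather than only printing the arguments, a program will usually pull them out of @ARGV into named variables. The following is a minimal sketch of one way to do that. The subroutine parse_args and the fallback sample values are hypothetical and exist only so that the sketch produces output when run with no arguments.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_args is a hypothetical helper: the first two arguments are the
# search and replacement strings, and anything left over is returned as is.
sub parse_args {
    my @args    = @_;
    my $search  = shift(@args);
    my $replace = shift(@args);
    die "usage: $0 search replace [file ...]\n"
        unless defined($search) && defined($replace);
    return ($search, $replace, @args);
}

# Use the command line if arguments were given, sample values otherwise.
my @input = @ARGV ? @ARGV : ("test", "NEWTEST", "passwd.out");
my ($search, $replace, @files) = parse_args(@input);

print "search for:   $search\n";
print "replace with: $replace\n";
print "remaining:    @files\n";
```

Run as ./test.pl test NEWTEST passwd.out, this would take all three values from the command line instead of from the sample list.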
Alternatively, we can prompt the user for two arguments to be used in the program rather than have the user issue the arguments when they run the program. Since the user will issue a newline along with each argument, we'll use a function called chomp to remove the newlines and leave only the arguments, as shown in the following lines that will be added to our program:

    print "\nEnter the name for which you want to search: ";
    chomp($search = <STDIN>);
    print "\nEnter the replacement name: ";
    chomp($replace = <STDIN>);
    print "\n You want to search for $search and replace with $replace \n \n";
The following example shows our test.pl program modified to request the two arguments and an example run of the program:
    # cat test.pl
    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    #open (STDOUT, ">passwd.out") ||
    #    die ("couldn't open passwd.out \n");
    @line = <passwd_file>;
    $length = @line;
    print "The length of the array containing the contents of /etc/passwd is $length \n \n";
    print "\nEnter the name for which you want to search: ";
    chomp($search = <STDIN>);
    print "\nEnter the replacement name: ";
    chomp($replace = <STDIN>);
    print "\n You want to search for $search and replace with $replace \n\n";
    #$lineno = 0;               # set the line number to first array entry
    #while ($lineno < 6){       # while loop to print first six entries (0-5)
    #    print "This is line number $lineno in /etc/passwd \n";
    #    print $line[$lineno];  # prints the line from /etc/passwd
    #    $lineno++              # increment counter
    #}
    #print (@line);    # this line commented
    # ./test.pl
    Work with /etc/passwd file on Linux system
    The length of the array containing the contents of /etc/passwd is 23
    Enter the name for which you want to search: test
    Enter the replacement name: newtest
    You want to search for test and replace with newtest
    #

We have now covered two techniques for entering arguments to Perl programs: entering the arguments at the command line and prompting the user for the arguments. Now that we can enter these arguments, let's use them for searching and replacing. In the next section, we'll add the search and replace functionality to our program.
Let's now include a line in test.pl to search for a search string and replace it with a replace string. You may want to refer to the regular expression discussion and table earlier in the book as part of the search and replace. When we search and replace in Perl, the patterns we use are regular expressions. We'll use s for the substitute command in our upcoming examples. If we wanted to replace the string test with the string NEWTEST, we would use the following command in our Perl program:

    s/test/NEWTEST/

This would result in the first occurrence of test on a line being replaced with NEWTEST. If we wanted all occurrences replaced, we would add the option g for global replacement.
The substitution is also case-sensitive. If we wanted to ignore case so that any combination of uppercase and lowercase occurrences of test were replaced with NEWTEST, we would use the i option for ignore case.
The following example shows a listing of our test.pl program in which we search and replace strings. We read the /etc/passwd file, prompt the user for both the string for which they want to search and its replacement, and then send the result to STDOUT:
    # cat test.pl
    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    #open (STDOUT, ">passwd.out") ||
    #    die ("couldn't open passwd.out \n");
    @line = <passwd_file>;
    $length = @line;
    print "The length of the array containing the contents of /etc/passwd is $length \n \n";
    print "\nEnter the name for which you want to search: ";
    chomp($search = <STDIN>);
    print "\nEnter the replacement name: ";
    chomp($replace = <STDIN>);
    print "\n You want to search for $search and replace with $replace \n\n";
    foreach (@line) {
        s/$search/$replace/ig;
        print " $_";
    }
    #$lineno = 0;               # set the line number to first array entry
    #while ($lineno < 6){       # while loop to print first six entries (0-5)
    #    print "This is line number $lineno in /etc/passwd \n";
    #    print $line[$lineno];  # prints the line from /etc/passwd
    #    $lineno++              # increment counter
    #}
    #print (@line);    # this line commented
    # ./test.pl
    Work with /etc/passwd file on Linux system
    The length of the array containing the contents of /etc/passwd is 23
    Enter the name for which you want to search: test
    Enter the replacement name: NEWTEST
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:
    daemon:x:2:2:daemon:/sbin:
    adm:x:3:4:adm:/var/adm:
    lp:x:4:7:lp:/var/spool/lpd:
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:
    news:x:9:13:news:/var/spool/news:
    uucp:x:10:14:uucp:/var/spool/uucp:
    operator:x:11:0:operator:/root:
    games:x:12:100:games:/usr/games:
    gopher:x:13:30:gopher:/usr/lib/gopher-data:
    ftp:x:14:50:FTP User:/home/ftp:
    nobody:x:99:99:Nobody:/:
    xfs:x:100:102:X Font Server:/etc/X11/fs:/bin/false
    gdm:x:42:42::/home/gdm:/bin/bash
    postgres:x:101:233:PostgreSQL Server:/var/lib/pgsql:/bin/bash
    squid:x:102:234::/var/spool/squid:/dev/null
    martyp::500:500::/home/martyp:/bin/bash
    hp:x:501:501::/home/hp:/bin/bash
    NEWTEST:x:502:502::/home/NEWTEST:/bin/bash

The very last line in the listing contains two occurrences of the string NEWTEST that replaced the original string test. If we had not used the global option in our search and replace, only the first test on the line would have been replaced with NEWTEST.
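The effect of the g and i options can be seen side by side in the following sketch, which works on a single made-up entry rather than on the whole file. It also shows one point worth knowing when the search string comes from the user: the string is treated as a regular expression, so a character such as "." matches any character unless the string is wrapped in \Q and \E. All variable names here are invented for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $entry = "test:x:502:502::/home/test:/bin/bash";

# Copy the entry, then substitute in the copy.
(my $first_only = $entry) =~ s/test/NEWTEST/;     # first occurrence only
(my $all        = $entry) =~ s/test/NEWTEST/g;    # every occurrence
(my $any_case   = "Test and TEST") =~ s/test/NEWTEST/ig;   # ignore case too

# A search string supplied at run time is a regular expression.
my $search = "a.b";
(my $unquoted = "axb a.b") =~ s/$search/X/g;       # "." matches any character
(my $quoted   = "axb a.b") =~ s/\Q$search\E/X/g;   # "." matches only a period

print "$first_only\n$all\n$any_case\n";
print "unquoted: $unquoted\n";
print "quoted:   $quoted\n";
```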
There are a variety of built-in operators to Perl that are used to manipulate lists. If you have an array that is a list and wish to manipulate it in some way, these operators are helpful. Let's look at one such operator, so you can see how it works, and then I'll list some other useful ones for you.
The split operator takes a string made up of fields that are separated by a common string and breaks it into a list of those fields. The /etc/passwd file used in earlier examples consists of a series of strings, all separated by ":". We'll create a short program to take one line in /etc/passwd and use split to break it apart at each ":". Printing the resulting list with nothing between its elements shows the fields run together on one line. The following shows the first entry of /etc/passwd, our program to remove the ":" separating the entries, and the result of having run the program:

    # head -1 /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    # cat test.pl
    #!/usr/bin/perl
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    @line = <passwd_file>;
    @passwdsplit = split(/:/, $line[0]);
    print @passwdsplit;
    # ./test.pl
    rootx00root/root/bin/bash

This program identified the ":" separating the fields of the string and created a list of entries. Perl has many such built-in list operators. Table 26-4 describes some of the list operators available in Perl.
Operator | Description |
---|---|
join | $string = join ("separator", @array); Join elements of array into scalar string. |
split | @array = split("separator", $string); Split a string into a list that can be stored in an array variable. |
push and pop | push (@array, 1, 2, 3); Add the three items 1, 2, and 3 to the end of the list in @array. The pop operator removes the last element from the list. |
sort | Sort a list in ascending ASCII order. |
reverse | Reverse the order of elements in a list. |
shift | Remove the first element of list and shift remaining element(s) to left. |
unshift | Add element to the front of list and shift everything else to right. |
grep | Extract elements of list that match specified pattern. |
splice | Cut an array into pieces by removing (and optionally replacing) a range of elements. |
Table 26-4 shows some of the more commonly used list operators. If you plan on working extensively with lists in Perl, you'll want to get additional information on these operators.
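The following sketch runs through most of the operators in the table on small sample lists. The data and variable names are made up for the illustration; the comments show the state of the list after each step.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $entry    = "root:x:0:0:root:/root:/bin/bash";
my @fields   = split(/:/, $entry);     # seven separate fields
my $rejoined = join("|", @fields);     # back to one string, new separator

my @stack = (1, 2, 3);
push(@stack, 4, 5);                    # 1 2 3 4 5
my $popped  = pop(@stack);             # removes 5, leaving 1 2 3 4
my $shifted = shift(@stack);           # removes 1, leaving 2 3 4
unshift(@stack, 0);                    # 0 2 3 4
my @removed = splice(@stack, 1, 2);    # removes 2 3, leaving 0 4

my @names    = qw(uucp adm lp);
my @sorted   = sort @names;            # adm lp uucp
my @reversed = reverse @sorted;        # uucp lp adm
my @with_p   = grep { /p/ } @names;    # names containing the letter p

print "fields: ", scalar(@fields), "\n";
print "rejoined: $rejoined\n";
print "stack: @stack, removed: @removed\n";
print "sorted: @sorted, reversed: @reversed, with p: @with_p\n";
```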
Subroutines, or functions, as they are known in some other programming languages, are easy to implement in Perl. You would typically call a subroutine and pass some number of arguments to it, run the subroutine code, and then return data to the main program.
Let's modify our Perl program test.pl by adding a subroutine to it that checks to see if the file to which we want to write our output exists. The following listing and example run of test.pl call the subroutine check_file_exists and pass the argument new_file_name to the subroutine. The subroutine produces a warning message if indeed the output file we specify exists. The warn function prints the message we specify.
    # cat test.pl
    #!/usr/bin/perl
    print "\n Work with /etc/passwd file on Linux system \n \n";
    open (passwd_file, '/etc/passwd') ||
        die ("passwd file unavailable \n");
    @line = <passwd_file>;
    $length = @line;
    print "The length of the array containing the contents of /etc/passwd is $length \n \n";
    print "\nEnter the name for which you want to search: ";
    chomp($search = <STDIN>);
    print "\nEnter the replacement name: ";
    chomp($replace = <STDIN>);
    print "\n You want to search for $search and replace with $replace \n\n";
    print "\nEnter the name of the file to which you want to write the results: ";
    chomp($new_file_name = <STDIN>);
    # call subroutine check_file_exists before opening the output file;
    # opening with ">" creates the file, so a later check would always succeed
    &check_file_exists($new_file_name);
    open (STDOUT, ">$new_file_name");
    foreach (@line) {
        s/$search/$replace/ig;
        print " $_";
    }
    #subroutine to check if output file already exists
    sub check_file_exists{
        if (-e $_[0]){
            # die ("$_[0] exists, rerun program with new output file name \n");
            warn ("$_[0] exists, you are going to write over $_[0] \n");
        }
    }
    # ./test.pl
    Work with /etc/passwd file on Linux system
    The length of the array containing the contents of /etc/passwd is 23
    Enter the name for which you want to search: test
    Enter the replacement name: NEWTEST
    You want to search for test and replace with NEWTEST
    Enter the name of the file to which you want to write the results: passwd.out
    passwd.out exists, you are going to write over passwd.out
    #

You can see from this output that we have specified the name of an output file, passwd.out, that exists, because the message that is part of our warn function is printed to the screen. The program still completes its task of searching and replacing; the only difference is that the warning message is printed. The die function, which is commented out of the check_file_exists subroutine, would instead cause the program to terminate.
There are many enhancements that we could make to this subroutine such as asking for another file name if the specified file name exists. In addition, we could include the return keyword, which would return data from the subroutine to our main program.
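As a minimal sketch of the return keyword, the version of check_file_exists below returns 1 or 0 to its caller instead of printing a warning. The second subroutine, next_free_name, is a hypothetical enhancement: rather than prompting for another name, it appends .1, .2, and so on until it finds a name that is not in use. The sketch works in a temporary directory created with the File::Temp module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Returns 1 if the named file exists and 0 if it does not.
sub check_file_exists {
    my ($file_name) = @_;
    return (-e $file_name) ? 1 : 0;
}

# Hypothetical helper: returns the name unchanged if it is free,
# otherwise the first of name.1, name.2, ... that does not exist.
sub next_free_name {
    my ($file_name) = @_;
    return $file_name unless check_file_exists($file_name);
    my $suffix = 1;
    $suffix++ while check_file_exists("$file_name.$suffix");
    return "$file_name.$suffix";
}

# Create passwd.out in a temporary directory so that the name is taken.
my $dir    = tempdir(CLEANUP => 1);
my $wanted = "$dir/passwd.out";
open(my $out, '>', $wanted) or die "cannot create $wanted: $!\n";
close($out);

my $chosen = next_free_name($wanted);
print "$wanted exists, writing to $chosen instead\n";
```

The value handed back by return can be used directly in a condition, as next_free_name does, or stored in a variable in the main program.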
There are also specially named subroutines in Perl: AUTOLOAD, which is called when an undefined subroutine is invoked (the variable $AUTOLOAD holds the name of that routine); BEGIN, which enables you to specify code to be executed before the rest of the script is parsed; and END, which enables you to specify code to be executed just before the program exits.
We have only scratched the surface of Perl functionality in this section. The Web sites cited earlier provide an excellent starting point for obtaining more information on Perl. The simple example programs that we built use many of the important concepts in Perl and give you a good starting point for crafting your own Perl programs.