Item 60: Some interesting Perl one-liners.

You can pack a lot of meaning into a single line of Perl. In this Item, I've selected and explained a few interesting Perl one-liners for you. Study them to get a feel for the kinds of complicated and/or unusual things you can accomplish in a single line of Perl.

`select((select(SOCK), $|=1)[0])`

What's a convenient way to turn off filehandle buffering?

This is a hoary old standby, probably due to Randal, that does a fair job of demonstrating the lengths that Perl programmers will go to in order to avoid creating temporary variables. This snippet of code is actually useful, and it has appeared in production code from time to time. A long, boring version of this one-liner would look something like:

 {
   my $old = select SOCK;   # Save current fh, select SOCK.
   $| = 1;                  # Turn off buffering.
   select($old);            # Reselect previous filehandle.
 }
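
As a minimal sketch of the idiom in use, the following applies it to STDERR rather than SOCK (an assumption made only so that the example runs without a socket) and checks that the originally selected filehandle is still selected afterward. The autoflush method from the core IO::Handle module is shown as a more readable way to get the same effect.

```perl
use strict;
use warnings;
use IO::Handle;    # core module; provides the autoflush method

# Remember which filehandle is selected before we start (normally main::STDOUT).
my $before = select;

# The one-liner: select STDERR, set $| on it, then reselect the old handle.
select((select(STDERR), $| = 1)[0]);

my $after = select;
print "selected before: $before, after: $after\n";

# A more readable way to unbuffer a single handle.
STDERR->autoflush(1);
```

The list slice `(...)[0]` keeps only the return value of the inner select, which is the previously selected handle, so the assignment to `$|` happens purely as a side effect.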

`[ $a => $b ] -> [ $b <= $a ]`

This wonderfully symmetrical one-liner, contributed by Phil Abercrombie, returns the lesser of $a and $b.

It can be written with less wasted technology, but then it isn't nearly as pretty:

 ($a, $b)[$b <= $a]
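
To see why both forms return the lesser value, here is a small sketch that wraps each one in a subroutine. The subroutine names and the use of $x and $y in place of $a and $b are illustrative choices, not part of the original one-liner. The comparison yields 1 (true) or the empty string (false, which is 0 as an index), so it selects the second or the first element.

```perl
use strict;
use warnings;

# Anonymous-array version: build [ $x, $y ] and index it with the comparison.
sub lesser_ref {
    my ($x, $y) = @_;
    return [ $x => $y ]->[ $y <= $x ];
}

# List-slice version: same indexing trick without the anonymous array.
sub lesser_slice {
    my ($x, $y) = @_;
    return ($x, $y)[ $y <= $x ];
}

print lesser_ref(3, 7),   "\n";
print lesser_ref(7, 3),   "\n";
print lesser_slice(9, 2), "\n";
```

In everyday code, the min function from the core List::Util module says the same thing more plainly.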

`s/\G0/ /g`

Once upon a time in 1996, someone asked comp.lang.perl.misc if there was a way to replace leading zeros in a string with spaces. This was Randal's response.

This substitution uses the \G anchor, which works with the /g pattern match flag. The \G anchor refers to either the beginning of the string or the end of the previous /g match for that pattern. When the pattern match starts up, /\G0/ matches a 0 at the start of the string. If the match is successful, /\G0/ will match another 0 if it immediately follows the first one. The pattern will continue matching until it encounters a character that isn't a 0. While this is going on, the 0s are being replaced with spaces.
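
The behavior is easy to check with a short sketch. The helper name blank_leading_zeros and the sample strings are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the leading zeros of a string with spaces.
sub blank_leading_zeros {
    my ($s) = @_;
    $s =~ s/\G0/ /g;    # each match must start where the previous one ended
    return $s;
}

my $padded = blank_leading_zeros("000120300");
print "[$padded]\n";    # interior and trailing zeros are left alone
```

Because \G pins every match to the end of the previous one, the substitution stops at the first character that is not a 0, and a string such as "1000" comes back unchanged.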

`/^(?=.*?this)(?=.*?that)/`

The question often arises, "How can I match one thing and another thing with a regular expression?" Assuming the person asking isn't too confused, the right answer usually looks like:

 /this/ and /that/

In other words, use two match operators. However, shortly after Perl's positive lookahead pattern match feature, (?=...), was introduced, Randal came up with this showy alternative.

The zero-width positive lookahead operator, (?=foo), matches if the contents enclosed by the operator (in this case, foo) appear immediately to the right of the current position in the pattern match. The contents do not, however, become part of the match itself. Now, obviously (obviously?), if the beginning of the string is followed by something matching .*?this and something matching .*?that, the string must contain both this and that.
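
A quick way to convince yourself that the two approaches agree is to run both over a few sample strings. The strings and variable names below are made up for illustration.

```perl
use strict;
use warnings;

my @strings = ("this and that", "that before this", "only this", "neither one");
my (%two, %look);

for my $s (@strings) {
    # The plain answer: two separate match operators.
    $two{$s}  = ($s =~ /this/ && $s =~ /that/) ? 1 : 0;
    # The showy answer: two lookaheads anchored at the start of the string.
    $look{$s} = ($s =~ /^(?=.*?this)(?=.*?that)/) ? 1 : 0;
    printf "%-18s two matches: %d  lookahead: %d\n", $s, $two{$s}, $look{$s};
}
```

Both lookaheads start from the same position, so the order in which this and that appear in the string does not matter. One difference to keep in mind: since . does not match a newline by default, the lookahead version would need the /s flag to behave the same way on multi-line strings.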

`[^\D5]`

Here's the answer to the question, "How do I match any digit except 5?" The character class [\D5] is the digit 5 plus everything that isn't a digit. Its complement (^) is any digit that isn't 5.

This same principle is useful for creating patterns like [^\W\d], which matches any word character that isn't a digit. This is especially helpful in the presence of use locale.
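
Both classes can be checked against a handful of single characters. This is only a sketch; the candidate lists are arbitrary.

```perl
use strict;
use warnings;

# Digits other than 5: the complement of "non-digits plus 5".
my @non_five = grep { /\A[^\D5]\z/ } 0 .. 9;

# Word characters other than digits: the complement of "non-word plus digits".
my @non_digit_word = grep { /\A[^\W\d]\z/ } ('a', 'Z', '_', '7', '-');

print "@non_five\n";
print "@non_digit_word\n";
```

Note that the underscore survives the second filter, because it is a word character and not a digit.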

`@uniq = sort keys %{ { map { $_, 1 } @list } }`

What's a good way to eliminate all the duplicates from a list? All of the good (meaning, efficient) answers to this question will involve creating a hash. If we unroll this somewhat, we have:

 {
   my %h = map { $_, 1 } @list;   # Create hash with elems of @list as keys,
   @uniq = sort keys %h;          # then sort the keys.
 }

What makes this mildly nifty (or maybe just confusing) is the use of the anonymous hash constructor { } to hold the temporary result. Interestingly, each pair of braces appearing here has a different function:

map { $_, 1 } @list	A list suitable for initializing a hash: `($list[0], 1, $list[1], 1, ...)`.
{ map { $_, 1 } @list }	A reference to an anonymous hash initialized with those values.
%{ { map { $_, 1 } @list } }	The "name" of the dereferenced hash, suitable as an argument for `keys`.
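
Here is the one-liner run on a small sample list (the fruit names are invented for the example). For comparison, the sketch also shows the common %seen idiom, which is not part of the original one-liner but keeps the first occurrence of each element in its original order instead of sorting.

```perl
use strict;
use warnings;

my @list = qw(pear apple pear fig apple);

# The one-liner: anonymous hash as a temporary, then sorted keys.
my @uniq = sort keys %{ { map { $_, 1 } @list } };

# A different technique: remember what has been seen, preserving input order.
my %seen;
my @in_order = grep { !$seen{$_}++ } @list;

print "@uniq\n";
print "@in_order\n";
```

On reasonably recent versions of Perl, the uniq function in the core List::Util module is another order-preserving option.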

`@rank[sort {$x[$a] cmp $x[$b]} 0..$#x] = 0..$#x`

Suppose that you have a list of items that are not in sorted order. You would like to know, for each item in the list, what position that item would have in the list if it were sorted. Call this the "ranks" of the elements. For example, suppose your list is:

 qw(jane elroy george judy)

Then the desired output is:

 2 0 1 3

This corresponds to the positions of jane, elroy, george, and judy in a sorted list:

 elroy george jane judy

The string jane has a rank of 2 because it sorts third, the string elroy has a rank of 0 because it sorts first, and so on.

It seems like this problem ought to have a simple answer, but most people lapse into some serious headscratching after starting to work on it. (This includes me; I had to ponder it for a few hours.) Let's start working toward a solution by just sorting the list:

 @x = qw(jane elroy george judy);
 @x_sorted = sort @x;   # elroy george jane judy

Well, that's okay, but what we actually need to sort is a list of element indices:

 @i_sorted =
   sort {$x[$a] cmp $x[$b]} 0..$#x;   # 1 2 0 3

First sorted is $x[1] = elroy, second is $x[2] = george, etc.

The value associated with each of the elements in the sorted list is the index of the element in the original, unsorted list. The string "elroy" is element 1 in the original list, the string "george" is element 2, and so on. We can use these indices to construct the list of ranks. The string "elroy" was element 1, and since it is the first element in the sorted result, it has rank 0. We can say:

 $rank[1] = 0;

The rank of elroy (element 1 in the original list) is 0.

For "george", which was element 2 with rank 1, we have:

 $rank[2] = 1;

The rank of george (element 2 in the original list) is 1.

Or we can write the whole process out as a slice:

 @rank[1, 2, 0, 3] = 0..3;

Replacing the constants with the expressions that derived them gives us our final answer:

 @rank[sort {$x[$a] cmp $x[$b]} 0..$#x] = 0..$#x;

Tricky.

This one was due to Randal (but of course).
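
Putting the pieces together, a runnable sketch of the whole derivation looks like the following. The @sorted check at the end is an addition for illustration: placing each element at the position given by its rank should reproduce the sorted list.

```perl
use strict;
use warnings;

my @x = qw(jane elroy george judy);

# Indices of @x in the order their elements would sort.
my @i_sorted = sort { $x[$a] cmp $x[$b] } 0 .. $#x;

# The one-liner: the element at sorted position n gets rank n.
my @rank;
@rank[ sort { $x[$a] cmp $x[$b] } 0 .. $#x ] = 0 .. $#x;

# Sanity check: scatter each element to its rank.
my @sorted;
@sorted[@rank] = @x;

print "indices: @i_sorted\n";
print "ranks:   @rank\n";
print "sorted:  @sorted\n";
```

The slice assignment works because the sorted index list is a permutation of 0..$#x, so every element of @rank is assigned exactly once.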

`"$_ is string\n" if (~$_ & $_) ne '0'`

Perl values can contain either strings or numbers, or both (see Item 6). Suppose you would like to find out whether a value in $_ is a string. Although modules like Devel::Peek (see Item 37) can reveal what the internal structure of a value is, you can get a glimpse of Perl's innards without using special modules at all.

The bitwise operators ~ and & operate differently on numbers and strings: bytewise when applied to strings, and on the bits of integers when applied to numbers. You can take advantage of this to distinguish between numeric and string values, because the expression ~$_ & $_ will yield a string of zero or more nulls if $_ contains a string value. On the other hand, if $_ contains a number, it yields the number 0. Distinguishing between the number 0 and a possibly empty string of nulls is a little tricky; making a string comparison against '0' is the simplest.
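
The following sketch wraps the test in a subroutine (the name is_string_value is hypothetical) and tries it on a few values. It assumes the classic behavior of ~ and &, so it deliberately avoids use v5.28 or later and the bitwise feature, under which these operators always treat their operands as numbers and the trick no longer distinguishes anything.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the value is currently stored as a string only.
sub is_string_value {
    my ($v) = @_;
    # Strings: ~ and & work bytewise, giving a (possibly empty) string of nulls.
    # Numbers: ~ and & work on integer bits, giving the number 0.
    return (~$v & $v) ne '0';
}

for my $value ("42", 42, "hello", 3.14, "") {
    printf "%-7s %s\n", "[$value]", is_string_value($value) ? "string" : "number";
}
```

Keep in mind that the answer reflects how the value is stored at that moment. A string that has already been used in a numeric context may carry a cached numeric value as well, and may then be reported as a number.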

`perl -pe 's/\n/" " . <>/e' data`

Randal posted something like this in response to a request for a program that would take lines from a file and join them together in pairs. For example, here's an input file:

 Testing
 one
 two
 three

This one-liner turns the input into the following:

 Testing one
 two three

Each line is two of the old lines joined together with a space.

The -pe command line option used above (a combination of -p and -e) yields a program that acts like the following:

 while (<>) {
   s/\n/" " . <>/e;
   print;
 }

I sputtered a bit when I saw this one-liner for the first time because I had never thought of using <> in a substitution. But it is fairly straightforward otherwise. Note that you have to substitute for \n. Nothing else (for example, the $ anchor) will work.
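
To try the same loop without creating a data file, here is a sketch that reads from an in-memory filehandle instead of <> and collects the result in a string. The names $data, $in, and $out are assumptions made for this example only.

```perl
use strict;
use warnings;

my $data = "Testing\none\ntwo\nthree\n";
open(my $in, '<', \$data) or die "Cannot open in-memory file: $!";

my $out = '';
while (my $line = <$in>) {
    # Replace this line's newline with a space followed by the next line.
    # The concatenation puts <$in> in scalar context, so it reads one line.
    $line =~ s/\n/" " . <$in>/e;
    $out .= $line;
}
close($in);

print $out;
```

Each pass through the loop consumes two input lines: one in the while condition and one inside the substitution. With an odd number of lines, the last read inside the substitution returns undef, so under use warnings you should expect an uninitialized-value warning and a final line that ends with a space instead of a newline.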
