Using ROT13 Encoding with sed



In Usenet newsgroups (among other places), text is often encoded with ROT13, which is an abbreviation for "rotate (the alphabet by) 13." That is, A becomes N, B becomes O, and so forth. Because readers have to take an extra step to decode the text, nobody reads it by accident. For example, if a message includes an offensive joke, people who don't want to see the joke won't have to. Similarly, if the message is a movie review, people who don't want to know the ending won't have the surprise spoiled. A message encoded with ROT13 might look like this:

Tbbq sbe lbh--lbh svtherq vg bhg! Naq ab, gurer'f ab chapuyvar. Ubcr lbh rawblrq gur obbx! Qrobenu naq Revp 
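As a rough sketch of what "rotate by 13" means, the rotated alphabet is just the second half of the alphabet followed by the first half. The variable names here are illustrative only, and cut is used simply as one convenient way to slice the string:

```shell
# Illustrative only: build the ROT13 lowercase alphabet by swapping the halves.
alphabet=abcdefghijklmnopqrstuvwxyz
second_half=$(printf '%s\n' "$alphabet" | cut -c14-26)   # n through z
first_half=$(printf '%s\n' "$alphabet" | cut -c1-13)     # a through m
rotated="$second_half$first_half"

# Print the two alphabets one above the other.
echo "$alphabet"
echo "$rotated"
```

Reading the two printed lines as columns gives the mapping: a lines up with n, b with o, and so on.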

A great way to use ROT13 encoding (and decoding) is with sed, which will let you easily manipulate text.

To use ROT13 encoding with sed:

1.

vi script.sed

Use the editor of your choice to create a file called script.sed. Because the command we're using will be reused, we'll use a sed script instead of just typing in everything at the shell prompt.

2.

y/abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/

Start with a y at the beginning of the command. y is the sed command that translates characters one for one (uppercase to lowercase, or whatever mapping you specify).

After y, type a slash (/), the original characters to look for (all lowercase and uppercase characters), and another slash.

3.

y/abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM/

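After the second slash, add the translation characters (the lowercase alphabet, starting with n and continuing around to m, then uppercase from N to M), followed by a slash to conclude the replace string. The two strings must be the same length, with no spaces inside them, because sed pairs up the characters by position.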

4.

Save the script and exit the editor.

5.

sed -f script.sed limerick | more

Test the ROT13 encoding by applying it to a file. Here we apply it to the limerick file, and then pipe the output to more for your inspection. You'll see that all you get is gibberish. To test it more thoroughly, use sed -f script.sed limerick | sed -f script.sed | more to run it through the processor twice. You should end up with normal text at the end of this pipeline (Code Listing 17.6).

Code Listing 17.6. A spiffy sed command can ROT13 encode and decode messages.

[ejr@hobbes creative]$ sed -f script.sed limerick
Bhe snibevgr yvzrevpx
1. Gurer bapr jnf n zna sebz Anaghpxrg,
Jub pneevrq uvf yhapu va n ohpxrg,
Fnvq ur jvgu n fvtu,
Nf ur ngr n jubyr cvr,
Vs V whfg unq n qbahg V'q qhax vg.
[ejr@hobbes creative]$ sed -f script.sed limerick | sed -f script.sed
Our favorite limerick
1. There once was a man from Nantucket,
Who carried his lunch in a bucket,
Said he with a sigh,
As he ate a whole pie,
If I just had a donut I'd dunk it.
[ejr@hobbes creative]$
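The five steps above can also be sketched without opening an editor. This is only one possible way to do it: the here-document stands in for typing the command into vi, the scratch directory from mktemp -d is an assumption (mktemp is widely available but not guaranteed everywhere), and the two-line limerick file is a stand-in for the real one.

```shell
# Work in a scratch directory so no existing files are overwritten (assumes mktemp -d is available).
workdir=$(mktemp -d)

# Steps 1-4: create script.sed containing the single y command.
cat > "$workdir/script.sed" <<'EOF'
y/abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM/
EOF

# Stand-in for the limerick file: just the title and the first line.
printf '%s\n' 'Our favorite limerick' 'There once was a man from Nantucket,' > "$workdir/limerick"

# Step 5: encode once, then run the encoded text through the same script again.
encoded=$(sed -f "$workdir/script.sed" "$workdir/limerick")
decoded=$(sed -f "$workdir/script.sed" "$workdir/limerick" | sed -f "$workdir/script.sed")

printf '%s\n' "$encoded" "$decoded"
```

The first two lines printed are the encoded text, and the last two should match the stand-in file exactly.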

Tips

  • Text is rotated by 13 because the alphabet has 26 letters: rotating by 13 twice brings every letter back to where it started, so the same program both encodes and decodes. If you rotate by a different number, you'll need separate programs to encode and decode.

  • Check out the next section to see how to make this lengthy process into a shell script and make it even easier to reuse over and over.
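To illustrate the first tip, here is a small sketch that rotates by 3 instead of 13. The rotate-by-3 alphabet and the variable names are made up for this example, and only lowercase letters are handled to keep the strings short. Running the encoder twice does not restore the text; decoding needs the two strings swapped.

```shell
# Hypothetical rotate-by-3 example (lowercase only).
plain=abcdefghijklmnopqrstuvwxyz
rot3=defghijklmnopqrstuvwxyzabc

encoded=$(echo 'hello' | sed "y/$plain/$rot3/")      # shift forward by 3
twice=$(echo "$encoded" | sed "y/$plain/$rot3/")     # shift forward again: still not the original
decoded=$(echo "$encoded" | sed "y/$rot3/$plain/")   # swap the strings to shift back by 3

echo "$encoded $twice $decoded"   # prints: khoor nkrru hello
```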





Unix(c) Visual Quickstart Guide
UNIX, Third Edition
ISBN: 0321442458
EAN: 2147483647
Year: 2006
Pages: 251
