File Search Utilities


Over time, most users tend to accumulate quite a large number of files, so much so that it becomes impractical to search for files merely by listing directories. At other times, you may be trying to execute a program and not finding it because it is not in a directory listed in your PATH environment variable. You obviously need something more powerful than ls when searching for files.

Searching with the locate Command

The quickest and simplest way to locate files on Linux is to use the locate command. It maintains an index of files on the filesystem and is able to return partial and exact matches very quickly. You can specify a complete filename or an expression that uses metacharacters as search criteria for the locate command. The locate command can also perform case-insensitive searches for filenames.

Try It Out Elementary Searching Using locate

Let's try using the locate command to search for a few files.

  1. First, try out the locate command with no options. Search on the string moon, like this:

       $ locate moon
       /usr/share/icons/Bluecurve/16x16/apps/kmoon.png
       /usr/share/backgrounds/images/space/clem_full_moon_strtrk.jpg
       /usr/share/backgrounds/images/space/gal_earth_moon.jpg
       ...

    As you can see, this searches the system for partial and exact matches, so the output here includes any file whose name contains the string moon.

  2. Now try searching with an expression that uses a metacharacter. The following command uses the * character as a wildcard:

       $ locate "*.conf"
       /etc/sysconfig/networking/profiles/default/resolv.conf
       /etc/X11/gdm/factory-gdm.conf
       /etc/X11/gdm/gdm.conf
       ...

    This returns any file whose name ends with the .conf extension.

  3. Now try an example using the -i option, to see the difference between case-sensitive and case-insensitive searches. First, without the -i option, you have a case-sensitive search:

       $ locate bluecurve
       /usr/lib/gtk-2.0/2.0.0/engines/libbluecurve.la
       /usr/lib/gtk-2.0/2.0.0/engines/libbluecurve.so
       /usr/lib/gtk/themes/engines/libbluecurve.la
       ...

    Now, run the same command with the -i option, to specify case-insensitivity:

       $ locate -i bluecurve
       /usr/lib/gtk-2.0/2.0.0/engines/libbluecurve.la
       /usr/lib/gtk-2.0/2.0.0/engines/libbluecurve.so
       /usr/lib/gtk/themes/engines/libbluecurve.la
       ...
       /usr/share/pixmaps/nautilus/Bluecurve
       /usr/share/pixmaps/nautilus/Bluecurve/desktop-home.png
       /usr/share/pixmaps/nautilus/Bluecurve/Bluecurve.xml
       ...

    You can see that the search results are different; for example, the second search picked up the file Bluecurve.xml, whereas the first did not.

  4. The locate command maintains an index of files on the system, which is used to quickly look up filenames. This index is updated periodically (automatically) with new files. However, the -u option can also be used to update the index explicitly.

    To demonstrate, let's first create a file called new_file.txt using the touch command:

       $ touch new_file.txt   

    We then attempt to search for the file, but it returns no results even though the file has been created in the current directory. This is because the new file has not yet been added to the index of files:

       $ locate new_file.txt
       $

    Now change the user to be the root user so that you have privileges to update the locate command's index. Run the locate command this time with the -u option, which updates the index:

       $ su -
       Password: <not displayed>
       # locate -u
       # exit

    Now try to locate the file again from the revised index. This time the locate command is able to locate the file new_file.txt because it has been added to the index of files by the -u option:

       $ locate new_file.txt
       /home/deepakt/new_file.txt
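The effect seen in step 4 follows from searching a stored list rather than the live filesystem. The following is only a toy sketch of that idea, not how locate is actually implemented: the index here is a plain text file (the hypothetical index_file) built with find and searched with grep.

```shell
# Toy model of an index-based search. This is NOT how locate is implemented;
# index_file and the directory layout are made up for illustration.
demo_dir=$(mktemp -d)
index_file="$demo_dir/index.txt"
mkdir "$demo_dir/files"
touch "$demo_dir/files/old_file.txt"

# Build the "index": a plain list of every file path under files/.
find "$demo_dir/files" -type f > "$index_file"

# A file created after the index was built is not in the list yet.
touch "$demo_dir/files/new_file.txt"
before=$(grep -c -F 'new_file.txt' "$index_file" || true)

# Rebuilding the list plays the role of the explicit index update.
find "$demo_dir/files" -type f > "$index_file"
after=$(grep -c -F 'new_file.txt' "$index_file" || true)

echo "matches before rebuild: $before, after rebuild: $after"
rm -rf "$demo_dir"
```

Searching the list is quick because no directories are walked at search time; the price is that results can be stale until the list is rebuilt.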

Searching with the find Command

While the locate command is very useful for finding files quickly anywhere on the system, the search criteria that can be specified with locate are quite rudimentary. The find command has a much richer array of options. With the find command, you can search for files and subdirectories under a certain hierarchy, or specify various criteria such as the access time of the file, the size of the file, and so on. You can also use find to execute other commands on files that meet the search criteria. Therefore, while locate is very fast with fewer features, find is more feature-packed but slower.

Try It Out Advanced Searching Using find

Let's experiment with the find command to execute a number of searches for files, specifying various criteria. The following examples will give you a taste of what is possible with find.

Start off by searching for files under the /etc directory that end with the .conf extension:

   $ find /etc -name "*.conf" -print
   /etc/sysconfig/networking/profiles/default/resolv.conf
   /etc/X11/gdm/factory-gdm.conf
   /etc/X11/gdm/gdm.conf

The first argument to find is the directory below which to search for files. Other arguments specify the search criteria to be utilized. In this case, we specify the -name option, which takes an expression representing a file or set of files as its argument. The search string may be a simple metacharacter expression or it can be the exact name of the file. Note that you use double quotes around the search string; this prevents the shell from expanding the asterisk itself, so that find receives the pattern unchanged. Finally, you can specify the action to be performed once the search criteria are met. The -print option specifies the action; it indicates that the names of the files or directories matching the criteria should be displayed.
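The effect of the quotes can be seen in a small experiment. This is a sketch that assumes a scratch directory created with mktemp; the file names a.conf and sub/b.conf are invented for the demonstration.

```shell
orig_dir=$PWD
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/a.conf" "$demo_dir/sub/b.conf"
cd "$demo_dir" || exit 1

# Unquoted: the shell expands *.conf to a.conf before find starts,
# so find only looks for the exact name a.conf.
unquoted=$(find . -name *.conf | sort)

# Quoted: find receives the pattern itself and applies it at every level.
quoted=$(find . -name "*.conf" | sort)

cd "$orig_dir" || exit 1
rm -rf "$demo_dir"
printf 'unquoted:\n%s\nquoted:\n%s\n' "$unquoted" "$quoted"
```

The unquoted form silently misses sub/b.conf, and with two or more matching files in the current directory it would typically make find stop with an error instead.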

In the next search example, we look for all directories under /tmp whose name starts with the string ssh:

   $ find /tmp -type d -name "ssh*" 2>/dev/null
   /tmp/ssh-XXl8hvoD
   /tmp/ssh-XXzdTPki
   /tmp/ssh-XX4DYEB5
   /tmp/ssh-XX7KJPAN

These are temporary directories created by the SSH programs run by various users. The -type option specifies the type to search for, in this case d, indicating a directory. (Similarly, f indicates a regular file.) You may have permission to list only those SSH temporary directories that belong to you. To suppress the resulting permission errors, use shell redirection (2>) to send error messages to the /dev/null device. The /dev/null device is a pseudo-device that acts as a black hole, accepting input and discarding it.
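To see what the 2>/dev/null redirection does in isolation, here is a minimal sketch. The function noisy_search is a hypothetical stand-in for a find run that hits an unreadable directory; the paths it prints are made up.

```shell
# noisy_search is a hypothetical stand-in for a find run that hits an
# unreadable directory: one result on stdout, one error on stderr.
noisy_search() {
    echo "/tmp/ssh-mine"
    echo "find: /tmp/ssh-other: Permission denied" >&2
}

# 2>/dev/null discards only stream 2 (standard error); results still arrive.
results=$(noisy_search 2>/dev/null)

# 2>&1 instead merges the errors into the same stream as the results.
everything=$(noisy_search 2>&1)

echo "$results"
```

Because only standard error is discarded, genuine matches are never lost; the trade-off is that real problems (a mistyped starting directory, for instance) are hidden as well.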

This next search is based on the time of last access; you do this by specifying the -amin option:

   $ mkdir foobar
   $ date
   Mon Jan 20 22:41:29 PST 2003
   $ find . -type d -name "foo*" -amin -10 -print
   ./foobar
   $ ls -ld --time=atime foobar/
   drwxr-xr-x    2 deepakt  users        4096 Jan 20 22:41 foobar/

The -10 argument combined with the -amin option specifies a search for files that have been accessed in the last ten minutes. Therefore, you are actually searching for directories with names that begin with foo that have been accessed in the last ten minutes. An argument of +10 would result in a search for files last accessed more than ten minutes ago.
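The difference between -10 and +10 can be checked without waiting ten minutes by setting an access time by hand. This sketch assumes a scratch directory and uses touch -a -t to backdate one file; the file names are invented.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/fresh.txt"                      # access time is "now"
touch "$demo_dir/stale.txt"
touch -a -t 200301202241 "$demo_dir/stale.txt"   # access time: 20 Jan 2003, 22:41

recent=$(find "$demo_dir" -type f -amin -10)     # accessed less than 10 minutes ago
older=$(find "$demo_dir" -type f -amin +10)      # accessed more than 10 minutes ago

echo "recent: $recent"
echo "older:  $older"
rm -rf "$demo_dir"
```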

The next example combines the -amin option with the -or option:

   $ mkdir bar
   $ find . -type d -name "foo*" -amin -10 -or -name "*bar"
   ./foobar
   ./bar

The -or option is the logical OR operator, which combines two criteria. In this case, we search for directories named foo* that have been accessed in the last ten minutes, or for any files with names ending in bar. (The -and option combines two criteria in a logical AND expression; it is also what find assumes when criteria are simply listed one after another.)
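Because the AND between adjacent criteria binds more tightly than -or, criteria written before -or do not apply to what follows it. A sketch of how escaped parentheses change the grouping, using invented names in a scratch directory:

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/work" "$demo_dir/work/foobar" "$demo_dir/work/bar"
touch "$demo_dir/work/crowbar"        # a regular file whose name ends in bar

# Without parentheses, -type d belongs only to the left side of -or,
# so the regular file crowbar also matches "*bar".
ungrouped=$(find "$demo_dir/work" -type d -name "foo*" -or -name "*bar" | sort)

# With escaped parentheses, -type d applies to both name tests.
grouped=$(find "$demo_dir/work" -type d \( -name "foo*" -or -name "*bar" \) | sort)

printf 'ungrouped:\n%s\ngrouped:\n%s\n' "$ungrouped" "$grouped"
rm -rf "$demo_dir"
```

The parentheses are escaped with backslashes for the same reason the semicolon of -exec is: to stop the shell from interpreting them.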

You can also search by file size, using the -size option. By specifying the size with a plus (+) or minus (-) sign before it, you indicate that the file should be greater or smaller in size than the specified number:

   $ find . -size -20k -and -size +10
   ./.viminfo
   ./downloads/nop.txt
   ./downloads/p.txt
   ./downloads/curl
   ./downloads/curl/p.txt
   ...

In this case, the search is for files smaller than 20 kilobytes and larger than ten 512-byte blocks (note that the -and option is used to combine the two tests). A size given without a unit suffix is counted in 512-byte blocks; to specify bytes, append c, as in -size +10c.
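The unit suffixes are easy to get wrong, so here is a small sketch with files of known sizes. The file names and sizes are invented; head -c is assumed to be available, as it is on GNU and BSD systems.

```shell
demo_dir=$(mktemp -d)
head -c 5     /dev/zero > "$demo_dir/tiny.bin"     # 5 bytes
head -c 100   /dev/zero > "$demo_dir/medium.bin"   # 100 bytes
head -c 8000  /dev/zero > "$demo_dir/page.bin"     # 8000 bytes
head -c 30000 /dev/zero > "$demo_dir/large.bin"    # 30000 bytes

# +10c means "more than 10 bytes": medium.bin and page.bin qualify.
by_bytes=$(find "$demo_dir" -type f -size +10c -size -20k | sort)

# +10 with no suffix means "more than ten 512-byte blocks": only page.bin.
by_blocks=$(find "$demo_dir" -type f -size +10 -size -20k | sort)

printf 'by bytes:\n%s\nby blocks:\n%s\n' "$by_bytes" "$by_blocks"
rm -rf "$demo_dir"
```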

The last search demonstrates the -exec flag, which allows you to execute a command when the search criteria are met:

   $ find /tmp -type f -atime +1000 -exec rm {} \; 2>/dev/null   

The -exec option is followed by the command to be executed, and optionally a pair of curly braces. The command is executed for each file that matches the criteria. The curly braces indicate the matched file. The semicolon (;) indicates the end of the command; it is escaped (by a preceding \ character) to prevent the shell from interpreting it. In this case, we search for and delete all files under the /tmp directory that have not been accessed in the last 1,000 days (the -atime option counts in 24-hour periods, whereas -amin counts in minutes).
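Since -exec rm deletes without asking, it can be worth previewing the matches with -print first. The following sketch does both in a scratch directory; the file names are invented and one access time is backdated with touch -a -t.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/recent.tmp" "$demo_dir/ancient.tmp"
touch -a -t 200301202241 "$demo_dir/ancient.tmp"   # last accessed in January 2003

# Dry run first: -print shows what would match without deleting anything.
would_delete=$(find "$demo_dir" -type f -atime +1000 -print)

# Same criteria, now running rm once for each matching file.
find "$demo_dir" -type f -atime +1000 -exec rm {} \;

remaining=$(ls "$demo_dir")
echo "deleted: $would_delete"
echo "kept:    $remaining"
rm -rf "$demo_dir"
```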

Searching with the GNOME Search Tool

The GNOME search tool is a graphical front-end to the locate and find commands. It can be invoked from the Main Menu by clicking Search for Files (see Figure 5-1).

Figure 5-1

Filenames can be entered in the File is named text box, and a directory in which to start the search can be specified in the Look in folder box. The preceding example shows a search for the file hosts in the /etc directory.

Wildcard searches are also possible using the expression syntax we used for locate. For example, host* would search for all files starting with the characters host (followed by zero or more characters), and would therefore match hosts, hosts.deny, hosts.accept, and so on.

Clicking the Additional Options button presents an advanced search interface using rules, as described in Chapter 2.

Text Searches

Combining file searches with text searches within files allows you to quickly pinpoint the files that have the text you are looking for. You know how to search for a file by its filename (and also by its location); but how do you search the filesystem for files that contain a particular string of text? This type of search is very useful, but quite different from what you can achieve using locate or find.

The grep command is the most common way to search for a string in a file. In fact, it can search multiple files at the same time. The grep command, by default, performs a case-sensitive search, although it can perform a case-insensitive search if required.

The xargs command reads its standard input and feeds it as arguments to any command that is supplied as an argument to it. In other words, xargs will dynamically create the arguments for a command. This makes it particularly useful for chaining commands using the pipe (|) shell operator. As illustrated in Chapter 6, simply using a pipe operator feeds the output of a command as input to another command. The xargs command in conjunction with the pipe operator feeds the output of a command as arguments to another command. The following Try It Out section illustrates its use.
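The distinction between input and arguments can be made concrete with cat, which copies its standard input when given no arguments and opens files when given names. A minimal sketch, assuming a scratch file a.txt invented for the demonstration:

```shell
demo_dir=$(mktemp -d)
echo "contents of a" > "$demo_dir/a.txt"

# Plain pipe: cat reads the file NAME from standard input and echoes it back.
via_pipe=$(echo "$demo_dir/a.txt" | cat)

# With xargs: the name becomes an argument, so cat opens the file itself.
via_xargs=$(echo "$demo_dir/a.txt" | xargs cat)

echo "via pipe:  $via_pipe"
echo "via xargs: $via_xargs"
rm -rf "$demo_dir"
```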

Try It Out Using grep and xargs

Let's try out some grep searches and combine them with the xargs and find search utilities.

The grep command takes the string or expression to match as the first argument followed by one or more files to search. In the following example, grep searches for the string nobody in the files /etc/passwd and /etc/group:

   $ grep nobody /etc/passwd /etc/group
   /etc/passwd:nobody:x:99:99:Nobody:/:/sbin/nologin
   /etc/passwd:nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
   /etc/group:nobody:x:99:
   /etc/group:nfsnobody:x:65534:

The grep command prints the name of each file in which the string was found, followed by a colon character (:) and the line in which the string occurs.

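The -i option causes grep to search for the string or expression ignoring case. In the next example, the line corresponding to the Deepak Thomas user is displayed as a match because the string Deepak Thomas matches a case-insensitive search for the expression de*. Note that grep treats its first argument as a regular expression rather than a shell wildcard: de* means the letter d followed by zero or more occurrences of e, so any line containing an uppercase or lowercase d would match.

   $ grep -i "de*" /etc/passwd
   deepakt:x:501:100:Deepak Thomas:/home/deepakt:/bin/bash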
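To check how broadly de* matches, the following sketch runs grep against a three-line sample file (the lines are taken from or modeled on the /etc/passwd entries shown in this section) and compares it with an anchored expression.

```shell
sample=$(mktemp)
printf '%s\n' \
    'deepakt:x:501:100:Deepak Thomas:/home/deepakt:/bin/bash' \
    'nobody:x:99:99:Nobody:/:/sbin/nologin' \
    'root:x:0:0:root:/root:/bin/bash' > "$sample"

# -c counts matching lines. "de*" matches any line containing d or D.
loose=$(grep -c -i "de*" "$sample")

# "^de" matches only lines that begin with the two letters d and e.
anchored=$(grep -c -i "^de" "$sample")

echo "de* matched $loose lines; ^de matched $anchored"
rm -f "$sample"
```

The nobody line matches de* purely because it contains the letter d, which is probably not what someone typing de* as a wildcard would expect.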

Finally, look at the combination of the find, xargs, and grep commands to search for files in the /etc directory that contain the string farfalla (which, in this case, happens to be the name of the machine the command is run on):

   $ find /etc -type f 2>/dev/null | xargs grep farfalla 2>/dev/null
   /etc/sysconfig/networking/profiles/default/network:HOSTNAME=farfalla
   /etc/sysconfig/network:HOSTNAME=farfalla
   /etc/hosts:127.0.0.1    farfalla        localhost

The find command locates all the files in the /etc directory. This list is piped as the input to the xargs command. The xargs command in this case has one argument, the grep command. The grep command also has an argument, that being the string to search for. Internally, the xargs command constructs a command line that looks like this:

 grep farfalla file1 file2 ... filen 

Here, file1, file2, ..., filen are the results of the search by the find utility.
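One caveat with this construction is that xargs splits its input on whitespace, so a file name containing a space would be passed to grep as two separate arguments. A common workaround, sketched here under the assumption that your find and xargs support the -print0 and -0 options (GNU and BSD versions do; they are not part of POSIX), is to separate names with a null character instead. The directory contents below are invented.

```shell
demo_dir=$(mktemp -d)
echo "HOSTNAME=farfalla" > "$demo_dir/network settings"   # note the space in the name
echo "HOSTNAME=other"    > "$demo_dir/unrelated"

# -print0 ends each name with a null byte; xargs -0 splits only on those bytes.
# grep -l prints just the names of files that contain a match.
matches=$(find "$demo_dir" -type f -print0 | xargs -0 grep -l farfalla)

echo "$matches"
rm -rf "$demo_dir"
```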

Other more sophisticated text searching tools exist. For example, the awk command is considered to be a language in itself, and allows for rich expressions and control flow. Another command, sed (the stream editor), is intended to transform input text streams, typically files or the output of commands.

The following awk script prints the login names and the actual names of users on the local system:

   $ awk -F: '{print $1, "-", $5}' /etc/passwd
   root - root
   bin - bin
   daemon - daemon
   ftp - FTP User
   nobody - Nobody
   rpc - Portmapper RPC user
   vcsa - virtual console memory owner
   nscd - NSCD Daemon
   sshd - Privilege-separated SSH
   nfsnobody - Anonymous NFS User
   xfs - X Font Server
   ident - pident user
   apache - Apache
   webalizer - Webalizer
   deepakt - Deepak Thomas

The -F option here indicates the field separator. In this case, it is the colon (:) character that is used to separate fields in the /etc/passwd file. The awk action prints the first and fifth columns of the file to the standard output (that is, the terminal), with a hyphen (-) character separating column entries on each line.
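An awk action can also be preceded by a condition, so that it runs only for lines that satisfy it. The sketch below works on a two-line sample rather than the real /etc/passwd; the threshold of 500 for ordinary user IDs is an assumption made for illustration, based only on the 501 seen in the sample output.

```shell
sample=$(mktemp)
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'deepakt:x:501:100:Deepak Thomas:/home/deepakt:/bin/bash' > "$sample"

# Fields are numbered from 1: $1 is the login name, $5 the full name.
names=$(awk -F: '{print $1, "-", $5}' "$sample")

# A condition before the braces limits the action to matching lines;
# here, lines whose third field (the user ID) is at least 500 (assumed cutoff).
regular=$(awk -F: '$3 >= 500 {print $1}' "$sample")

echo "$names"
echo "regular users: $regular"
rm -f "$sample"
```

The single quotes around the awk program matter: inside double quotes the shell would try to expand $1 and $5 itself before awk ever saw them.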

The following sed command line transforms a string in the seashells.txt file. The expression 's;sells;shells;' indicates that the pattern sells should be substituted with the pattern shells (the letter s indicates substitution, and the semicolons separate the parts of the expression):

   $ cat >seashells.txt
   She sells sea shells on the sea shore
   ^D
   $ sed -e 's;sells;shells;' seashells.txt
   She shells sea shells on the sea shore

The commands awk and sed do not modify the input file directly. Instead, they treat the file as a stream of input and write the output to the standard output (in this case, the terminal). As shown in the next example, you can capture the output to a file as well; output redirection using the > operator is dealt with in detail in Chapter 6. For now, you need only to understand that the > operator causes the output of the command before it to be written to the file named after it:

   $ sed -e 's;sells;shells;' seashells.txt > seashells2.txt
   $ cat seashells2.txt
   She shells sea shells on the sea shore
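By default, a sed substitution replaces only the first match on each line; adding the g flag after the final delimiter replaces every match. A brief sketch using the same sentence, this time with the more conventional slash as the delimiter (the replacement word ocean is chosen just for the demonstration):

```shell
line="She sells sea shells on the sea shore"

# Without a flag, only the first "sea" on the line is replaced.
first_only=$(echo "$line" | sed -e 's/sea/ocean/')

# With the g (global) flag, every "sea" on the line is replaced.
every_match=$(echo "$line" | sed -e 's/sea/ocean/g')

echo "$first_only"
echo "$every_match"
```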



Beginning Fedora 2
ISBN: 0764569961
EAN: 2147483647
Year: 2006
Pages: 170
