# Chapter 27. Finding Stuff

Introduction

Regular Expressions

Related Files

Commands

### Introduction

This section is an amalgamation of some very different programs, which are united under the eponymous purpose of "finding stuff," but work in quite different ways.

Generally speaking, find (the biggie) searches the directory tree for a file or files specified on the command line. This happens to present a golden opportunity for hard-core programming, in that it is a real-world example of a classic computer science problem. As such, the find command has quite a few options and is perhaps more complex than necessary. For those of you simply wishing to learn the location of a particular file in a timely and painless manner, I recommend locate.

### Regular Expressions

Most of the search mechanisms listed in this chapter use regular expressions in one form or another. A regular expression is a mathematical mechanism for specifying the ordering of symbols. Historically, regular expressions originated from discussions of the Theory of Computation, a seriously hard-core branch of mathematics that is far removed from the day-to-day rigors of, say, locating the smutty email to your girlfriend that you misplaced in some gargantuan file system at work.

However, the same principles govern both tasks. To get full value for your Linux dollar, you will need some understanding of regular expressions and how they are used to specify search patterns.

Regular expressions exist to give you a mechanism to specify patterns of characters. The implementation of a regular expression includes three classes of characters:

- literals: The literal character you typed in (a, b, c…, 1, 2, 3…, etc.).
- wildcards: Special characters used to represent characters other than themselves. The "." character matches one instance of any character ("d.te" matches "date" and "dote"). The "*" character matches any number of repetitions (including none) of the item before it, so "d.*" matches "date", "day", "dally", and anything else containing the letter d. (In shell filename patterns, such as those used by find -name and locate, a "*" by itself matches any number of any characters, so "d*" matches any name starting with d.)
- metacharacters: Characters that have a special meaning. For example, the caret character "^" usually matches the beginning of a line. The "\$" character matches the end of a line.

In addition, it is possible to specify groups of characters. For example, the regular expression "^[aAbB].*" would match any line of any length that started with the letter a or b, either uppercase or lowercase. (This group is delimited by square brackets. As a shell filename pattern, the same match is written "[aAbB]*".) See the grep entry for more information about the implementation of regular expressions.
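To make the three classes of characters concrete, here is a small sketch that feeds a made-up word list to grep. The word list and the variable names are invented for illustration; only the patterns come from the discussion above.

```shell
# Sample words to filter, one per line (hypothetical data).
words='date
dote
day
dally
update
Bob
apple'

# "d.te": a literal d, any single character, then the literals "te".
dot_matches=$(printf '%s\n' "$words" | grep 'd.te')

# "^d": the caret anchors the match to the beginning of the line.
anchored=$(printf '%s\n' "$words" | grep '^d')

# "^[aAbB]": a bracketed group of characters at the beginning of the line.
grouped=$(printf '%s\n' "$words" | grep '^[aAbB]')

printf '%s\n' "$dot_matches" "$anchored" "$grouped"
```

Note that the unanchored pattern "d.te" also picks up "update", because grep looks for the pattern anywhere in the line unless an anchor says otherwise.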

The commands covered in this section include

- egrep: Grep with extended regular expressions
- find: Search the directory tree
- finger: Display information about a user
- fgrep: Grep variation for matching fixed strings
- grep: Search for a pattern in a file
- locate: Search the locate database for a file
- updatedb: Update the locate database
- which: Search the directories of your \$PATH for a file

### Related Files

 /var/lib/locatedb Default locate database.

### Commands

find

[path …] [expression]

The find command searches the directory tree specified by the path argument for the pattern(s) indicated in the expression argument.

The expression consists of options, tests, and actions. Options affect overall operation rather than the processing of a specific file. Tests return a true or false value based on evaluation of some condition. Actions have side effects and return a true or false value.

Example: To search the entire system for a file named "abcd.txt", use

find / -name abcd.txt -print

Example: To search all the files in the directory tree of user "someguy" for the string "blah", use:

find /home/someguy -type f -print | xargs grep -n "blah"

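A common mistake with the find command is assuming that every version displays output by default. GNU find (the version shipped with Linux) assumes -print when the expression contains no action, but some older UNIX versions print nothing unless told to. If you want to be sure of seeing what find found, specify one of the various print actions explicitly (e.g., -print).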
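The find-plus-xargs pipeline shown above trips over file names that contain spaces. A hedged variation, assuming your xargs supports the -0 option (GNU and BSD versions do), pairs it with find's -print0 action. The temporary directory and file names here are invented for the demonstration.

```shell
# Build a throwaway tree; one file name contains a space.
demo=$(mktemp -d)
printf 'blah one\n' > "$demo/plain.txt"
printf 'nothing\nblah two\n' > "$demo/with space.txt"

# -print0 and "xargs -0" separate names with null characters, so names
# containing whitespace reach grep intact.
hits=$(find "$demo" -type f -print0 | xargs -0 grep -n "blah" | sort)
printf '%s\n' "$hits"

# Clean up the throwaway tree.
rm -rf "$demo"
```

Each output line has the form file:line-number:text, because grep was handed more than one file and was asked for line numbers with -n.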

- `-daystart`: Measure times relative to the start of the current calendar day rather than 24 hours ago.
- `-depth`: Process the contents of each directory before the directory itself.
- `-follow`: Follow any symbolic links encountered in the search.
- `-help`, `--help`: Display a help summary and exit.
- `-maxdepth levels`: Descend at most the specified number of levels below the starting point.
- `-mindepth levels`: Do not apply tests or actions to anything shallower than the specified depth.
- `-mount`: Do not traverse directories on other filesystems.
- `-noleaf`: Do not speed up the search by assuming that every directory contains two fewer subdirectories than its hard-link count (the "." and ".." entries). Used in searching CD-ROM and MS-DOS filesystems.
- `-version`, `--version`: Display the version number and exit.
- `-xdev`: Do not traverse directories on other filesystems (same as -mount).

Tests

Numeric arguments can be specified as follows:

- `+n`: greater than n
- `-n`: less than n
- `n`: exactly n

- `-amin n`: File was last accessed n minutes ago.
- `-anewer file`: File was last accessed more recently than the specified file was modified.
- `-atime n`: File was last accessed n*24 hours ago.
- `-cmin n`: File's status was last changed n minutes ago.
- `-cnewer file`: File's status was last changed more recently than file was modified. -cnewer is affected by -follow only if -follow comes before -cnewer on the command line.
- `-ctime n`: File's status was last changed n*24 hours ago.
- `-empty`: File is empty and is either a regular file or a directory.
- `-false`: Always false.
- `-fstype type`: File is on a filesystem of the specified type.
- `-gid n`: File's numeric group ID is n.
- `-group gname`: File belongs to group gname (a numeric group ID is also allowed).
- `-ilname pattern`: Like -lname, but the match is case insensitive.
- `-iname pattern`: Like -name, but the match is case insensitive.
- `-inum n`: File has inode number n.
- `-ipath pattern`: Like -path, but the match is case insensitive.
- `-iregex pattern`: Like -regex, but the match is case insensitive.
- `-links n`: File has n links.
- `-lname pattern`: File is a symbolic link whose contents match the specified pattern.
- `-mmin n`: File's data was last modified n minutes ago.
- `-mtime n`: File's data was last modified n*24 hours ago.
- `-name pattern`: File's basename matches the specified shell pattern.
- `-newer file`: File was modified more recently than the specified file.
- `-nouser`: No user corresponds to file's numeric user ID.
- `-nogroup`: No group corresponds to file's numeric group ID.
- `-path pattern`: File's name, including leading directories, matches the specified shell pattern.
- `-perm mode`: File's permission bits are exactly mode.
- `-perm -mode`: All of the permission bits mode are set for the file.
- `-perm +mode`: Any of the permission bits mode are set for the file. (Newer GNU find releases spell this -perm /mode.)
- `-regex pattern`: File's name matches the specified regular expression.
- `-size n[bckw]`: File uses n units of space (b = 512-byte blocks, c = bytes, k = kilobytes, w = 2-byte words).
- `-true`: Always true.
- `-type c`: File is of type c, which is one of the following:
  - `b`: block (buffered) special
  - `c`: character (unbuffered) special
  - `d`: directory
  - `p`: named pipe (FIFO)
  - `f`: regular file
  - `l`: symbolic link
  - `s`: socket
- `-uid n`: File's numeric user ID is n.
- `-used n`: File was last accessed n days after its status was last changed.
- `-user uname`: File is owned by user uname.
- `-xtype c`: The same as -type unless the file is a symbolic link. If the file is a symbolic link and the -follow option is not set, then true if the file is a link to a file of type c. If the -follow option is set, then true if the specified type is l.
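The +n, -n, and n forms are easiest to see with -size, since file sizes are simple to control. The following is a sketch only: the directory and file names are invented, and it assumes a head command that accepts -c (GNU and BSD versions do).

```shell
# Create files of 10, 100, and 1000 bytes in a throwaway directory.
demo=$(mktemp -d)
head -c 10   /dev/zero > "$demo/small.dat"
head -c 100  /dev/zero > "$demo/medium.dat"
head -c 1000 /dev/zero > "$demo/large.dat"

# n means exactly n, +n means more than n, -n means less than n.
# The "c" suffix counts in bytes.
exact=$(find "$demo" -type f -size 100c -print)
bigger=$(find "$demo" -type f -size +100c -print)
smaller=$(find "$demo" -type f -size -100c -print)
printf '%s\n' "$exact" "$bigger" "$smaller"

# Clean up the throwaway directory.
rm -rf "$demo"
```

The same three forms apply to the time tests, so -mtime +7 means "modified more than 7*24 hours ago" and -mtime -7 means "modified less than 7*24 hours ago".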

Actions

- `-exec command ;`: Execute the specified command. The semicolon ";" indicates termination of command's argument set (escape or quote it to protect it from the shell), and the string "{}" in the command is replaced by the name of the current file.
- `-fls file`: True; like -ls, but write to file the same way as -fprint.
- `-fprint file`: True; redirect file name output into the specified file.
- `-fprint0 file`: True; like -print0, but write to file the same way as -fprint.
- `-fprintf file format`: True; like -printf, but write to file the same way as -fprint.
- `-ok command ;`: Like the -exec action, but prompt before running.
- `-print`: Send filenames to standard output.
- `-print0`: Send filenames to standard output terminated with a null character.
- `-printf format`: Use the specified format when printing to standard output. Recognizes the following "\" escapes and "%" directives:
  - `\a`: Alarm bell.
  - `\b`: Backspace.
  - `\c`: Stop printing and flush output immediately.
  - `\f`: Form feed.
  - `\n`: Newline.
  - `\r`: Carriage return.
  - `\t`: Horizontal tab.
  - `\v`: Vertical tab.
  - `\\`: A literal backslash ("\").
  - `%%`: A literal percent sign.
  - `%a`: File's last access time.
  - `%Ak`: File's last access time in the format specified by k, which is either "@" (seconds since Jan. 1, 1970, 00:00 GMT) or a directive for the C "strftime" function, taken from the time and date fields listed next.

Time Fields

- `H`: hour (00..23)
- `I`: hour (01..12)
- `k`: hour (0..23)
- `l`: hour (1..12)
- `M`: minute (00..59)
- `p`: locale's a.m. or p.m.
- `r`: time, 12 hour (hh:mm:ss [AP]M)
- `S`: second (00..61)
- `T`: time, 24 hour (hh:mm:ss)
- `X`: locale's time representation (H:M:S)
- `Z`: time zone (e.g., EDT)

Date Fields

- `a`: weekday name abbreviations (Sun., Mon….)
- `A`: weekday name (Sunday, Monday…)
- `b`: month name abbreviations (Jan., Feb….)
- `B`: full month name (January, February…)
- `c`: date and time (Sat Nov 04 12:02:33 EST 1989)
- `d`: day of month (01..31)
- `D`: date (mm/dd/yy)
- `h`: same as b
- `j`: day of year (001..366)
- `m`: month (01..12)
- `U`: week number of year with Sunday as first day of week (00..53)
- `w`: day of week (0..6)
- `W`: week number of year with Monday as first day of week (00..53)
- `x`: locale's date representation (mm/dd/yy)
- `y`: last two digits of year (00..99)
- `Y`: year (1970…)

The remaining -printf directives are:

- `%b`: File size in blocks.
- `%c`: Last file status change time.
- `%Ck`: File's last status change time in the format specified by k.
- `%d`: File's depth in the directory tree.
- `%f`: File name with leading directories removed.
- `%F`: Type of the filesystem where the file is located.
- `%g`: File's group name.
- `%G`: File's numeric group ID.
- `%h`: Leading directories of file's name.
- `%H`: Command line argument under which the file was found.
- `%i`: File's inode number (in decimal).
- `%k`: File's size in 1K blocks.
- `%l`: Object of symbolic link (empty if file is not a symbolic link).
- `%m`: File's permission bits (in octal).
- `%n`: Number of hard links to file.
- `%p`: File's name.
- `%P`: File's name minus the command line argument under which it was found.
- `%s`: File's size in bytes.
- `%t`: File's last modification time.
- `%Tk`: File's last modification time in the format specified by k. (See %A above for format details.)
- `%u`: File's user name.
- `%U`: File's numeric user ID.

The remaining actions are:

- `-prune`: If -depth is not given, true; do not descend the current directory. If -depth is given, false; no effect.
- `-ls`: True; list the current file in "ls -dils" format on standard output.
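Two of the actions above, -exec and -prune, tend to confuse people until they have seen them run. Here is a minimal sketch on an invented directory tree; the directory names (src, cache) and file names are purely illustrative.

```shell
# Build a throwaway tree with two subdirectories.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/cache"
printf 'abc' > "$demo/src/a.log"
printf 'abcdef' > "$demo/cache/b.log"

# -exec runs the command once per matching file. "{}" is replaced by the
# file name, and the escaped semicolon ends the command's argument list.
contents=$(find "$demo/src" -type f -name '*.log' -exec cat {} \;)

# -prune keeps find from descending into "cache"; the -o (or) hands
# everything else to the tests that follow.
kept=$(find "$demo" -name cache -prune -o -type f -name '*.log' -print)
printf '%s\n' "$contents" "$kept"

# Clean up the throwaway tree.
rm -rf "$demo"
```

In the -prune command the explicit -print matters: it sits on the right-hand side of the -o, so the pruned directory itself is not printed.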

Operators

Listed in order of decreasing precedence:

- `( expr )`: Force precedence.
- `! expr`, `-not expr`: Logical negation; true if expr is false.
- `expr1 expr2`, `expr1 -a expr2`, `expr1 -and expr2`: Logical and. expr2 is not evaluated if expr1 is false.
- `expr1 -o expr2`, `expr1 -or expr2`: Logical or. expr2 is not evaluated if expr1 is true.
- `expr1 , expr2`: List; both expr1 and expr2 are always evaluated. The value of expr1 is discarded; the value of the list is the value of expr2.
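The operators are easier to follow in a worked command. This sketch, using invented file names, combines grouping, "or", and negation. One assumption to flag: the parentheses are escaped with backslashes because most shells would otherwise interpret them before find ever sees them.

```shell
# Create a few empty files in a throwaway directory.
demo=$(mktemp -d)
touch "$demo/a.c" "$demo/b.h" "$demo/c.txt" "$demo/d.c"

# Match names ending in .c or .h, but exclude d.c.
# Adjacent expressions are joined by an implied "and".
picked=$(find "$demo" -type f \( -name '*.c' -o -name '*.h' \) ! -name 'd.c' -print | sort)
printf '%s\n' "$picked"

# Clean up the throwaway directory.
rm -rf "$demo"
```

Without the parentheses, the implied "and" would bind more tightly than -o, and the expression would be read as "(-type f and -name '*.c') or (-name '*.h' and not d.c and print)", which prints only the .h file.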

finger

[-lmsp] [user …] [user@host …]

The finger command displays information about system users. If no argument is specified, finger will print out information on all users currently logged in. User may be a remote user; if so, use the "user@host" style of specification.

Example: To get information about the login status of user "jlevy", use

finger jlevy@diana.gov

 Information stored in the file .plan in your home directory is printed to the screen whenever anyone fingers you. This was originally included so that you could keep fellow users up to the minute on your current doings (e.g., "In important meeting", "At lunch, back in an hour."). In actual practice, no one bothers to keep his or her .plan up to date and the file is almost universally used as a repository for obscure quotations.

- `-s`: Display user's login name, real name, terminal name, write status, idle time, login time, office location, and office phone number.
- `-l`: Long list. Display home directory; phone number; login shell; mail status; and .plan, .project, and .forward files.
- `-p`: Do not display .plan and .project files.
- `-m`: Match the user argument against login names only, not against users' real names.

grep

[-[AB] NUM] [-CEFGVbchiLlnqsvwxyUu] [-e PATTERN | -f FILE] [--extended-regexp] [--fixed-strings] [--basic-regexp] [--regexp=PATTERN] [--file=FILE] [--ignore-case] [--word-regexp] [--line-regexp] [--no-messages] [--revert-match] [--version] [--help] [--byte-offset] [--line-number] [--with-filename] [--no-filename] [--quiet] [--silent] [--files-without-match] [--files-with-matches] [--count] [--before-context=NUM] [--after-context=NUM] [--context] [--binary] [--unix-byte-offsets] files…

grep (named for the ed editor command g/re/p, "globally search for a regular expression and print") searches through the input set for any matches to the specified pattern and (by default) outputs any matching lines. The command may also be invoked as egrep (a.k.a. grep -E) or fgrep (a.k.a. grep -F).

Example: To output, with line numbers, every line containing "cha" in the C files in the current directory, use

grep -n cha *.c

Grep has three modes of use, as specified by the following options:

- `-G`, `--basic-regexp`: Interpret the specified pattern as a basic regular expression.
- `-E`, `--extended-regexp`: Interpret the specified pattern as an extended regular expression.
- `-F`, `--fixed-strings`: Interpret the specified pattern as a list of fixed strings to be matched.

The grep family accepts the following options:

- `-NUM`: Include the specified number of lines of leading and trailing context.
- `-A NUM`: Include the specified number of lines of trailing context.
- `-B NUM`: Include the specified number of lines of leading context.
- `-b`, `--byte-offset`: Include byte offset in any output.
- `-c`, `--count`: Instead of matching lines, output a count of matching lines.
- `-e PATTERN`, `--regexp=PATTERN`: Use the specified pattern (including those beginning with "-").
- `-f FILE`, `--file=FILE`: Use the specified FILE as the pattern source.
- `-h`, `--no-filename`: Do not include filenames in output.
- `-i`, `--ignore-case`: Treat uppercase and lowercase letters as equivalent.
- `-L`, `--files-without-match`: Output only the names of files that contain no matches to PATTERN.
- `-l`, `--files-with-matches`: Output only the names of files that contain matches to PATTERN.
- `-n`, `--line-number`: Include line numbers in any output.
- `-q`, `--quiet`, `--silent`: Output nothing; halt if a match is found.
- `-s`, `--no-messages`: Do not output error messages about bad files.
- `-v`, `--revert-match`: Output only those lines that do NOT match the specified pattern. (Current GNU grep spells the long form --invert-match.)
- `-w`, `--word-regexp`: Output only those lines that match whole words.
- `-x`, `--line-regexp`: Output only full line matches.
- `-y`: Synonym for -i.
- `-U`, `--binary`: Treat all files as binary.
- `-u`, `--unix-byte-offsets`: Output UNIX-style byte offsets.
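A quick tour of the most commonly used options from this list, as a sketch on two invented source files. The file names and contents are made up for illustration; only the short option letters, which have been stable across grep versions, are used.

```shell
# Two small files in a throwaway directory.
demo=$(mktemp -d)
printf 'char c;\nint chart;\nCHAR d;\n' > "$demo/one.c"
printf 'int x;\n' > "$demo/two.c"

count=$(grep -c 'char' "$demo/one.c")      # count of lines containing "char"
nocase=$(grep -ci 'char' "$demo/one.c")    # same count, ignoring case
words=$(grep -nw 'char' "$demo/one.c")     # whole-word matches, with line numbers
inverse=$(grep -v 'char' "$demo/one.c")    # lines that do NOT match

# -l lists files with a match, -L lists files without one.
with=$(cd "$demo" && grep -l 'char' one.c two.c)
without=$(cd "$demo" && grep -L 'char' one.c two.c)

printf '%s\n' "$count" "$nocase" "$words" "$inverse" "$with" "$without"

# Clean up the throwaway directory.
rm -rf "$demo"
```

The -w run skips "chart" because the match must be bounded by non-word characters on both sides.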

Grep recognizes the following in pattern specification:

Specifying Character Classes

- `[:alnum:]`: Alphanumeric characters [0-9A-Za-z]
- `[:alpha:]`: Alphabetic characters [A-Za-z]
- `[:cntrl:]`: Control characters
- `[:digit:]`: Digits
- `[:graph:]`: Graphic characters
- `[:lower:]`: Lowercase characters
- `[:print:]`: Printable characters
- `[:punct:]`: Punctuation characters
- `[:space:]`: Whitespace (space, tab…)
- `[:upper:]`: Uppercase characters
- `[:xdigit:]`: Hexadecimal digits

Specifying Position

- `.`: Matches any single character
- `^`: The beginning of a line
- `$`: The end of a line
- `\<`: Beginning of a word
- `\>`: End of a word
- `\b`: Empty string at edge of a word
- `\B`: Empty string not at edge of a word

Specifying Pattern Repetition

- `?`: Match the preceding item at most once.
- `*`: Match the preceding item any number of times (including none).
- `+`: Match the preceding item one or more times.
- `{n}`: Match the preceding item exactly n times.
- `{n,}`: Match the preceding item n or more times.
- `{,m}`: Match the preceding item 0 to m times.
- `{n,m}`: Match the preceding item at least n, but no more than m, times.
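The repetition operators and character classes can be tried out in a few lines. This is a sketch on an invented word list. It uses grep -E so that "?" and "{n,m}" carry their special meaning without backslashes, and it wraps the class name in a second pair of brackets, since a class such as [:digit:] is only recognized inside a bracket expression.

```shell
# Sample input, one item per line (hypothetical data).
lines='ab
abb
abbb
a
x9
colour
color'

# {2,3}: "b" repeated two or three times; -x demands a full-line match.
repeat=$(printf '%s\n' "$lines" | grep -E -x 'ab{2,3}')

# ?: the "u" may appear at most once, so both spellings match.
optional=$(printf '%s\n' "$lines" | grep -E 'colou?r')

# [[:digit:]]: any line containing a digit.
digits=$(printf '%s\n' "$lines" | grep '[[:digit:]]')

printf '%s\n' "$repeat" "$optional" "$digits"
```

Dropping the -x from the first command would also let "abbb" match on its first three characters alone, which is why full-line matching is used to show the upper bound at work.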

locate

[-d path][--database=path][--version][--help] pattern…

Locate searches a database of system files and locations for the specified pattern. The pattern specification can include shell metacharacters (*, ?, and []).

Example: To find the file abcd.txt on any directory indexed by the updatedb command, use

locate abcd.txt

Locate is a quick and simple alternative to the find command, but you need to update the database regularly with updatedb.

- `-d path`, `--database=path`: Tell locate to search the specified database for the pattern, rather than the default database.
- `--help`: Print a summary of the options to locate and exit.
- `--version`: Print the version number of locate and exit.

updatedb

[options]

This command updates the database of file names and locations used by the locate command.

Example: To update the default database file, use

updatedb

 Set your cron to run updatedb every so often.

- `--localpaths='path1 path2...'`: Specify the (nonnetwork) directories to be included in the database.
- `--netpaths='path1 path2...'`: Specify the network directories to be included in the database.
- `--prunepaths='path1 path2...'`: Specify directories to be excluded from the database.
- `--output=dbfile`: Specify the database file to be built (typically, /usr/local/var/locatedb).
- `--netuser=user`: Specify the user identity to be used when searching network directories.
- `--old-format`: Specify that the database will be created in the old format.
- `--version`: Display version information and exit.
- `--help`: Display help information and exit.

which

progname …

Searches the user's path for the specified program and prints its full pathname, if found.

Example: To find the location in the directory tree of the ls command, use

which ls