Project 17. Get Clever Finding Files

"What files have I modified today, and do I have any larger than 20 MB?"

This project shows you how to search the file system for specific files based on file type, size, timestamp, and permissions. It uses find to find files and perform simple processing on them. Project 15 uses locate and find to search for files by name. Project 18 shows how to process the files that were found. Project 20 gives some handy find tips.

Search Criteria

The find command recursively searches an entire directory structure for files that match a given set of criteria. The criteria covered in this project are based on filename and pathname, file type, size, time stamp, owner and associated group, and permissions.
The find command generates a list of files that match the specified criteria. Sometimes this is all you want, but often you'll need to process either the list itself or each file named in the list. You can do this by processing the list itself, by using find's own primaries, or by handing each file to an external command with -exec or xargs.

Find Criteria

When working with Unix files and directories, it's often helpful (or necessary) to identify files that meet one or more criteria, either simply to locate desired information or content, or as the first step in a process that compares, sorts, or otherwise operates on files based on those criteria. The key to many criteria-based searches is in the options of the find command.

Find by Filename and Pathname

Refer to Project 15, which shows you how to search the file system for specific files based on filename and pathname.

Find by Type

The find command normally considers all files, but if you want to limit the search to a specific type of file, use the primary -type. The type can be a regular file, a directory, a symbolic link, or a special type such as a socket.

To search for directories only (-type d), type

$ find . -type d -iname test
./test
./Trial/One/Test
./Trial2/version1/test

To search for files only (-type f), type

$ find . -type f -iname test
./test/test
./Trial2/test
./Trial2/version1/test/a/test

Find by Size

The find command can search for files whose size is equal to, greater than, or less than a given size. Precede the given size with a plus sign to search for files bigger than that size and a minus sign to search for files smaller than that size. The size is specified as a number of 512-byte blocks. To give some examples:

To find all pictures bigger than 10 MB (20480 blocks of 512 bytes), we would type

$ find ~/Pictures -size +20480

To find all empty files, we could use either of the following.

$ find . -size 0
$ find . -size -1
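
The block arithmetic behind these size values can be checked with a short sketch. It is only an illustration: it assumes a scratch directory created with mktemp, and the file names big.bin and empty.bin are hypothetical, not taken from the text.

```shell
# Scratch directory so the example does not touch real files (assumption).
dir=$(mktemp -d)

# find counts size in 512-byte blocks, so 10 MB is 10 * 1024 * 1024 / 512 blocks.
blocks_10mb=$((10 * 1024 * 1024 / 512))
echo "$blocks_10mb"

# Hypothetical test files: one of 3 blocks (1536 bytes) and one empty.
dd if=/dev/zero of="$dir/big.bin" bs=512 count=3 2>/dev/null
: > "$dir/empty.bin"

# Files larger than 2 blocks, then files of zero blocks.
bigger=$(find "$dir" -type f -size +2)
empty=$(find "$dir" -type f -size 0)
echo "$bigger"
echo "$empty"

rm -rf "$dir"
```

Using small files keeps the check quick; the same plus and minus prefixes apply unchanged to a value such as +20480.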

For teensy-weensy files, for which blocks are too coarse a measure, specify the size in bytes (characters) by appending c to the size. Here, we find all files of exactly 19 bytes.

$ find . -size 19c -ls
996343 8 -rw-r--r-- 1 saruman saruman ... ./Im19Bytes

The primary -ls tells find to list the file's details instead of just its name.

Find by Timestamp

The find command can search for files based on their time stamp, either the time of last access or the time of last modification.

The find command works in units of time. A unit can be either a minute or a day (24 hours). Time is calculated as the difference between the time stamp of a file and the time find itself was started. The difference is rounded up to the next unit when testing for equality (last modified one minute ago) but is not rounded otherwise (last modified less than one minute ago). Therefore, we can specify criteria such as "last accessed one minute ago," "last modified less than two days ago," or "last modified more than seven days ago."

The primaries are -amin, -mmin, -atime, and -mtime. a means access time, and m means modification time; min means units of minutes, and time means units of days. As with size, a value preceded by a plus sign means more units than the amount specified, and a value preceded by a minus sign means fewer units than the amount specified.

As an example, let's create two files and check that they were both modified/created less than one minute ago. The primary -mmin -1 is formed by m, meaning modified, and min, meaning units of minutes. The value -1 means a time less than one unit before find was invoked.

$ touch f-mod f-access
$ find . -mmin -1
./f-access
./f-mod

Repeat the find command until it reports no files; then proceed. (If you're a very slow typist, change the time period to two minutes.)

Now let's use the command touch -a to access (but not modify) file f-access and thereby change its access time stamp. Then we'll check to see which files were accessed, and which files were modified, less than one minute ago.

$ touch -a f-access
$ find . -mmin -1
$
$ find . -amin -1
./f-access

Next, we use the command touch -m to change the modification time of the file f-mod (we could have achieved the same thing by editing it, but touching it is easier) and then check which files were accessed, and which files were modified, less than one minute ago.
Depending on how long we take to type the commands, the file f-access may or may not be reported as being accessed less than a minute ago.

$ touch -m f-mod
$ find . -mmin -1
./f-mod
$ find . -amin -1
./f-access
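
The experiment above depends on how fast you type. A more repeatable sketch backdates a file with touch -t instead of waiting. It assumes a scratch directory from mktemp, and the names new.txt and old.txt are hypothetical.

```shell
# Scratch directory so the example does not touch real files (assumption).
dir=$(mktemp -d)

# new.txt gets the current time; old.txt is backdated to 1 Jan 2000
# with touch -t (format CCYYMMDDhhmm). Both names are hypothetical.
touch "$dir/new.txt"
touch -t 200001010000 "$dir/old.txt"

# Modified less than 5 minutes ago: only new.txt.
recent=$(find "$dir" -type f -mmin -5)

# Modified more than 7 days ago: only old.txt.
stale=$(find "$dir" -type f -mtime +7)

echo "recent: $recent"
echo "stale:  $stale"

rm -rf "$dir"
```

The sketch sticks to the plus and minus forms because they do not depend on how an exact value is rounded.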

Finally, wait a few minutes and try again.

$ find . -mmin -1
$ find . -amin -1

Here are a couple more examples.

Find all files in your home directory modified within the last 24 hours, which is useful when you want to perform a daily backup. Because the difference is rounded up when testing for equality, -mtime 1 matches files up to one day old.

$ find ~ -mtime 1

Find files you've forgotten about.

$ find ~ -atime +1000

Find by Owner, Associated Group, and Permissions

The find command accepts search criteria including file owner (or user owner, in Unix terminology), associated group, and permissions. Here are a few examples of how they work.

To find all files owned by the user saruman in the directory /Users/saruman, we use the primary -user as our search criterion.

$ find /Users/saruman -user saruman

(This is user saruman's home directory, so we'll omit the long list of matching files.)

To find all files in the same directory that aren't owned by the user saruman, we use the primary -not to invert the sense of the criterion that follows it (in this case, the primary -user again). Not surprisingly, the results list is much shorter this time.

$ find /Users/saruman -not -user saruman
/Users/saruman/Development/c32-1

Let's find all pictures that are not associated with the group saruman. A group criterion is introduced by the primary -group.

$ find /Users/saruman/Pictures -not -group saruman -ls
871501 6368 -rw-r--r-- 1 saruman admin 320911 Mar 12 09:41 /Users/saruman/Pictures/people/Domi/sledges2photo1.psd
823934 320 -rw-r--r-- 1 saruman admin 123297 Feb 10 19:06 /Users/saruman/Pictures/web-site/jan/home

As you may have noticed, the preceding command uses yet another primary of the find command called -ls. Not to be confused with the Unix command ls, it instructs find to display a file's details instead of just its name.

To specify permissions as search criteria, we use the primary -perm and express permissions in the octal or symbolic formats expected by the command chmod. (If you are unfamiliar with these concepts, refer to Project 8.)

The following examples use find in a directory containing just one file, xxx. Each example uses permissions as search criteria to see whether file xxx matches.
Follow them carefully to understand how the criteria are matched.

First, we'll use ls to display the permissions for file xxx. You'll see that the permissions grant write access to owner saruman and read access to group saruman and everyone else (others):

$ ls -l xxx
--w-r--r-- 1 saruman saruman 0 20 May 23:42 xxx

Our first example seeks files with permissions set exactly as stated in our search criteria. Files will match only if their permission settings match the -perm criteria and all unspecified permissions are unset. The permissions in this example match those of xxx exactly.

$ find . -perm u+w,g+r,o+r
./xxx

If we modify the search criteria, additionally specifying write access to others, find no longer locates file xxx. Its permissions no longer match the search criteria exactly.

$ find . -perm u+w,g+r,o+rw
$

We can find files that have one or more of the stated permissions set by preceding the permissions with a plus sign (+). File xxx now matches again.

$ find . -perm +u+w,g+r,o+r
./xxx

We can find files that have all of the stated permissions set (but may also have others set) by preceding the permissions with a minus sign (-).

$ find . -perm -u+w,g+r,o+r
./xxx

Changing the permissions on file xxx to remove "others read" will cause find to fail. The file no longer has all of the stated permissions.

$ chmod 240 xxx
$ ls -l xxx
--w-r----- 1 saruman saruman 0 20 May 23:42 xxx
$ find . -perm -u+w,g+r,o+r
$

This example will find file xxx because a plus sign means one or more permissions, not all permissions.

$ find . -perm +u+w,g+r,o+r
./xxx

When specifying an exact match or all (but not one or more), you may also write a clause that clears a permission, such as o-r. The find command builds the mode from an empty template, so o-r simply leaves "other read" cleared in that template: an exact match then requires "other read" to be unset, whereas an all match does not test it.

$ find . -perm u+w,g+r,o-r
./xxx
$ find . -perm -u+w,g+r,o-r
./xxx

Use Complex Conditions

The find command lets you combine primaries into expressions by using AND (-and), OR (-or), and NOT (! or -not) operators, thereby creating complex search criteria. You enclose expressions in parentheses, which must be escaped from the shell. Here are some examples:

Find all .html and .ws files in ~/Sites:

$ find ~/Sites -name "*.ws" -or -name "*.html"

Find files modified less than one day ago AND bigger than 5 MB:

$ find . -mtime -1 -and -size +10240

Find files modified more than one day ago AND smaller than 5 MB:

$ find . -mtime +1 -and -size -10240

Find files modified less than one day ago AND bigger than 5 MB, OR modified more than one day ago AND smaller than 5 MB. (Type the following command on one line.)

$ find . \( -mtime -1 -and -size +10240 \) -or \( -mtime +1 -and -size -10240 \)

Note the use of parentheses (escaped from the shell by backslash symbols) to ensure that the AND and OR expressions are evaluated in the correct order: The two ANDs will be evaluated; then their results will be ORed.
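
The effect of grouping can be seen in a small sketch. It assumes a scratch directory from mktemp, and the names a.html, b.ws, c.txt, and d.ws are hypothetical. It relies on find applying AND to adjacent primaries and evaluating AND before OR.

```shell
# Scratch directory so the example does not touch real files (assumption).
dir=$(mktemp -d)
touch "$dir/a.html" "$dir/b.ws" "$dir/c.txt"
mkdir "$dir/d.ws"    # a directory whose name also ends in .ws

# Without parentheses the implied AND binds first:
#   (-type f AND -name "*.html") OR -name "*.ws"
# so the directory d.ws is reported too.
loose=$(find "$dir" -type f -name "*.html" -or -name "*.ws" | sort)

# Escaped parentheses group the OR, so -type f applies to both patterns.
strict=$(find "$dir" -type f \( -name "*.html" -or -name "*.ws" \) | sort)

# -not inverts the single primary that follows it.
others=$(find "$dir" -type f -not -name "*.html" | sort)

echo "$loose"
echo "$strict"
echo "$others"

rm -rf "$dir"
```

The first command shows why parentheses matter as soon as a test such as -type f is meant to apply to every alternative.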

When primaries are grouped in an expression, find assumes by default that AND is the intended operator, so the command that combines the two parenthesized groups can be shortened to

$ find . \( -mtime -1 -size +10240 \) -or \( -mtime +1 -size -10240 \)

Also, find evaluates AND operators before OR operators, allowing us to omit the parentheses too.

$ find . -mtime -1 -size +10240 -or -mtime +1 -size -10240

Process Each File

You may process the list of files returned by find in one of three ways: process the list itself, process the files with find's own primaries, or hand each file to an external command with -exec or xargs.
It's important to be aware of these three different methods and how to realize each. The next three sections illustrate each technique using a simple example.

Process the List

We may want to process the list of filenames by sorting it alphabetically or filtering it with grep. To display each filename that includes the text hello, we can use the following:

$ find ~ -iname "*.txt" | grep -i "hello"
/Users/saruman/Documents/Letters/hello.txt

Process the Files with find

The find command itself has a couple of primaries for processing files directly. The first is -ls to display file details, used in some of the examples above. The second is -delete to delete every file in a list. Naturally, caution is recommended with this one.

Let's delete all files in our home directory that match the pattern *letter.txt. Dry-run the command first.

$ find ~ -iname "*letter.txt"
/Users/saruman/Documents/Letters/my_letter.txt

Now delete the files, and check what's left.

$ find ~ -iname "*letter.txt" -delete
$ find ~ -iname "*letter.txt"
$

Process the Files with -exec and xargs

If we wish to search the contents of each file for the text hello, we must hand each filename off to an external command such as grep. In this example, grep is given a list of files to search. Compare this with the first example, in which the text of the list itself was searched by grep, not the contents of each file in the list. The -i option makes the search case insensitive, so that hello also matches Hello.

$ find ~ -iname "*.txt" | xargs grep -i "hello"
/Users/saruman/Documents/Letters/letter-to-jan.txt:Hello Jan,
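
A plain pipe into xargs splits names at spaces, so a file such as "letter to jan.txt" would be passed to grep as three separate names. The sketch below shows two common ways around this. It assumes a find and xargs that support -print0 and -0 (widely available, but extensions to the standard), a scratch directory from mktemp, and hypothetical file names.

```shell
# Scratch directory so the example does not touch real files (assumption).
dir=$(mktemp -d)
printf 'Hello Jan,\n' > "$dir/letter to jan.txt"    # name contains spaces
printf 'nothing here\n' > "$dir/notes.txt"

# -print0 and xargs -0 separate names with NUL bytes, so spaces survive.
# grep -l prints only the names of files whose contents match.
via_xargs=$(find "$dir" -type f -iname "*.txt" -print0 | xargs -0 grep -il "hello")

# -exec ... {} + hands the names to grep directly, with no pipe involved.
via_exec=$(find "$dir" -type f -iname "*.txt" -exec grep -il "hello" {} +)

echo "$via_xargs"
echo "$via_exec"

rm -rf "$dir"
```

Both forms give grep many filenames per invocation; which one to prefer is largely a matter of taste.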