2.1 Files

Files are central to Unix in ways that are not true for some other operating systems. Commands are executable files, usually stored in standard locations in the directory tree. System privileges and permissions are controlled in large part via access to files. Device I/O and file I/O are distinguished only at the lowest level. Even most interprocess communication occurs via file-like entities. Accordingly, the Unix view of files and its standard directory structure are among the first things a new administrator needs to know about.

Like all modern operating systems, Unix has a hierarchical (tree-structured) directory organization, known collectively as the filesystem.^[1] The base of this tree is a directory called the root directory. The root directory has the special name / (the forward slash character). On Unix systems, all user-available disk space is transparently combined into a single directory tree under /, and the physical disk a file resides on is not part of a Unix file specification. We'll discuss this topic in more detail later in this chapter.

^[1] Or file system; the two forms refer to the same thing. To make things even more ambiguous, these terms are also used to refer to the collection of files on an individual formatted disk partition.

Access to files is organized around file ownership and protection. Security on a Unix system depends to a large extent on the interplay between the ownership and protection settings on its files and the system's user account and group^[2] structure (as well as factors like physical access to the machine). The following sections discuss the basic principles of Unix file ownership and protection.

^[2] On Unix systems, individual user accounts are organized into groups. Groups are simply collections of users, defined by the entries in /etc/passwd and /etc/group. The mechanics of defining groups and designating users as members of them are described in Chapter 6. Using groups effectively to enhance system security is discussed in Chapter 7.

2.1.1 File Ownership

Unix file ownership is a bit more complex than it is under some other operating systems. You are undoubtedly familiar with the basic concept of a file having an owner: typically, the user who created it and has control over it. On Unix systems, files have two owners: a user owner and a group owner. What is unusual about Unix file ownership is that these two owners are decoupled. A file's group ownership is independent of the user who owns it. In other words, although a file's group owner is often, perhaps even usually, the same as the group its user owner belongs to, this is not required. In fact, the user owner of a file need not even be a member of the group that owns it. There is no necessary connection between them at all. In such a case, when file access is specified for a file's group owner, it applies to members of that group and not to other members of its user owner's group, who are treated simply as part of "other": the rest of the world.

The motivation behind this group ownership of files is to allow file protections and permissions to be organized according to your needs. The key point here is flexibility. Because Unix lets users be in more than one group, you are free to create groups as you need them. Files can be made accessible to almost completely arbitrary collections of the system's users. Group file ownership means that giving someone access to an entire set of files and commands is as simple as adding her to the group that owns them; similarly, taking access away from someone else involves removing her from the relevant group.

To consider a more concrete example, suppose user chavez, who is in the chem group, needs access to some files usually used by the physics group. There are several ways you can give her access:

Make copies of the files for her. If they change, however, her copies will need to be updated. And if she needs to make changes too, it will be hard to avoid ending up with two versions that need to be merged together. (Because of inconveniences like these, this choice is seldom taken.)
Make the files world-readable. The disadvantage of this approach is that it opens up the possibility that someone you don't want to look at the files will see them.
Make chavez a member of the physics group. This is the best alternative and also the simplest. It involves changing only the group configuration file. The file permissions don't need to be modified at all, since they already allow access for physics group members.

2.1.1.1 Displaying file ownership

To display a file's user and group ownership, use the long form of the ls command by including the -l option (-lg under Solaris):

$ ls -l
-rwxr-xr-x  1 root     system      120   Mar 12 09:32  bronze
-r--r--r--  1 chavez   chem         84   Feb 28 21:43  gold
-rw-rw-r--  1 chavez   physics   12842   Oct 24 12:04  platinum
-rw-------  1 harvey   physics     512   Jan  2 16:10  silver

Columns three and four display the user and group owners for the listed files. For example, we can see that the file bronze is owned by user root and group system. The next two files are both owned by user chavez, but they have different group owners; gold is owned by group chem, while platinum is owned by group physics. The last file, silver, is owned by user harvey and group physics.

2.1.1.2 Who owns new files?

When a new file is created, its user owner is the user who creates it. On most Unix systems, the group owner is the current^[3] group of the user who creates the file. However, on BSD-style systems, the group owner is the same as the group owner of the directory in which the file is created. Of the versions we are considering, FreeBSD and Tru64 Unix operate in the second manner by default.

^[3] See Section 6.1 for information about how the user's primary group is determined.

Most current Unix versions, including all of those we are considering, allow a system to selectively use BSD-style group inheritance from the directory group ownership by setting the set group ID (setgid) attribute on the directory, which we discuss in more detail later in this chapter.

2.1.1.3 Changing file ownership

If you need to change the ownership of a file, use the chown and chgrp commands. The chown command changes the user owner of one or more files:

# chown new-owner files

where new-owner is the username (or user ID) of the new owner for the specified files. For example, to change the owner of the file brass to user harvey, execute this chown command:

# chown harvey brass

On most systems, only the superuser can run the chown command.

If you need to change the ownership of an entire directory tree, you can use the -R option (R for recursive). For example, the following command will change the user owner to harvey for the directory /home/iago/new/tgh and all files and subdirectories contained underneath it:

# chown -R harvey /home/iago/new/tgh

You can also change both the user and group owner in a single operation, using this format:

# chown new-owner:new-group files

For example, to change the user owner to chavez and the group owner to chem for chavez's home directory and all the files underneath it, use this command:

# chown -R chavez:chem /home/chavez

If you just want to change a file's group ownership, use the chgrp command:

$ chgrp new-group files

where new-group is the group name (or group ID) of the desired group owner for the specified files. chgrp also supports the -R option. Non-root users of chgrp must be both the owner of the file and a member of the new group to change a file's group ownership (but need not be a member of its current group).
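
As a rough illustration of the chgrp rule for ordinary users, the following sketch creates a scratch file and hands it to one of the caller's own groups. The scratch directory, file name, and variable names are invented for the example, and it works with numeric group IDs so that it does not depend on any particular group existing:

```shell
# Sketch: a non-root user may chgrp a file she owns to a group she belongs to.
# The scratch directory and variable names are illustrative only.
demo=$(mktemp -d)
touch "$demo/brass"

# Try each group ID reported for the current user and remember the last
# one for which chgrp succeeds (this may simply be the primary group).
target_gid=""
for gid in $(id -G); do
    if chgrp "$gid" "$demo/brass" 2>/dev/null; then
        target_gid=$gid
    fi
done

# ls -n prints numeric IDs: field 3 is the user owner, field 4 the group owner.
new_gid=$(ls -ln "$demo/brass" | awk '{print $4}')
echo "group owner is now GID $new_gid"

rm -rf "$demo"
```

Attempting the same chgrp with a group the user does not belong to should be refused unless the command is run as root.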

2.1.2 File Protection

Once ownership is set up properly, the next natural issue to consider is how to protect files from unwanted access (or the reverse: how to allow access to those people who need it). The protection on a file is referred to as its file mode on Unix systems. File modes are set with the chmod command; we'll look at chmod after discussing the file protection concepts it relies on.

2.1.2.1 Types of file and directory access

Unix supports three types of file access: read, write, and execute, designated by the letters r, w, and x, respectively. Table 2-1 shows the meanings of those access types.

Table 2-1. File access types
Access	Meaning for a file	Meaning for a directory
r	View file contents.	Search directory contents (e.g., use `ls`).
w	Alter file contents.	Alter directory contents (e.g., delete or rename files).
x	Run executable file.	Make it your current directory (`cd` to it).

The file access types are fairly straightforward. If you have read access to a file, you can see what's in it. If you have write access, you can change what's in it. If you have execute access and the file is a binary executable program, you can run it. To run a script, you need both read and execute access, since the shell has to read the commands to interpret them. When you run a compiled program, the operating system loads it into memory for you and begins execution, so you don't need read access yourself.

The corresponding meanings for directories may seem strange at first, but they do make sense. If you have execute access to a directory, you can cd to it (or include it in a path that you want to cd to). You can also access files in the directory by name. However, to list all the files in the directory (i.e., to run the ls command without any arguments), you also need read access to the directory. This is consistent because a directory is just a file whose contents are the names of the files it contains, along with information pointing to their disk locations. Thus, to cd to a directory, you need only execute access since you don't need to be able to read the directory file itself. In contrast, if you want to run any command that lists or uses files in the directory via an explicit or implicit wildcard (e.g., ls without arguments or cat *.dat), you do need read access to the directory file itself to expand the wildcards.
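
The difference between execute and read access on a directory can be tried out with a sketch along the following lines. The directory and file names are hypothetical, and the listing step behaves as described only for an unprivileged user, since root bypasses these checks:

```shell
# Sketch: execute-only access on a directory (names are illustrative).
demo=$(mktemp -d)
mkdir "$demo/vault"
echo "secret data" > "$demo/vault/notes.dat"

chmod u=x,go= "$demo/vault"        # owner keeps execute (search) access only

# Opening a file by its exact name needs only x on the directory.
contents=$(cat "$demo/vault/notes.dat")
echo "read by name: $contents"

# Listing needs r on the directory; an unprivileged user is refused here
# (root bypasses the check, so the listing succeeds for root).
if ls "$demo/vault" >/dev/null 2>&1; then
    listing="allowed"
else
    listing="denied"
fi
echo "listing: $listing"

chmod u=rwx "$demo/vault"          # restore access so cleanup can proceed
rm -rf "$demo"
```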

Table 2-2 illustrates the workings of these various access types by listing some sample commands and the minimum access you would need to successfully execute them.

Table 2-2. File protection examples
	Minimum access needed
Command	On file itself	On directory file is in
`cd /home/chavez`	N/A	x
`ls /home/chavez/*.c`	(none)	r
`ls -l /home/chavez/*.c`	(none)	rx
`cat myfile`	r	x
`cat >>myfile`	w	x
runme (executable)	x	x
cleanup.sh (script)	rx	x
`rm myfile`	(none)	wx

Some items in this list are worth a second look. For example, when you don't have access to any of the component files, you still need only read access to a directory in order to do a simple ls; if you include -l (or any other option that lists file sizes), you also need execute access to the directory. This is because the file sizes must be determined from the disk information, an action that implicitly involves changing to the directory in question. In general, any operation that involves more than simply reading the list of filenames from the directory file is going to require execute access if you don't have access to the relevant files themselves.

Note especially that write access on a file is not required to delete it; write access to the directory where the file resides is sufficient (although in this case, you'll be asked whether to override the protection on the file):

$ rm copper
rm: override protection 440 for copper? y

If you answer yes, the file will be deleted (the default response is no). Why does this work? Because deleting a file actually means removing its entry from the directory file (among other things), which is a form of altering the directory file, for which you need only write access to the directory. The moral is that write access to directories is very powerful and should be granted with care.
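
A minimal sketch of this behavior, using a scratch directory invented for the purpose, might look like the following. It passes -f to rm so that no confirmation prompt appears:

```shell
# Sketch: deleting a file requires write access to its directory, not to the file.
demo=$(mktemp -d)
echo "data" > "$demo/copper"
chmod 440 "$demo/copper"           # no write access for anyone

# -f suppresses the "override protection" prompt.
rm -f "$demo/copper"

if [ -e "$demo/copper" ]; then
    result="still present"
else
    result="deleted"
fi
echo "copper: $result"

rm -rf "$demo"
```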

Given these considerations, we can summarize the different options for protecting directories as shown in Table 2-3.

Table 2-3. Directory protection summary
Access granted	Resulting availability
--- (no access)	Does not allow any activity of any kind within the directory or any of its subdirectories.
r-- (read access only)	Allows users to list the names of the files in the directory, but does not reveal any of their attributes (i.e., size, ownership, mode, and so on).
--x (execute access only)	Lets users work with programs in the directory specified by full pathname, but hides all other files.
r-x (read and execute access)	Lets users work with programs in the directory and list the contents of the directory, but does not allow them to create or delete files in the directory.
-wx (write and execute access)	Used for a drop-box directory. Users can change to the directory and leave files there, but can't discover the names of files placed there by others. The sticky bit is also usually set on such directories (see below).
rwx (full access)	Lets users work with programs in the directory, look at the contents of the directory, and create or delete files in the directory.

2.1.2.2 Access classes

Unix defines three basic classes of file access for which protection may be specified separately:

User access (u): Access granted to the owner of the file.
Group access (g): Access granted to members of the same group as the group owner of the file (but does not apply to the owner himself, even if he is a member of this group).
Other access (o): Access granted to all other normal users.

Unix file protection specifies the access types available to members of each of the three access classes for the file or directory.

The long version of the ls command also displays file permissions in addition to user and group ownership:

$ ls -l
-rwxr-xr-x  1 root     system     120 Mar 12 09:32  bronze
-r--r--r--  1 chavez   chem        84 Feb 28 21:43  gold
-rw-rw-r--  1 chavez   physics  12842 Oct 24 12:04  platinum

The set of letters and hyphens at the beginning of each line represents the file's mode. The 10 characters are interpreted as indicated in Table 2-4.

Table 2-4. Interpreting mode strings
		User access			Group access			Other access
File	type (1)	read (2)	write (3)	exec (4)	read (5)	write (6)	exec (7)	read (8)	write (9)	exec (10)
bronze	-	r	w	x	r	-	x	r	-	x
gold	-	r	-	-	r	-	-	r	-	-
platinum	-	r	w	-	r	w	-	r	-	-
/etc/passwd	-	r	w	-	r	-	-	r	-	-
/etc/shadow	-	r	-	-	-	-	-	-	-	-
/etc/inittab	-	r	w	-	r	w	-	r	-	-
/bin/sh	-	r	-	x	r	-	x	r	-	x
/tmp	d	r	w	x	r	w	x	r	w	t

The first character indicates the file type: a hyphen indicates a plain file, and a d indicates a directory (other possibilities are discussed later in this chapter). The remaining nine characters are arranged in three groups of three. Moving from left to right, the groups represent user, group, and other access. Within each group, the first character denotes read access, the second character write access, and the third character execute access. If a certain type of access is allowed, its code letter appears in the proper position within the triad; if it is not granted, a hyphen appears instead.

For example, in the previous listing, read access and no other is granted for all users on the file gold. On the file bronze, the owner (in this case, root) is allowed read, write, and execute access, while all other users are allowed only read and execute access. Finally, for the file platinum, the owner (chavez) and all members of the group physics are allowed read and write access, while everyone else is granted only read access.

The remaining entries in Table 2-4 (/etc/passwd through /tmp) are additional examples illustrating the usual protections for various common system files.
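
The layout of a mode string can also be expressed as a small shell function. This is only a sketch: describe_mode is a hypothetical helper, not a standard command, and it recognizes just the two file types discussed so far:

```shell
# Sketch: split a 10-character mode string into its four parts.
# describe_mode is a hypothetical helper, not a standard command.
describe_mode() {
    mode=$1
    type_char=$(printf '%s' "$mode" | cut -c1)
    case $type_char in
        -) kind="plain file" ;;
        d) kind="directory" ;;
        *) kind="other ($type_char)" ;;
    esac
    printf 'type=%s user=%s group=%s other=%s\n' \
        "$kind" \
        "$(printf '%s' "$mode" | cut -c2-4)" \
        "$(printf '%s' "$mode" | cut -c5-7)" \
        "$(printf '%s' "$mode" | cut -c8-10)"
}

describe_mode "-rwxr-xr-x"    # the mode shown for bronze
describe_mode "drwxrwxrwt"    # the mode shown for /tmp
```

For the bronze mode string, this prints type=plain file user=rwx group=r-x other=r-x.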

2.1.2.3 Setting file protection

The chmod command is used to specify the access mode for files:

$ chmod access-string files

chmod's first argument is an access string, which states the permissions you want to set (or remove) for the listed files. It has three parts: the code for one or more access classes, the operator, and the code for one or more access types.

Figure 2-1 illustrates the structure of an access string. To create an access string, you choose one or more codes from the access class column, one operator from the middle column, and one or more access types from the third column. Then you concatenate them into a single string (no spaces). For example, the access string u+w says to add write access for the user owner of the file. Thus, to add write access for yourself for a file you own (lead, for example), use:

$ chmod u+w lead

To add write access for everybody, use the all access class:

$ chmod a+w lead

To remove write access, use a minus sign instead of a plus sign:

$ chmod a-w lead

This command sets the permissions on the file lead to allow only read access for all users:

$ chmod a=r lead

If execute or write access had previously been set for any access class, executing this command removes it.

Figure 2-1. Constructing an access string for chmod

You can specify more than one access type and more than one access class. For example, the access string g-rw says to remove read and write access from the group access. The access string go=r says to set the group and other access to read-only (no execute access, no write access), changing the current setting as needed. And the access string go+rx says to add both read and execute access for both group and other users.

You can also include more than one set of operation-access type pairs for any given access class specification. For example, the access string u+x-w adds execute access and removes write access for the user owner. You can combine multiple access strings by separating them with commas (no spaces between them). Thus, the following command adds write access for the file owner and removes write access and adds read access for the group and other classes for the files bronze and brass:

$ chmod u+w,og+r-w bronze brass

The chmod command supports a recursive option (-R), to change the mode of a directory and all files under it. For example, if user chavez wants to protect all the files under her home directory from everyone else, she can use the command:

$ chmod -R go-rwx /home/chavez
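
To see how successive access strings change a mode, a sketch like the following can be run on a scratch file. The helper mode_of and the scratch directory are invented for this example; because every access string names its access classes explicitly, the results should not depend on the umask:

```shell
# Sketch: watch symbolic modes take effect on a scratch file.
demo=$(mktemp -d)
touch "$demo/lead"

# Helper (hypothetical name): print the 10-character mode string of a file.
mode_of() { ls -ld "$1" | cut -c1-10; }

chmod a=r "$demo/lead";         step1=$(mode_of "$demo/lead")   # -r--r--r--
chmod u+w "$demo/lead";         step2=$(mode_of "$demo/lead")   # -rw-r--r--
chmod go+rx "$demo/lead";       step3=$(mode_of "$demo/lead")   # -rw-r-xr-x
chmod u+x-w,go-x "$demo/lead";  step4=$(mode_of "$demo/lead")   # -r-xr--r--

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
rm -rf "$demo"
```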

2.1.2.4 Beyond the basics

So far, this discussion has undoubtedly made chmod seem more rigid than it actually is. In reality, it is a very flexible command. For example, both the access class and the access type may be omitted under some circumstances.

When the access class is omitted, it defaults to a. For example, the following command grants read access to all users for the current directory and every file under it:

$ chmod -R +r .

On some systems, this form operates slightly differently than a chmod a+r command. When the a access class is omitted, the specified permissions are compared against the default permissions currently in effect (i.e., as specified by the umask). When there is disagreement between them, the current default permissions take precedence. We'll look at this in more detail when we consider the umask a bit later.

The access type may be omitted altogether when using the = operator; this form has the effect of removing all access. For example, this command prevents any access to the file lead by anyone other than its owner:

$ chmod go= lead

Similarly, the form chmod = may be used to remove all access from a file (subject to constraints on some systems, to be discussed shortly).

The X access type grants execute access to the specified access classes only when execute access is already set for some access class. A typical use for this access type is to grant group or other read and execute access to all the directories and executable files within a subtree while granting only read access to all other types of files (the first group will all presumably have user execute access set). For example:

$ ls -lF
-rw-------   1 chavez chem      609 Nov 29 14:31 data_file.txt
drwx------   2 chavez chem      512 Nov 29 18:23 more_stuff/
-rwx------   1 chavez chem      161 Nov 29 18:23 run_me*
$ chmod go+rX *
$ ls -lF
-rw-r--r--   1 chavez chem      609 Nov 29 14:31 data_file.txt
drwxr-xr-x   2 chavez chem      512 Nov 29 18:23 more_stuff/
-rwxr-xr-x   1 chavez chem      161 Nov 29 18:23 run_me*

By specifying X, we avoid making data_file.txt executable, which would be a mistake.

chmod also supports the u, g, and o access types, which may be used as a shorthand form for the corresponding class's current settings (determined separately for each specified file). For example, this command makes the other access the same as the current group access for each file in the current directory:

$ chmod o=g *

If you like thinking in octal, or if you've been around Unix a long time, you may find numeric modes more convenient than incantations like go+rX. Numeric modes are described in the next section.

2.1.2.5 Specifying numeric file modes

The method just described for specifying file modes uses symbolic modes, since code letters are used to refer to each access class and type. The mode may also be set as an absolute mode by converting the symbolic representation used by ls to a numeric form. Each access triad (for a different user class) is converted to a single digit by setting each individual character in the triad to 1 or 0, depending on whether that type of access is permitted or not, and then taking the resulting three-digit binary number and converting it to an integer (which will be between 0 and 7). Here is a sample conversion:

	user			group			other
Mode	r	w	x	r	-	x	r	-	-
Convert to binary	1	1	1	1	0	1	1	0	0
Convert to octal digit	7			5			4
Corresponding absolute mode	754

To set the protection on a file to match those above, you specify the numeric file mode 754 to chmod as the access string:

$ chmod 754 pewter
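
The conversion can be sketched as a short shell function. The name to_octal is hypothetical, and the function handles only the nine basic permission characters (r, w, x, and hyphen), not the special-purpose modes:

```shell
# Sketch: convert a 9-character permission string (as shown by ls, minus the
# leading file-type character) to its three-digit numeric mode.
# to_octal is a hypothetical helper; it ignores special modes such as s and t.
to_octal() {
    perms=$1
    result=""
    for start in 1 4 7; do
        triad=$(printf '%s' "$perms" | cut -c"$start"-"$((start + 2))")
        digit=0
        case $triad in r??) digit=$((digit + 4)) ;; esac   # read    = binary 100
        case $triad in ?w?) digit=$((digit + 2)) ;; esac   # write   = binary 010
        case $triad in ??x) digit=$((digit + 1)) ;; esac   # execute = binary 001
        result="$result$digit"
    done
    printf '%s\n' "$result"
}

to_octal rwxr-xr--    # the sample conversion: 754
to_octal rw-r--r--    # 644
```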

2.1.2.6 Specifying the default file mode

You can use the umask command to specify the default mode for newly created files. Its argument is a three-digit numeric mode that represents the access to be inhibited (masked out) when a file is created. Thus, the value is the octal complement of the desired numeric file mode.

If masks confuse you, you can compute the umask value by subtracting the numeric access mode you want to assign from 777. For example, to obtain the mode 754 by default, compute 777 - 754 = 023; this is the value you give to umask:

$ umask 023

Note that leading zeros are included to make the mask three digits long.

Once this command is executed, all future files created are given this protection automatically. You usually put a umask command in the system-wide login initialization file and in the individual login initialization files you give to users when you create their accounts (see Chapter 6).
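
The arithmetic and its effect can be checked with a sketch like the one below, which uses an invented scratch directory and restores the previous umask afterward. One point to keep in mind when reading the output: the umask only removes permissions from whatever mode the creating program requests. Utilities such as touch typically request 666 for regular files, while mkdir requests 777, so with a umask of 023 a new plain file ends up without execute access even though a new directory gets the full 754:

```shell
# Sketch: derive a umask from a desired mode and see what it does to new files.
desired=754
# Octal complement within the nine permission bits (same as 777 - 754 here).
mask=$(printf '%03o' $(( 0777 & ~0$desired )))
echo "umask for mode $desired: $mask"

demo=$(mktemp -d)
old_umask=$(umask)                 # remember the caller's setting
umask "$mask"
touch "$demo/newfile"              # touch requests mode 666
mkdir "$demo/newdir"               # mkdir requests mode 777
umask "$old_umask"                 # put the original umask back

file_mode=$(ls -ld "$demo/newfile" | cut -c1-10)
dir_mode=$(ls -ld "$demo/newdir" | cut -c1-10)
echo "newfile: $file_mode"
echo "newdir:  $dir_mode"

rm -rf "$demo"
```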

As we mentioned earlier, the chmod command's actions are affected by the default permissions when no explicit access class is specified, as in this example:

% chmod +rx *

In such cases, the current umask is taken into account before the file access mode is changed. More specifically, an individual access permission is not changed unless the umask allows it to be set.

It takes a concrete example to fully appreciate this aspect of chmod:

$ umask                      Displays the current value.
23
$ ls -l gold silver
----------   1 chavez  chem      609 Oct 24 14:31  gold
-rwxrwxrwx   1 chavez  chem    12874 Oct 22 23:14  silver
$ chmod +rwx gold
$ chmod -rwx silver
$ ls -l gold silver
-rwxr-xr--   1 chavez  chem      609 Nov 12 09:04  gold
-----w--wx   1 chavez  chem    12874 Nov 12 09:04  silver

The current umask of 023 allows all access for the user, read and execute access for the group, and read-only access for other users. Thus, the first chmod command acts as one would expect, setting access in accordance with what is allowed by the umask. However, the interaction between the current umask and chmod's "-" operator may seem somewhat bizarre. The second chmod command clears only those access bits that are permitted by the umask; in this case, write access for group and write and execute access for other remain turned on.

2.1.2.7 Special-purpose access modes

The simple file access modes described previously do not exhaust the Unix possibilities. Table 2-5 lists the other defined file modes.

Table 2-5. Special-purpose access modes
Code	Name	Meaning
t	save text mode, sticky bit	Files: Keep executable in memory after exit. Directories: Restrict deletions to each user's own files.
s	setuid bit	Files: Set process user ID on execution.
s	setgid bit	Files: Set process group ID on execution. Directories: New files inherit directory group owner.
l	file locking	Files: Set mandatory file locking on reads/writes (Solaris and Tru64 and sometimes Linux). This mode is set via the group access type and requires that group execute access is off. Displayed as S in `ls -l` listings.

The t access type turns on the sticky bit (the formal name is save text mode, which is where the t comes from). For files, this traditionally told the Unix operating system to keep an executable image in memory even after the process that was using it had exited. This feature is seldom implemented in current Unix implementations. It was designed to minimize startup overhead for frequently used programs like vi. We'll consider the sticky bit on directories below.

When the set user ID (setuid) or set group ID (setgid) access mode is set on an executable file, processes that run it are granted access to system resources based upon the file's user or group owner, rather than based on the user who created the process. We'll consider these access modes in detail later in this chapter.

2.1.2.8 Save-text access on directories

The sticky bit has a different meaning when it is set on directories. If the sticky bit is set on a directory, a user may only delete files that she owns or for which she has explicit write permission granted, even when she has write access to the directory (thus overriding the default Unix behavior). This feature is designed to be used with directories like /tmp, which are world-writable, but in which it may not be desirable to allow any user to delete files at will.

The sticky bit is set using the user access class. For example, to turn on the sticky bit on /tmp, use this command:

# chmod u+t /tmp

Oddly, Unix displays the sticky bit as a "t" in the other execute access slot in long directory listings:

$ ls -ld /tmp
drwxrwxrwt   2 root         8704  Mar 21 00:37  /tmp

2.1.2.9 Setgid access on directories

Setgid access on a directory has a special meaning. When this mode is set, it means that files created in that directory will have the same group ownership as the directory itself (rather than the user owner's primary group), emulating the default behavior on BSD-based systems (FreeBSD and Tru64). This approach is useful when you have groups of users who need to share a lot of files. Having them work from a common directory with the setgid attribute means that correct group ownership will be automatically set for new files, even if the people in the group don't share the same primary group.

To place setgid access on a directory, use a command like this one:

# chmod g+s /pub/chem2

2.1.2.10 Numerical equivalents for special access modes

The special access modes can also be set numerically. They are set via an additional octal digit prepended to the mode whose bits correspond to the sticky bit (lowest bit: 1), setgid/file locking (middle bit: 2), and setuid (high bit: 4). Here are some examples:

# chmod 4755 uid       Setuid access
# chmod 2755 gid       Setgid access
# chmod 6755 both      Setuid and setgid access: 2 highest bits on
# chmod 1777 sticky    Sticky bit
# chmod 2745 locking   File locking (note that group execute is off)
# ls -ld
-rwsr-sr-x   1 root  chem           0 Mar 30 11:37 both
-rwxr-sr-x   1 root  chem           0 Mar 30 11:37 gid
-rwxr-Sr-x   1 root  chem           0 Mar 30 11:37 locking
drwxrwxrwt   2 root  chem        8192 Mar 30 11:39 sticky
-rwsr-xr-x   1 root  chem           0 Mar 30 11:37 uid
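
The directory-related special modes can be tried without root access on scratch directories. The sketch below uses invented names and assumes the scratch directories are created with the caller's own group, which is the usual case; the final step illustrates the group inheritance described for setgid directories:

```shell
# Sketch: numeric special modes on scratch directories (names are illustrative).
demo=$(mktemp -d)
mkdir "$demo/sticky" "$demo/shared"

chmod 1777 "$demo/sticky"      # world-writable with the sticky bit
chmod 2775 "$demo/shared"      # group-writable with setgid

sticky_mode=$(ls -ld "$demo/sticky" | cut -c1-10)
shared_mode=$(ls -ld "$demo/shared" | cut -c1-10)
echo "sticky: $sticky_mode"
echo "shared: $shared_mode"

# A file created in the setgid directory takes the directory's group.
touch "$demo/shared/report"
dir_gid=$(ls -ldn "$demo/shared" | awk '{print $4}')
file_gid=$(ls -ln "$demo/shared/report" | awk '{print $4}')
echo "directory GID $dir_gid, new file GID $file_gid"

rm -rf "$demo"
```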

2.1.3 How to Recognize a File Access Problem

My first rule of thumb about any user problem that comes up is this: it's usually a file ownership or protection problem.^[4] Seriously, though, the majority of the problems users encounter that aren't the result of hardware problems really are file access problems. One classic tip-off of a file protection problem is something that worked yesterday, or last week, or even last year, but doesn't today. Another clue is that something works differently for root than it does for other users.

^[4] At least, this was the case before the Internet.

In order to work properly, programs and commands must have access to the input and output files they use, any scratch areas they access, and any permanent files they rely on, including the special files in /dev (which act as device interfaces).

When such a problem arises, it can come from either the file permissions being wrong or the protection being correct but the ownership (user and/or group) being wrong.

The trickiest problem of this sort I've ever seen was at a customer site where I was conducting a user training course. Suddenly, their main text editor, which happened to be a clone of the VAX/VMS editor EDT, just stopped working. It seemed to start up fine, but then it would bomb out when it got to its initialization file. But the editor worked without a hitch when root ran it. The system administrator admitted to "changing a few things" the previous weekend but didn't remember exactly what. I checked the protections on everything I could think of, but found nothing. I even checked the special files corresponding to the physical disks in /dev. My company ultimately had to send out a debugging version of the editor, and the culprit turned out to be /dev/null, which the system administrator had decided needed protecting against random users!

There are at least three morals to this story:

For the local administrator: always test every change before going on to the next one; multiple, random changes almost always wreak havoc. Writing them down as you do them also makes troubleshooting easier.
For me: if you know it's a protection problem, check the permissions on everything.
For the programmer who wrote the editor: always check the return value of system calls (but that's another book).

If you suspect a file protection problem, try running the command or program as root. If it works fine, it's almost certainly a protection problem.
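
When checking "the permissions on everything," it can help to look at a file together with every directory above it, since a missing execute bit anywhere along the path blocks access. The following is one possible sketch; show_path_perms is a hypothetical helper, and it expects an absolute pathname:

```shell
# Sketch: list the ownership and mode of a file and of every directory above it.
# show_path_perms is a hypothetical helper; it expects an absolute pathname.
show_path_perms() {
    path=$1
    while :; do
        ls -ld "$path"
        case $path in
            /|.) break ;;          # stop at the root (or at "." for relative paths)
        esac
        path=$(dirname "$path")
    done
}

show_path_perms /dev/null
```

For /dev/null, this prints one line each for /dev/null, /dev, and /, which makes an unexpected owner or mode at any level easy to spot.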

A common, inadvertent way of creating file ownership problems is by accidentally editing files as root. When you save the file, the file's owner is changed by some editors. The most obscure variation on this effect that I've heard of is this: someone was editing a file as root using an editor that automatically creates backup files whenever the edited file is saved. Creating a backup file meant writing a new file to the directory holding the original file. This caused the ownership on the directory to be set to root.^[5] Since this happened in the directory used by UUCP (the Unix-to-Unix copy facility), and correct file and directory ownership are crucial for UUCP to function, what at first seemed to be an innocuous change to an inconsequential file broke an entire Unix subsystem. Running chown uucp on the directory fixed everything again.

^[5] Clearly, the system itself was somewhat "broken" as well, since adding a file to a directory should never change the directory's ownership. However, it is also possible to do this accidentally with text editors that allow you to edit a directory.

2.1.4 Mapping Files to Disks

This section will change our focus from files as objects to files as collections of data on disk. Users need not be aware of the actual disk locations of files they access, but administrators need to have at least a basic conception of how Unix maps files to disk blocks in order to understand the different file types and the purpose and functioning of the various filesystem commands.

An inode (pronounced "eye-node") is the data structure on disk that describes and stores a file's attributes, including its physical location on disk. When a filesystem is initially created, a specific number of inodes are created. In most cases, this becomes the maximum number of files of all types, including directories, special files, and links (discussed later) that can exist in the filesystem. A typical formula is one inode for every 8 KB of actual file storage. This is more than sufficient in most situations.^[6] Inodes are given unique numbers, and each distinct file has its own inode. When a new file is created, an unused inode is assigned to it.

^[6] There are a couple of circumstances where this may not hold. One is a filesystem containing an enormous number of very small files. The traditional example of this is the USENET news spool directory tree (although some modern news servers now use a better storage scheme). News files are typically both very small and inordinately numerous, and their numbers have been known to exceed normal inode limits. A second potential problem situation occurs with facilities that make extensive use of symbolic links for functions such as source code version control, again characterized by many, many tiny files. In such cases, you can run out of inodes before disk capacity is exhausted. You will want to take these factors into account when preparing the disk (see Chapter 10). At the other extreme, filesystems that are designed to hold only a few very large files might save a nontrivial amount of space by being configured with far fewer than the normal number of inodes.
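
To see how close a filesystem is to its inode limit, many versions of df accept a -i option. The following is a minimal sketch, assuming a df that supports -i (GNU and BSD versions do; the option is not universal, and the column layout varies from system to system):

```shell
# Report inode usage for the filesystem that holds the current directory.
# Assumption: this df supports -i; check the df manual page on your system.
inode_report=$(df -i .)
printf '%s\n' "$inode_report"
```

A filesystem whose inode usage approaches 100% while free disk space remains is in the situation described in the footnote above.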

Information stored in inodes includes the following:

User owner and group owner IDs.
File type (regular, directory, etc., or 0 if the inode is unused).
Access modes (permissions).
Most recent inode modification, data access, and data modification times. If the file's metadata does not change, the first item will correspond to the file creation time.
Number of hard links to the file (links are discussed later in this chapter). This is 0 if the inode is unused, and one for most regular files.
Size of the file.
Disk addresses of:
- Disk locations for the data blocks that make up the file, and/or
- Disk locations of disk blocks that hold the disk locations of the file's data blocks (indirect blocks), and/or
- Disk locations of disk blocks that hold the disk locations of indirect blocks (double indirect blocks: two disk addresses removed from the actual data blocks).^[7]
  
  ^[7] In traditional System V filesystems, inode disk addresses can point to triple indirect blocks. FreeBSD also uses triple indirect blocks.

In short, inodes store all available information about the file except its name and directory location. The inodes themselves are stored elsewhere on disk.
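
Several of these inode fields can be viewed with ls. The sketch below creates a scratch file (the name sample and the temporary directory are purely illustrative) and splits the output of ls -li into its leading fields. It assumes the usual long-listing layout, with the inode number first:

```shell
# Create a scratch file containing six bytes ("hello" plus a newline).
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/sample"

# ls -li prints: inode number, mode string, link count, owner, group, size, ...
read -r inode mode links owner group size rest <<EOF
$(ls -li "$workdir/sample")
EOF

echo "inode=$inode mode=$mode links=$links owner=$owner group=$group size=$size"
rm -rf "$workdir"
```

The filename appears at the end of the listing, but ls takes it from the directory entry, not from the inode.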

On Unix systems, it is reasonably safe to say that "everything is a file": the operating system even represents I/O devices as files. Accordingly, there are several different kinds of files, each with a different function.

2.1.4.1 Regular files

Regular files are files containing data. They are normally called simply "files." These may be ASCII text files, binary data files, executable program binaries, program input or output, and so on.

2.1.4.2 Directories

A directory is a binary file consisting of a list of the other files it contains, possibly including other directories (try running od -c on one to see this). Directory entries are filename-inode number pairs. This is the mechanism by which inodes and directory locations are associated; the data on disk has no knowledge of its (purely logical) location within its filesystem.
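
Many current systems refuse to let od read a directory directly, but ls -ai displays the same filename-inode number pairs. The following sketch (using a hypothetical scratch file named notes) also shows that renaming a file changes only its directory entry, not its inode:

```shell
workdir=$(mktemp -d)
printf 'data\n' > "$workdir/notes"

# -a includes the "." and ".." entries; -i prefixes each name with its inode number.
ls -ai "$workdir"

# Renaming rewrites the directory entry only; the inode number stays the same.
before=$(ls -i "$workdir/notes" | awk '{print $1}')
mv "$workdir/notes" "$workdir/renamed"
after=$(ls -i "$workdir/renamed" | awk '{print $1}')
echo "inode before rename: $before, after rename: $after"
rm -rf "$workdir"
```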

2.1.4.3 Special files: character and block device files

Special files are the mechanism used for device I/O under Unix. They reside in the directory /dev and its subdirectories, as well as the directory /devices under Solaris.

Generally, there are two types of special files: character special files, corresponding to character-based or raw device access, and block special files, corresponding to block I/O device access. Character special files are used for unbuffered data transfers to and from a device (e.g., a terminal). In contrast, block special files are used when data is transferred in fixed-size chunks known as blocks (e.g., most file I/O). Both kinds of special files exist for some devices (including disks). Character special files generally have names beginning with r (for "raw"), such as /dev/rsd0a, or reside in subdirectories of /dev whose names begin with r, such as /dev/rdsk/c0t3d0s7. The corresponding block special files have the same name, minus the initial r: /dev/sd0a, /dev/dsk/c0t3d0s7. Special files are discussed in more detail later in this chapter.

2.1.4.4 Links

A link is a mechanism that allows several filenames (actually, directory entries) to refer to a single file on disk. There are two kinds of links: hard links and symbolic or soft links. A hard link associates two (or more) filenames with the same inode. Hard links are separate directory entries that all share the same disk data blocks. For example, the command:

$ ln index hlink

creates an entry in the current directory named hlink with the same inode number as index, and the link count in the corresponding inode is increased by 1. Hard links may not span filesystems, because inode numbers are unique only within a filesystem. In addition, hard links should be used only for files and not for directories, and correctly implemented versions of ln won't let you create the latter.
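
The effect on the inode can be checked with ls. This is a small sketch run in a scratch directory; the file contents are arbitrary:

```shell
workdir=$(mktemp -d)
printf 'entry one\n' > "$workdir/index"

# The second field of ls -l output is the link count.
links_before=$(ls -l "$workdir/index" | awk '{print $2}')
ln "$workdir/index" "$workdir/hlink"
links_after=$(ls -l "$workdir/index" | awk '{print $2}')

# Both names now report the same inode number.
inode_index=$(ls -i "$workdir/index" | awk '{print $1}')
inode_hlink=$(ls -i "$workdir/hlink" | awk '{print $1}')
echo "link count: $links_before -> $links_after"
echo "inode of index: $inode_index, inode of hlink: $inode_hlink"
rm -rf "$workdir"
```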

Symbolic links, on the other hand, are pointer files that refer to a different file or directory elsewhere in the filesystem. Symbolic links may span filesystems, because they point to a Unix pathname, not to a specific inode.

Symbolic links are created with the -s option to ln.

The two types of links behave similarly, but they are not identical. As an example, consider a file index to which there is a hard link hlink and a symbolic link slink. Listing the contents using either name with a command like cat will result in the same output. For both index and hlink, the disk contents pointed to by the addresses in their common inode will be accessed and displayed. For slink, the disk contents referenced by the address in its inode contain the pathname for index; when it is followed, index's inode will be accessed next, and finally its data blocks will be displayed.

In directory listings, hlink will be indistinguishable from index. Changes made to either file will affect both of them, since they share the same disk blocks. However, moving either file with the mv command will not affect the other one, since moving a file involves only altering a directory entry (keep in mind that pathnames are not stored in the inode). Similarly, deleting index will not affect hlink, which will still point to the same inode (the corresponding disk blocks are only freed when an inode's link count reaches zero).

If a new file in the current directory named index is subsequently created, there will be no connection between it and hlink, because when the new file is created, it will be assigned a free inode. Although they are initially created by referencing an existing file, hard links are linked only to an inode, not to the other file. In fact, all regular files are technically hard links (i.e., inodes with a link count of at least 1).
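
The following sketch walks through that sequence in a scratch directory: the original name is removed, the data remains reachable through hlink, and a newly created index is an unrelated file:

```shell
workdir=$(mktemp -d)
printf 'original\n' > "$workdir/index"
ln "$workdir/index" "$workdir/hlink"

# Remove the first name; the data survives because one link to the inode remains.
rm "$workdir/index"

# A new file named index receives a different inode and has no connection to hlink.
printf 'replacement\n' > "$workdir/index"
new_contents=$(cat "$workdir/index")
hlink_contents=$(cat "$workdir/hlink")
hlink_links=$(ls -l "$workdir/hlink" | awk '{print $2}')
new_inode=$(ls -i "$workdir/index" | awk '{print $1}')
hlink_inode=$(ls -i "$workdir/hlink" | awk '{print $1}')
echo "hlink: $hlink_contents (inode $hlink_inode, link count $hlink_links)"
echo "index: $new_contents (inode $new_inode)"
rm -rf "$workdir"
```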

In contrast, a symbolic link slink to index will behave differently. The symbolic link appears as a separate entry in directory listings, marked as a link with an "l" as the first character in the mode string:

% ls -l
-rw------- 2 chavez  chem  5228 Mar 12 11:36 index
-rw------- 2 chavez  chem  5228 Mar 12 11:36 hlink
lrwxrwxrwx 1 chavez  chem     5 Mar 12 11:37 slink -> index

Symbolic links are always very small files, while every hard link to a given file (inode) is exactly the same size (hlink is naturally the same length as index).

Changes made by referencing either the real filename or the symbolic link will affect the contents of index. Deleting index will also break the symbolic link; slink will point nowhere. But if another file index is subsequently recreated, slink will once again be linked to it.^[8] Deleting slink will have no effect on index.

^[8] Symbolic links are actually interpreted only when accessed, so they can't really be said to point anywhere at other times. But conceptually, this is what they do.
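
This break-and-reconnect behavior is easy to reproduce. The sketch below uses the shell's -h test (true for a symbolic link itself) and -e test (which follows the link) to detect the dangling state; the file contents are arbitrary:

```shell
workdir=$(mktemp -d)
printf 'first\n' > "$workdir/index"

# The link stores the pathname "index"; a relative target is resolved
# relative to the directory that contains the link.
ln -s index "$workdir/slink"
via_link=$(cat "$workdir/slink")

rm "$workdir/index"
# The link still exists, but following it fails while the target is missing.
if [ -h "$workdir/slink" ] && [ ! -e "$workdir/slink" ]; then
    dangling=yes
else
    dangling=no
fi

# Creating a new file at the same pathname makes the link usable again.
printf 'second\n' > "$workdir/index"
relinked=$(cat "$workdir/slink")
echo "before: $via_link, dangling: $dangling, after re-creation: $relinked"
rm -rf "$workdir"
```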

Figure 2-2 illustrates the differences between hard and symbolic links. In the first picture, index and hlink share the inode N1 and its associated data blocks. The symbolic link slink has a different inode, N2, and therefore different data blocks. The contents of inode N2's data blocks refer to the pathname to index.^[9] Thus, accessing slink eventually reaches the data blocks for inode N1.

^[9] Some operating systems, including FreeBSD, store the target of the symbolic link in the inode itself, provided the target is short enough.

Figure 2-2. Comparing hard and symbolic links

When index is deleted (in the second picture), hlink is still associated with inode N1 by its own directory entry. Accessing slink will generate an error, however, since the pathname it references does not exist. When a new index is created (in the third picture), it gets a new inode, N3. This new file clearly has no relationship to hlink, but it does act as the target for slink.

Using the cd command can be a bit tricky when dealing with symbolic links to directories, as these examples illustrate:

$ pwd; cd ./htdocs
/home/chavez
$ cd ../bin
../bin: No such file or directory.
$ pwd
/public/web2/apache/htdocs
$ ls -l /home/chavez/htdocs
lrwxrwxrwx   1 chavez chem   18 Mar 30 12:06 htdocs ->
                              /public/web/apache/htdocs

The subdirectory htdocs in the current directory is a symbolic link (its target is indicated in the final command). Accordingly, the second cd command does not work as expected, and the current directory does not change to /home/chavez/bin. Similar effects would occur with a command like this one:

$ cd /home/chavez/htdocs/../cgi-bin; pwd
/public/web2/apache/cgi-bin
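
Many current shells keep track of the path you typed (the logical path) as well as the real location (the physical path), so the exact behavior you see may differ from the examples above. POSIX-style shells provide -L and -P options to cd and pwd for choosing between the two. Here is a minimal sketch, using a scratch directory tree whose names (home, public, htdocs) are only illustrative:

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/public/htdocs" "$workdir/home"
ln -s ../public/htdocs "$workdir/home/htdocs"

# Each cd runs inside a command substitution, so the caller's directory is untouched.
logical=$(cd "$workdir/home/htdocs" && pwd -L)
physical=$(cd "$workdir/home/htdocs" && pwd -P)
echo "logical:  $logical"
echo "physical: $physical"

# cd -P resolves the link first, so ".." is the parent of the real directory.
parent=$(cd -P "$workdir/home/htdocs" && cd .. && pwd -P)
echo "parent after cd -P: $parent"
rm -rf "$workdir"
```

When in doubt about where you really are, pwd -P reports the physical location.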

For more information about links, see the ln manual page, and experiment with creating and modifying linked files.

2.1.4.4.1 Tru64 Context-Dependent Symbolic Links

In a Tru64 clustered environment, many standard system files and directories are actually a type of symbolic link known as context-dependent symbolic links (CDSLs). They are symbolic links with a variable component that is resolved to a specific cluster host at access time. For example, consider this directory listing (the output is wrapped to fit):

$ ls -lF /var/adm/c*
-rw-r--r--   1 root     system  91 May 30 13:07  cdsl_admin.inv
-rw-r--r--   1 root     adm    232 May 30 13:07  cdsl_check_list
lrwxr-xr-x   1 root     adm     43 Jan  3 12:09  collect.dated@ ->
                         ../cluster/members/{memb}/adm/collect.dated
lrwxr-xr-x   1 root     adm     35 Jan  3 12:04  crash@         ->
                         ../cluster/members/{memb}/adm/crash/
lrwxr-xr-x   1 root     adm     34 Jan  3 12:04  cron@          ->
                         ../cluster/members/{memb}/adm/cron/

The first two files are regular files that reside in the /var/adm directory. The remaining three files are context-dependent symbolic links, indicated by the {memb} component. When such a file is accessed, this component is resolved to a directory named memberN, where N indicates the host's number within the cluster.

Occasionally, you may need to create such a link. The mkcdsl command serves this purpose, as in this example (output is wrapped):

# cd /var/adm
# mkcdsl pacct
# ls -l pacct
lrwxr-xr-x   1 root     adm     43 Jan  3 12:09  pacct ->
                   ../cluster/members/{memb}/adm/pacct

The ln -s command may also be used to create context-dependent symbolic links:

# ln -s "../cluster/members/{memb}/adm/pacct" ./pacct

The cdslinvchk -verify command may be used to verify that all expected CDSLs are present on a system. It reports its findings to the file /var/adm/cdsl_check_list. Here is some sample output (wrapped to fit):

Expected CDSL: ./usr/var/X11/Xserver.conf ->
  ../cluster/members/{memb}/X11/Xserver.conf
An administrator or application has replaced this CDSL with:
-rw-r--r-- 1 root system 4545   Jan 3 12:41
                                     /usr/var/X11/Xserver.conf

This report indicates that there is one missing CDSL.

2.1.4.5 Sockets

A socket, whose official name is a Unix domain socket, is a special type of file used for communications between processes. A socket may be thought of as a communications end point, tied to a particular local system port, to which processes may attach. For example, on a BSD-style system, the socket /dev/printer is used by processes to send messages to the program lpd (the line-printer spooling daemon), informing it that it has work to do.

2.1.4.6 Named pipes

Named pipes are pipes opened by applications for interprocess communication (they are "named" in the sense that applications refer to them by their pathname). They are a System V feature that has migrated to all versions of Unix. Named pipes often reside in the /dev directory. They are also known as FIFOs (for "first-in, first-out").
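
A named pipe can be created from the shell with the mkfifo command. The following sketch passes one line of text between two processes through a FIFO in a scratch directory; the pipe name queue and the message are hypothetical:

```shell
workdir=$(mktemp -d)
mkfifo "$workdir/queue"

# Opening a FIFO for writing blocks until a reader opens it,
# so the writer runs in the background.
printf 'job 42 ready\n' > "$workdir/queue" &

# The reader receives the data in the order it was written.
IFS= read -r message < "$workdir/queue"
wait

# The first character of the mode string identifies the file type.
fifo_type=$(ls -l "$workdir/queue" | cut -c1)
echo "received: $message (file type character: $fifo_type)"
rm -rf "$workdir"
```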

2.1.4.7 Using ls to identify file types

The long directory listing (produced by the ls -l command) identifies the type of each file it lists via the initial character of the permissions string:

`-`	Plain file (hard link)
`d`	Directory
`l`	Symbolic link
`b`	Block special file
`c`	Character special file
`s`	Socket
`p`	Named pipe

For example, the following ls -l output includes each of the file types discussed above, in the same order:

-rw------- 2 chavez  chem     28 Mar 12 11:36  gold.dat
-rw------- 2 chavez  chem     28 Mar 12 11:36  hlink.dat
drwx------ 2 chavez  chem    512 Mar 12 11:36  old_data
lrwxrwxrwx 1 chavez  chem      8 Mar 12 11:37  zn.dat -> gold.dat
brw-r----- 1 root    system    0 Mar  2 15:02  /dev/sd0a
crw-r----- 1 root    system    0 Jun 12  1989  /dev/rsd0a
srw-rw-rw- 1 root    system    0 Mar 11 08:19  /dev/log
prw------- 1 root    system    0 Mar 11 08:32  /usr/lib/cron/FIFO

Note that the -l option also displays the target file for symbolic links (following the -> symbol).
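
In scripts, the same distinctions can be made with the shell's test operators rather than by parsing ls output. The function below is a sketch, not a standard utility: ls_type_char is a hypothetical name, and it simply maps each test operator to the corresponding ls type character:

```shell
# Map a pathname to the type character that ls -l would show.
# -h is tested first, because the other tests follow symbolic links.
ls_type_char() {
    if   [ -h "$1" ]; then printf '%s\n' l
    elif [ -d "$1" ]; then printf '%s\n' d
    elif [ -b "$1" ]; then printf '%s\n' b
    elif [ -c "$1" ]; then printf '%s\n' c
    elif [ -S "$1" ]; then printf '%s\n' s
    elif [ -p "$1" ]; then printf '%s\n' p
    elif [ -f "$1" ]; then printf '%s\n' -
    else printf '%s\n' '?'
    fi
}

# Build a few scratch files of different types and classify them.
workdir=$(mktemp -d)
: > "$workdir/gold.dat"
mkdir "$workdir/old_data"
ln -s gold.dat "$workdir/zn.dat"
mkfifo "$workdir/FIFO"

summary=
for f in gold.dat old_data zn.dat FIFO; do
    t=$(ls_type_char "$workdir/$f")
    printf '%s  %s\n' "$t" "$f"
    summary="$summary$t"
done
rm -rf "$workdir"
```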

ls has other options to make identifying file types easy. On many systems, the -F option will append a special character to each filename, indicating its type:

-rw------- 2 chavez  chem     28 Mar 12 11:36  gold.dat
-rw------- 2 chavez  chem     28 Mar 12 11:36  hlink.dat
drwx------ 2 chavez  chem    512 Mar 12 11:36  old_data/
-rwxr-x--- 1 chavez  chem  23478 Feb 23 09:45  test_prog*
lrwxrwxrwx 1 chavez  chem      8 Mar 12 11:37  zn.dat@ -> gold.dat
srw-rw-rw- 1 root    system    0 Mar 11 08:19  /dev/log=
prw------- 1 root    system    0 Mar 11 08:32  /usr/lib/cron/FIFO|

Note that an asterisk indicates an executable file (program or script). Some versions of ls also support a -o option, which color-codes filenames in the output based on their file type.

You can use the -i option to ls to determine the equivalent file in the case of hard links. Using -i tells ls to display the inode number associated with each filename. Here is an example:

$ ls -i /dev/rmt0 /dev/rmt/*
 290 /dev/rmt0          293 /dev/rmt/c0d6ln
 292 /dev/rmt/c0d6h     291 /dev/rmt/c0d6m
 295 /dev/rmt/c0d6hn    294 /dev/rmt/c0d6mn
 290 /dev/rmt/c0d6l

From this display, we can determine that the special files /dev/rmt0 (the default tape drive for many commands, including tar) and /dev/rmt/c0d6l are equivalent, because they both reference inode number 290.
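
Rather than scanning the listing by eye, you can ask find for every name that refers to a given inode. The sketch below assumes a find that supports the -inum primary (GNU and BSD versions do), and it uses ordinary scratch files standing in for the tape device files:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/rmt"
: > "$workdir/rmt0"
ln "$workdir/rmt0" "$workdir/rmt/c0d6l"   # second name for the same inode
: > "$workdir/rmt/c0d6m"                  # unrelated file with its own inode

inode=$(ls -i "$workdir/rmt0" | awk '{print $1}')

# -xdev keeps the search within one filesystem, because inode numbers
# are unique only within a filesystem.
matches=$(find "$workdir" -xdev -inum "$inode" | sort)
printf '%s\n' "$matches"
match_count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')
rm -rf "$workdir"
```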

ls can't distinguish between text and binary files (both are "regular" files). You can use the file command to do so. Here is an example:

# file *
appoint:   ... executable not stripped
bin:       directory
clean:     symbolic link to bin/clean
fort.1:    empty
gold.dat:  ascii text
intro.ms:  [nt]roff, tbl, or eqn input text
run_me.sh: commands text
xray.c:    ascii text

The file appoint is an executable image; the additional information provided for such files differs from system to system. Note that file tries to figure out what the contents of ASCII files are, with varying success.