5.4 Hard Links and Symbolic Links | UNIX Systems Programming: Communication, Concurrency and Threads

Team-FLY

UNIX directories have two types of links: links and symbolic links. A link, sometimes called a hard link, is a directory entry. Recall that a directory entry associates a filename with a file location. A symbolic link, sometimes called a soft link, is a file that stores a string used to modify the pathname when it is encountered during pathname resolution. The behavioral differences between hard and soft links in practice are often not intuitively obvious. For simplicity and concreteness, we assume an inode representation of the files. However, the discussion applies to other file implementations.

A directory entry corresponds to a single link, but an inode may be the target of several of these links. Each inode contains the count of the number of links to the inode (i.e., the total number of directory entries that contain the inode number). When a program uses open to create a file, the operating system makes a new directory entry and assigns a free inode to represent the newly created file.
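The link count kept in the inode is visible to programs through the st_nlink member of the structure filled in by stat. The following is a minimal sketch rather than code from the text; the function name link_count is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Hypothetical helper: return the number of hard links (directory entries)
   that refer to the inode of the file named by path, or -1 if stat fails. */
long link_count(const char *path) {
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long)sb.st_nlink;
}
```

For a file in the configuration of Figure 5.4, a call such as link_count("/dirA/name1") would be expected to return 1.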

Figure 5.4 shows a directory entry for a file called name1 in the directory /dirA . The file uses inode 12345. The inode has one link, and the first data block is block 23567. Since the file is small, all the file data is contained in this one block, which is represented by the short text in the figure.

Figure 5.4. Directory entry, inode and data block for a simple file.

graphics/05fig04.gif

5.4.1 Creating or removing a link

You can create additional links to a file with the ln shell command or the link function. The creation of the new link allocates a new directory entry and increments the link count of the corresponding inode. The link uses no additional disk space.

When you delete a file by executing the rm shell command or by calling the unlink function from a program, the operating system deletes the corresponding directory entry and decrements the link count in the inode. It does not free the inode and the corresponding data blocks unless the operation causes the link count to be decremented to 0.

The link function creates a new directory entry for the existing file specified by path1 in the directory specified by path2 .

  SYNOPSIS

    #include <unistd.h>

    int link(const char *path1, const char *path2);

  POSIX

If successful, the link function returns 0. If unsuccessful, link returns -1 and sets errno. The following table lists the mandatory errors for link.

`errno`	cause
`EACCES`	search permission on a prefix of `path1` or `path2` denied, or link requires writing in a directory with write permission denied, or process does not have required access permission for file
`EEXIST`	`path2` resolves to a symbolic link or to an existing file
`ELOOP`	a loop exists in resolution of `path1` or `path2`
`EMLINK`	number of links to file specified by `path1` would exceed `LINK_MAX`
`ENAMETOOLONG`	the length of `path1` or `path2` exceeds `PATH_MAX`, or a pathname component is longer than `NAME_MAX`
`ENOENT`	a component of either path prefix does not exist, or file named by `path1` does not exist, or `path1` or `path2` points to an empty string
`ENOSPC`	directory to contain the link cannot be extended
`ENOTDIR`	a component of either path prefix is not a directory
`EPERM`	file named by `path1` is a directory and either calling process does not have privileges or implementation does not allow `link` for directories
`EROFS`	`link` would require writing in a read-only file system
`EXDEV`	link named by `path2` and file named by `path1` are on different file systems, and implementation does not support links between file systems
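A program often needs to tell some of these errors apart instead of only reporting failure. As one possible sketch, the hypothetical helper link_if_absent below treats EEXIST as a separate outcome and reports every other error as -1, leaving errno for the caller to examine. The name and the return convention are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: try to create path2 as a new link to path1.
   Returns 0 if the link was created, 1 if path2 already exists (EEXIST),
   and -1 for any other error, with errno left as set by link. */
int link_if_absent(const char *path1, const char *path2) {
    if (link(path1, path2) == 0)
        return 0;
    if (errno == EEXIST)
        return 1;
    return -1;
}
```

A caller could use the return value 1 to decide whether to pick another name, and inspect errno for cases such as EXDEV when the return value is -1.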

Example 5.15

The following shell command creates an entry called name2 in dirB containing a pointer to the same inode as /dirA/name1 .

 ln /dirA/name1 /dirB/name2

The result is shown in Figure 5.5.

Example 5.16

The following code segment performs the same action as the ln shell command of Example 5.15.

 #include <stdio.h>
 #include <unistd.h>

 if (link("/dirA/name1", "/dirB/name2") == -1)
    perror("Failed to make a new link in /dirB");

Figure 5.4 shows a schematic of /dirA/name1 before the ln command of Example 5.15 or the link function of Example 5.16 executes. Figure 5.5 shows the result of linking.

Figure 5.5. Two hard links to the same file shown in Figure 5.4.

graphics/05fig05.gif

The ln command (or link function) creates a link (directory entry) that refers to the same inode as /dirA/name1. No additional disk space is required, except possibly if the new directory entry increases the number of data blocks needed to hold the directory information. The inode now has two links.
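One way to confirm that two names are hard links to the same file is to compare the device and inode numbers that stat reports for each name. The sketch below assumes this comparison is sufficient for the purpose; the function name same_file is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Hypothetical helper: return 1 if both names resolve to the same inode
   on the same device, 0 if they do not, and -1 if either stat fails. */
int same_file(const char *path_a, const char *path_b) {
    struct stat sa, sb;

    if (stat(path_a, &sa) == -1 || stat(path_b, &sb) == -1)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

In the configuration of Figure 5.5, same_file("/dirA/name1", "/dirB/name2") would be expected to return 1. Because stat follows symbolic links, the function also returns 1 for a symbolic link and the file it references.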

The unlink function removes the directory entry specified by path and decrements the link count of the file. If the link count becomes 0 and no process has the file open, unlink frees the space occupied by the file.

  SYNOPSIS

    #include <unistd.h>

    int unlink(const char *path);

  POSIX

If successful, the unlink function returns 0. If unsuccessful, unlink returns -1 and sets errno. The following table lists the mandatory errors for unlink.

`errno`	cause
`EACCES`	search permission on a component of the path prefix is denied, or write permission is denied for directory containing directory entry to be removed
`EBUSY`	file named by `path` cannot be unlinked because it is in use and the implementation considers this an error
`ELOOP`	a loop exists in resolution of `path`
`ENAMETOOLONG`	the length of `path` exceeds `PATH_MAX`, or a pathname component is longer than `NAME_MAX`
`ENOENT`	a component of `path` does not name an existing file, or `path` is an empty string
`ENOTDIR`	a component of the path prefix is not a directory
`EPERM`	file named by `path` is a directory and either the calling process does not have privileges or implementation does not allow `unlink` for directories
`EROFS`	`unlink` would require writing in a read-only file system
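The rule that space is freed only when the link count is 0 and no process has the file open can be observed directly: a file that is unlinked while still open remains usable through its descriptor. The following sketch illustrates this under the assumption that the implementation allows unlinking an open file (it does not return EBUSY); the function name open_scratch is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create a file, then immediately remove its only
   directory entry. The returned descriptor stays valid, and the inode and
   data blocks are released only when the descriptor is closed.
   Returns the open descriptor, or -1 on error. */
int open_scratch(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

    if (fd == -1)
        return -1;
    if (unlink(path) == -1) {      /* link count drops to 0 */
        close(fd);
        return -1;
    }
    return fd;                     /* file has no name but still exists */
}
```

After a successful call, the name no longer appears in the directory, yet reads and writes through the descriptor continue to work until close.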

Exercise 5.17

The following sequence of operations might be performed by a text editor when editing the file /dirA/name1 .

1. Open the file /dirA/name1.
2. Read the entire file into memory.
3. Close /dirA/name1.
4. Modify the memory image of the file.
5. Unlink /dirA/name1.
6. Open the file /dirA/name1 (create and write flags).
7. Write the contents of memory to the file.
8. Close /dirA/name1.

How would Figures 5.4 and 5.5 be modified if you executed this sequence of operations on each configuration?

Answer:

After these operations were applied to Figure 5.4, the new file would have the same name as the old but would have the new contents. It might use a different inode number and block. This is what we would expect. When the text editor applies the same set of operations to the configuration of Figure 5.5, unlinking removes the directory entry for /dirA/name1 . The unlink reduces the link count but does not delete the file, since the link /dirB/name2 is still pointing to it. When the editor opens the file /dirA/name1 with the create flag set, a new directory entry and new inode are created. We now have /dirA/name1 referring to the new file and /dirB/name2 referring to the old file. Figure 5.6 shows the final result.

Figure 5.6. Situation after a text editor changes a file. The original file had inode 12345 and two hard links before editing (i.e., the configuration of Figure 5.5).

graphics/05fig06.gif
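The unlink, create, write and close steps of the editor sequence in Exercise 5.17 can be sketched in C as follows. This is an illustration under stated assumptions, not the code of any particular editor: the function name replace_by_unlink is hypothetical, the caller is assumed to already hold the modified memory image, and a short write is simply treated as failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: unlink the old name, create a new file under the
   same name, and write the modified image to it.
   Returns 0 on success and -1 on failure. */
int replace_by_unlink(const char *path, const void *image, size_t len) {
    if (unlink(path) == -1)
        return -1;
    /* O_EXCL makes open fail if some other process recreated the name. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    /* Simplification: a short write is treated as a failure. */
    ssize_t written = write(fd, image, len);
    if (close(fd) == -1 || written != (ssize_t)len)
        return -1;
    return 0;
}
```

If another hard link to the old file exists, as in Figure 5.5, that link keeps the old inode alive, so the newly created file necessarily receives a different inode.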

Exercise 5.18

Some editors back up the old file. One possible way of doing this is with the following sequence of operations.

1. Open the file /dirA/name1.
2. Read the entire file into memory.
3. Close /dirA/name1.
4. Modify the memory image of the file.
5. Rename the file /dirA/name1 to /dirA/name1.bak.
6. Open the file /dirA/name1 (create and write flags).
7. Write the contents of memory to the file.
8. Close /dirA/name1.

Describe how this strategy affects each of Figures 5.4 and 5.5.

Answer:

Starting with the configuration of Figure 5.4 produces two distinct files. The file /dirA/name1 has the new contents and uses a new inode. The file /dirA/name1.bak has the old contents and uses the old inode. For the configuration of Figure 5.5, /dirA/name1.bak and /dirB/name2 point to the old contents using the old inode. The second open creates a new inode for /dirA/name1, resulting in the configuration of Figure 5.7.

Figure 5.7. Situation after one file is changed with an editor that makes a backup copy.

graphics/05fig07.gif
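The backup strategy of Exercise 5.18 differs from the previous sequence only in replacing the unlink with a rename. A minimal sketch follows, assuming the caller supplies the backup name and already holds the modified image; the function name replace_with_backup is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>      /* rename */
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: rename the old file to the backup name, then create
   a new file under the original name and write the modified image to it.
   Returns 0 on success and -1 on failure. */
int replace_with_backup(const char *path, const char *backup,
                        const void *image, size_t len) {
    if (rename(path, backup) == -1)   /* old inode keeps its link count */
        return -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    /* Simplification: a short write is treated as a failure. */
    ssize_t written = write(fd, image, len);
    if (close(fd) == -1 || written != (ssize_t)len)
        return -1;
    return 0;
}
```

Because rename moves the directory entry without touching the inode, the backup name and any other hard link continue to share the old inode, while the original name ends up on a new one.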

The behavior illustrated in Exercises 5.17 and 5.18 may be undesirable. An alternative approach would be to have both /dirA/name1 and /dirB/name2 reference the new file. In Exercise 5.22 we explore an alternative sequence of operations that an editor can use.

5.4.2 Creating and removing symbolic links

A symbolic link is a file containing the name of another file or directory. A reference to the name of a symbolic link causes the operating system to locate the inode corresponding to that link. The operating system assumes that the data blocks of the corresponding inode contain another pathname. The operating system then locates the directory entry for that pathname and continues to follow the chain until it finally encounters a hard link and a real file. If the system follows too many symbolic links without reaching a real file (for example, because the links form a cycle), it gives up and returns the ELOOP error.

Create a symbolic link by using the ln command with the -s option or by invoking the symlink function. The path1 parameter of symlink contains the string that will be the contents of the link, and path2 gives the pathname of the link. That is, path2 is the newly created link and path1 is what the new link points to.

  SYNOPSIS

    #include <unistd.h>

    int symlink(const char *path1, const char *path2);

  POSIX

If successful, symlink returns 0. If unsuccessful, symlink returns -1 and sets errno. The following table lists the mandatory errors for symlink.

`errno`	cause
`EACCES`	search permission on a component of the path prefix of `path2` is denied, or link requires writing in a directory with write permission denied
`EEXIST`	`path2` names an existing file or symbolic link
`EIO`	an I/O error occurred while reading from or writing to the file system
`ELOOP`	a loop exists in resolution of `path2`
`ENAMETOOLONG`	the length of `path2` exceeds `PATH_MAX`, or a pathname component is longer than `NAME_MAX`, or the length of `path1` exceeds `SYMLINK_MAX`
`ENOENT`	a component of `path2` does not name an existing file, or `path2` is an empty string
`ENOSPC`	directory to contain the link cannot be extended, or the file system is out of resources
`ENOTDIR`	a component of the path prefix for `path2` is not a directory
`EROFS`	the new symbolic link would reside on a read-only file system

Example 5.19

Starting with the situation shown in Figure 5.4, the following command creates the symbolic link /dirB/name2 , as shown in Figure 5.8.

 ln -s /dirA/name1 /dirB/name2

Figure 5.8. Ordinary file with a symbolic link to it.

graphics/05fig08.gif

Example 5.20

The following code segment performs the same action as the ln -s of Example 5.19.

 if (symlink("/dirA/name1", "/dirB/name2") == -1)
    perror("Failed to create symbolic link in /dirB");

Unlike the hard links of Examples 5.15 and 5.16, the ln command of Example 5.19 and the symlink function of Example 5.20 use a new inode, in this case 13579, for the symbolic link. Inodes contain information about the type of file they represent (i.e., ordinary, directory, special, or symbolic link), so inode 13579 contains information indicating that it is a symbolic link. The symbolic link requires at least one data block. In this case, block 15213 is used. The data block contains the name of the file that /dirB/name2 is linked to, in this case, /dirA/name1. The name may be fully qualified as in this example, or it may be relative to its own directory.
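A program can examine both pieces of information described above: lstat reports the file type recorded in the inode of the link itself, and readlink returns the pathname stored in the link. The following sketch combines the two; the function name link_target is hypothetical, and treating a completely filled buffer as an error is a conservative simplification.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy the pathname stored in the symbolic link named
   by path into buf as a NUL-terminated string. Returns the length of the
   stored pathname, or -1 if path is not a symbolic link, if an error
   occurs, or if buf may be too small to hold the whole pathname. */
ssize_t link_target(const char *path, char *buf, size_t size) {
    struct stat sb;

    if (size == 0)
        return -1;
    if (lstat(path, &sb) == -1 || !S_ISLNK(sb.st_mode))
        return -1;
    ssize_t n = readlink(path, buf, size - 1);  /* does not add a NUL */
    if (n == -1 || (size_t)n == size - 1)       /* possibly truncated */
        return -1;
    buf[n] = '\0';
    return n;
}
```

For the link of Figure 5.8, the buffer would be expected to receive the string /dirA/name1. Note that readlink succeeds even when the stored pathname does not name an existing file.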

Exercise 5.21

Suppose that /dirA/name1 is an ordinary file and /dirB/name2 is a symbolic link to /dirA/name1 , as in Figure 5.8. How are the files /dirB/name2 and /dirA/name1 related after the sequence of operations described in Exercise 5.17?

Answer:

/dirA/name1 now refers to a different inode, but /dirB/name2 references the name /dirA/name1, so they still refer to the same file, as shown in Figure 5.9. The link count in the inode counts only hard links, not symbolic links. When the editor unlinks /dirA/name1, the operating system deletes the file with inode 12345. If other editors try to edit /dirB/name2 in the interval during which /dirA/name1 is unlinked but not yet created, they get an error.

Figure 5.9. Situation after editing a file that has a symbolic link.

graphics/05fig09.gif

Exercise 5.22

How can the sequence of operations in Exercise 5.17 be modified so that /dirB/name2 references the new file regardless of whether this was a hard link or a symbolic link?

Answer:

The following sequence of operations can be used.

1. Open the file /dirA/name1.
2. Read the entire file into memory.
3. Close /dirA/name1.
4. Modify the memory image of the file.
5. Open the file /dirA/name1 with the O_WRONLY and O_TRUNC flags.
6. Write the contents of memory to the file.
7. Close /dirA/name1.

When the editor opens the file the second time, the same inode is used but the contents are deleted, so the file size starts at 0. Because the inode does not change, a hard link such as /dirB/name2 still refers to the updated file, and a symbolic link still resolves to it by name.
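The truncating rewrite of Exercise 5.22 can be sketched as a single function. The name rewrite_in_place is hypothetical, the caller is assumed to hold the modified image, and a short write is treated as failure for simplicity.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: reopen the existing file with O_WRONLY and O_TRUNC
   and write the modified image. No directory entry is removed or created,
   so the inode stays the same. Returns 0 on success and -1 on failure. */
int rewrite_in_place(const char *path, const void *image, size_t len) {
    int fd = open(path, O_WRONLY | O_TRUNC);   /* file size becomes 0 */
    if (fd == -1)
        return -1;
    /* Simplification: a short write is treated as a failure. */
    ssize_t written = write(fd, image, len);
    if (close(fd) == -1 || written != (ssize_t)len)
        return -1;
    return 0;
}
```

Since O_CREAT is not specified, the call fails rather than creating a new file if the name does not exist.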

Exercise 5.23

Exercise 5.22 has a possibly fatal flaw: If the application or operating system crashes between the second open and the subsequent write operation, the file is lost. How can this be prevented?

Answer:

Before opening the file for the second time, write the contents of memory to a temporary file. Remove the temporary file after the close of /dirA/name1 is successful. This approach allows the old version of the file to be retrieved if the application crashes. However, a successful return from close does not mean that the file has actually been written to disk, since the operating system buffers this operation. One possibility is to use a function such as fsync after write. The fsync function returns only after the pending operations have been written to the physical medium. The fsync function is part of the POSIX:FSC Extension.
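One possible arrangement of these steps is sketched below: the image is first written to a temporary file and synchronized, then the original is truncated, rewritten and synchronized, and only then is the temporary file removed. The function names write_synced and safe_rewrite and the caller-supplied temporary pathname are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open path with the given flags, write the image,
   and call fsync before closing. Returns 0 on success, -1 on failure. */
static int write_synced(const char *path, int flags,
                        const void *image, size_t len) {
    int fd = open(path, flags, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    /* A short write is treated as a failure; fsync runs only after a
       complete write. */
    int ok = write(fd, image, len) == (ssize_t)len && fsync(fd) == 0;
    if (close(fd) == -1 || !ok)
        return -1;
    return 0;
}

/* Hypothetical helper: rewrite path in place, keeping a copy of the data
   in tmp until the rewrite has been synchronized. */
int safe_rewrite(const char *path, const char *tmp,
                 const void *image, size_t len) {
    if (write_synced(tmp, O_WRONLY | O_CREAT | O_TRUNC, image, len) == -1)
        return -1;
    /* If this step fails, tmp is deliberately left in place for recovery. */
    if (write_synced(path, O_WRONLY | O_TRUNC, image, len) == -1)
        return -1;
    return unlink(tmp);
}
```

If a crash occurs after the truncation, the data can be recovered from the temporary file; the original keeps its inode throughout, as in Exercise 5.22.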

Exercise 5.24

Many programs assume that the header files for the X Window System are in /usr/include/X11 , but under Sun's Solaris operating environment these files are in the directory /usr/openwin/share/include/X11 . How can a system administrator deal with the inconsistency?

Answer:

There are several ways to address this problem.

Copy all these files into /usr/include/X11 .
Move all the files into /usr/include/X11 .
Have users modify all programs that contain lines in the following form.
```
 #include <X11/xyz.h> 
```
Replace these lines with the following.
```
 #include "/usr/openwin/share/include/X11/xyz.h" 
```
Have users modify their makefiles so that compilers look for header files in the following directory.
```
 /usr/openwin/share/include 
```
Create a symbolic link from /usr/include/X11 to the following directory.
```
 /usr/openwin/share/include/X11 
```

All the alternatives except the last have serious drawbacks. If the header files are copied to the directory /usr/include/X11 , then two copies of these files exist. Aside from the additional disk space required, an update might cause these files to be inconsistent. Moving the files (copying them to the directory /usr/include/X11 and then deleting them from /usr/openwin/share/include/X11 ) may interfere with operating system upgrades. Having users modify all their programs or makefiles is unreasonable. Another alternative not mentioned above is to use an environment variable to modify the search path for header files.

Exercise 5.25

Because of a large influx of user mail, the root partition of a server becomes full. What can a system administrator do?

Answer:

Pending mail is usually kept in a directory with a name such as /var/mail or /var/spool/mail, which may be part of the root partition. One possibility is to expand the size of the root partition. This expansion usually requires reinstallation of the operating system. Another possibility is to mount an unused partition on /var. If a spare partition is not available, the /var/spool/mail directory can be a symbolic link to any directory in a partition that has sufficient space.

Exercise 5.26

Starting with Figure 5.8, execute the command rm /dirA/name1 . What happens to /dirB/name2 ?

Answer:

This symbolic link still exists, but it is pointing to something that is no longer there. A reference to /dirB/name2 gives an error as if the symbolic link /dirB/name2 does not exist. However, if later a new file named /dirA/name1 is created, the symbolic link then points to that file.

When you reference a file representing a symbolic link by name, does the name refer to the link or to the file that the link references? The answer depends on the function used to reference the file. Some library functions and shell commands automatically follow symbolic links and some do not. For example, the rm command does not follow symbolic links. Applying rm to a symbolic link removes the symbolic link, not what the link references. The ls command does not follow symbolic links by default, but lists properties such as date and size of the link itself. Use the -L option with ls to obtain information about the file that a symbolic link references. Some operations have one version that follows symbolic links (e.g., stat ) and another that does not (e.g., lstat ). Read the man page to determine a particular function's behavior in traversing symbolic links.
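The difference between stat and lstat gives a simple way to detect the situation of Exercise 5.26, and unlink behaves like rm in that it removes the link rather than what the link references. The following sketch shows both; the function names is_dangling and remove_link_only are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if path is a symbolic link whose target
   does not exist, 0 if it is not a link or its target exists, and -1 on
   any other error. */
int is_dangling(const char *path) {
    struct stat sb;

    if (lstat(path, &sb) == -1)        /* examines the link itself */
        return -1;
    if (!S_ISLNK(sb.st_mode))
        return 0;
    if (stat(path, &sb) == 0)          /* follows the link */
        return 0;
    return errno == ENOENT ? 1 : -1;
}

/* Hypothetical helper: remove a symbolic link without touching the file
   it references. Refuses anything that is not a symbolic link. */
int remove_link_only(const char *path) {
    struct stat sb;

    if (lstat(path, &sb) == -1)
        return -1;
    if (!S_ISLNK(sb.st_mode)) {
        errno = EINVAL;
        return -1;
    }
    return unlink(path);               /* unlink does not follow the link */
}
```

Applied to /dirB/name2 after the rm of Exercise 5.26, is_dangling would be expected to return 1, and to return 0 again once a new /dirA/name1 is created.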
