What Type of Filesystem Should We Use?

Solaris™ Operating Environment Boot Camp
By David Rhodes, Dominic Butler
Chapter 6.  The Filesystem and Its Contents


A number of different filesystem types are available, but once you know what a filesystem is going to be used for there is usually little choice over which type to use. We will look mainly at disk-based filesystems in this section, with a brief overview of virtual filesystems. (We cover network-based filesystems in Chapter 18, "NFS, DFS, and Autofs.") Virtual filesystems are generally memory-based, although the CACHEFS and TMPFS types do make use of the disk. Table 6.4 lists the filesystem types and the group each belongs to.

Table 6.4. Filesystem Types

| Filesystem Group | Filesystem Type | Description |
|---|---|---|
| Disk-Based | s5 | This is quite a dated filesystem type and not used too often. It is mentioned here since it is simple and was the standard until UFS took over. |
| | UFS | This is currently the standard filesystem type for user-created hard disk-based filesystems. It has taken over from s5 as the default filesystem type on Solaris systems and has many improvements. |
| | HSFS | HSFS is the High Sierra and ISO 9660 filesystem. This is a read-only filesystem and is mainly used on CD-ROMs. Solaris HSFS supports extensions to ISO 9660, which, if present, provide UFS filesystem semantics and file types (but obviously read-only). |
| | PCFS | This filesystem type is generally only used on floppy disks. It allows the transfer of data between Solaris and DOS/Windows-based PCs. |
| Network-Based | NFS | NFS is the only network filesystem type available on Solaris. It is used to enable a filesystem created on one server to be mounted onto another server (or client) over the network. See Chapter 18, "NFS, DFS, and Autofs." |
| Virtual | CACHEFS | The Cache filesystem can be used to improve performance of remote filesystems or some slow devices such as CD-ROM drives. |
| | TMPFS | The temporary filesystem type is the default filesystem type for the /tmp filesystem on Solaris systems. It uses the area of disk allocated to swap, but disk reads and writes are performed directly to and from memory so any process that uses temporary files will be sped up. |
| | LOFS | This is the loop-back filesystem. It is used to allow an existing filesystem or directory tree to be mounted in a different part of the filesystem. Thus any file contained on it can be referred to by its real path and its virtual path. |
| | PROCFS | The process filesystem type is specific to the /proc directory. It is memory-resident and contains an entry for each running process. |

Since they are used rarely and in rather specific circumstances, we will not be looking any further at the virtual filesystem types CACHEFS and LOFS. Creating HSFS filesystems is also beyond the scope of this book.

The early filesystem types were fairly basic and had various drawbacks, so over time new filesystem types were designed to get around these drawbacks and provide additional features. These tended to grow in complexity, but this doesn't need to worry us too much because they all perform the same basic function: providing a way of storing files and directories on disk with a consistent user interface.

The current Solaris default filesystem type is the UNIX filesystem (UFS), and this is the recommended one to use. However, we will look at the older System V (s5) filesystem type in detail first. This is because it is more basic and therefore the concepts should be easier to understand. The UFS type was developed as an improvement over the s5 type, but has the same fundamental principles.

System V Filesystem Type

The s5 filesystem is created on a disk partition and is split into four parts. The smallest two, the boot block and the superblock, are one block in size each. The superblock contains information about which blocks in the other two parts are free to use. Of the remaining two parts, the larger contains the actual data stored in the files and directories, and the other contains the "inodes," which we will look at in more detail later.

The map of a typical s5 filesystem is as follows:

Boot Block

Superblock

Inodes

Data Storage

The data area is split into data blocks. The block size is defined when the filesystem is created, but s5-type filesystems tend to use 512 bytes. The size of the data area governs how many data blocks fit into it, and since each nonempty file or directory uses at least one data block, this also limits how many files will fit. Each file or directory also has a single inode associated with it. All inodes are the same size, so the number of inodes, which is also specified when the filesystem is created, determines the space taken up by the inode area.

If you have some idea of the likely number of files and their sizes, some optimization is possible at creation time. For example, if you know the filesystem will hold very few files, creating it with a small number of inodes leaves more space for file data. On the other hand, if you run out of inodes it will not be possible to create any more files in that filesystem, no matter how much space is free in the data area.
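The trade-off between inodes and data space can be sketched with some simple arithmetic. The following is only an illustration: the 64-byte inode size is an assumption (the text above does not give one), and the function name s5_layout is made up for this example.

```python
import math

BLOCK_SIZE = 512   # typical s5 block size, as described above
INODE_SIZE = 64    # assumption: bytes per on-disk inode


def s5_layout(total_blocks: int, inode_count: int) -> dict:
    """Split a partition into boot block, superblock, inode area, and data area."""
    inodes_per_block = BLOCK_SIZE // INODE_SIZE
    inode_blocks = math.ceil(inode_count / inodes_per_block)
    data_blocks = total_blocks - 2 - inode_blocks  # 2 = boot block + superblock
    if data_blocks <= 0:
        raise ValueError("partition too small for the requested inode count")
    return {
        "inode_blocks": inode_blocks,
        "data_blocks": data_blocks,
        # Each nonempty file needs one inode and at least one data block,
        # so this is an upper bound rather than an exact figure.
        "max_nonempty_files": min(inode_count, data_blocks),
    }


few = s5_layout(total_blocks=20_000, inode_count=800)
many = s5_layout(total_blocks=20_000, inode_count=16_000)
print(few)
print(many)
```

Under these assumptions, a 20,000-block partition with 800 inodes leaves 19,898 data blocks, while the same partition with 16,000 inodes leaves 17,998 data blocks but allows far more files.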

The term "inode" is usually expanded as "index node," although it is sometimes read as "information node." Each inode holds information about a single file. We will be looking at inodes in more detail later in this chapter.

The superblock contains a list of free inodes and free data blocks, so if it becomes corrupt this type of filesystem can become unusable. This was one of the reasons that more robust and complex filesystem types were developed. The most popular filesystem type to supersede s5 was UFS, which was based on the BSD Fast Filesystem provided with BSD 4.3.

Recent versions of Solaris do not provide support for the s5 filesystem type by default; there should be no normal circumstance where you would need to create a filesystem of this type. It has been included here since it is simple and therefore useful in demonstrating how a filesystem works.

The s5 filesystem type was successfully used for many years, but it did have a number of drawbacks. These include the following:

  • As the filesystem increases in size, performance decreases.

  • The directory format is fixed (14 bytes for each filename and 2 bytes for the inode number, which resulted in a maximum filename size of 14 characters).

  • There can be a maximum of 65,536 inodes, which limits both the number of files the filesystem can have and its maximum size.

  • There is only one superblock, so if it becomes corrupt the filesystem is unusable.
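The fixed directory format in the list above can be illustrated with a short sketch that packs and unpacks a 16-byte entry. The byte order and the helper names (pack_entry, unpack_entry) are assumptions made for this example; the text only specifies the 2-byte inode number and the 14-byte name.

```python
import struct

# One directory entry: 2-byte inode number followed by a 14-byte name.
# Assumption: little-endian byte order; the real order depends on the platform.
DIRENT = struct.Struct("<H14s")


def pack_entry(inode: int, name: str) -> bytes:
    """Build one 16-byte directory entry."""
    raw = name.encode("ascii")
    if len(raw) > 14:
        raise ValueError("s5 filenames are limited to 14 characters")
    if not 0 <= inode <= 0xFFFF:
        raise ValueError("inode number does not fit in 2 bytes")
    return DIRENT.pack(inode, raw)  # struct pads short names with NUL bytes


def unpack_entry(entry: bytes) -> tuple:
    """Return (inode number, filename) from a 16-byte directory entry."""
    inode, raw = DIRENT.unpack(entry)
    return inode, raw.rstrip(b"\0").decode("ascii")


entry = pack_entry(42, "passwd")
print(len(entry), unpack_entry(entry))
```

Both limits fall straight out of the format: a 2-byte field can hold only 65,536 distinct inode numbers, and a name longer than 14 characters has nowhere to go.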

UFS Type

The UFS type provides many advantages over the s5 type, but it is also more complex. It is the default filesystem type on Solaris, so you will need to become familiar with it.

UFS types have the following features:

  • State flag settings that reduce the need to check the filesystem on startup, leading to a quicker boot-time.

  • Extended Fundamental Types (EFTs). User IDs, group IDs, and device numbers are represented as 32-bit numbers, which allows many more users, groups, and devices to exist.

  • Large filesystems. A UFS can be as large as 1 terabyte. This may not be achievable in practice, because a UFS cannot span multiple disks unless an underlying volume management tool is used, and hard disks may not reach 1 terabyte in size for a while yet.

  • Larger files. By default, an individual file can grow beyond 2 GB. However, if the "nolargefiles" option is specified when the filesystem is mounted, a limit of 2 GB will be enforced.

  • Backup superblocks. When a UFS is created, a number of backup superblocks are created so if the primary superblock becomes corrupt it can be recovered from one of the backups.

The UFS type is similar to the s5 type in that it contains the same four types of blocks. However, the internals are more complex (see Table 6.5).

Table 6.5. Four Types of Blocks in a UFS

| Block Type | Description |
|---|---|
| Boot | This area contains information that is used when the system boots. |
| Superblock | The superblock holds information about the filesystem itself. This includes: filesystem state flag; filesystem name; filesystem size; number of inodes; date and time of last update; cylinder group size; number of data blocks in each cylinder group; directory that the filesystem was last mounted on; free block count; free inode count. |
| Inode | This section holds all the inodes. An inode holds all the information about a file apart from its name. |
| Data | This section holds all the data contained in the files in this filesystem. |

An s5-type filesystem has a fixed block size for the data storage area that is defined at filesystem creation. A UFS filesystem also has its data storage block size defined when it is created, but to allow more flexibility there is both a logical block size and a fragment size. The defaults are an 8 KB logical block size and a 1 KB fragment size. The reason for having fragments is to reduce the space wasted in partially filled blocks. As a file is written, it is allocated logical blocks first, and any remainder that does not fill a whole block is allocated fragments. Very small files will just have fragments allocated. Both sizes can be set when the filesystem is created, but they cannot be changed later without recreating the filesystem.

In general, the default 8 KB logical block size is a good compromise for most situations; however, if you have a large filesystem that you know will contain large files you could improve efficiency by choosing a larger logical block size. Likewise, if you know the filesystem will mostly contain small files you could choose a smaller logical block size.

When it comes to choosing a fragment size, the general rule is that a larger fragment size increases the speed of file access but wastes more space, while a smaller fragment size wastes less space but makes file access slightly slower. The default of 1 KB should suit most general-purpose filesystems, and the rule of thumb is the same as for logical blocks: increase the fragment size for filesystems containing mostly large files and decrease it for filesystems containing mostly small files. You can use the command quot -c to obtain information about the distribution of files by block size.
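To see why fragments save space, the following sketch estimates how a single file would be laid out for a given logical block size and fragment size. It is a deliberately simplified model: it ignores indirect blocks and the finer details of the real allocation policy, and the function name ufs_allocation is made up for this example.

```python
import math


def ufs_allocation(file_size: int, block_size: int = 8192, frag_size: int = 1024) -> tuple:
    """Return (full_blocks, tail_fragments, wasted_bytes) for one file.

    Simplified model: whole logical blocks first, then fragments for the tail.
    """
    if block_size % frag_size != 0:
        raise ValueError("block size must be a multiple of the fragment size")
    full_blocks, tail = divmod(file_size, block_size)
    tail_fragments = math.ceil(tail / frag_size)
    allocated = full_blocks * block_size + tail_fragments * frag_size
    return full_blocks, tail_fragments, allocated - file_size


# Default 8 KB blocks with 1 KB fragments.
print(ufs_allocation(20_000))
print(ufs_allocation(500))

# Same small file when the fragment size equals the block size.
print(ufs_allocation(500, frag_size=8192))
```

In this model a 20,000-byte file takes two full blocks plus four fragments and wastes 480 bytes, and a 500-byte file takes a single 1 KB fragment. With the fragment size raised to 8 KB, the same 500-byte file occupies a whole block and wastes 7,692 bytes, which is the space cost that a larger fragment size trades for speed.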

The logical block size has nothing to do with the physical block size. The physical block size is the smallest chunk of data that can be transferred by the disk controller; this is usually 512 bytes. The logical block size is the chunk size that the UNIX kernel will use.

The UFS filesystem type improved on all the major weaknesses of the s5 type:

  • UFS has multiple superblocks, so the filesystem can still be recovered if the main superblock is corrupt.

  • Filenames can now be up to 255 characters and the inode number is now 4 bytes.

  • UFS has a larger block size (8 KB).

Although the UFS type contains multiple superblocks, they are not all updated together. Only the first superblock is updated online; the backups simply contain static data such as fragment size, block size, filesystem size, etc. If the main superblock becomes corrupt, the filesystem checker (fsck) can start from one of the backups and bring its contents up to date by performing an audit of the filesystem.
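The split between static and dynamic superblock data can be pictured with a toy model. This is not the on-disk UFS structure or the fsck algorithm: the class, its field names, and the rebuild_from_backup function are all hypothetical, chosen only to show static values being taken from a backup while the free counts are recomputed by an audit.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Superblock:
    # Static layout data, identical in the primary and every backup copy.
    block_size: int
    frag_size: int
    fs_size_blocks: int
    # Dynamic summary data, kept current only in the primary copy.
    free_blocks: int = 0
    free_inodes: int = 0


def rebuild_from_backup(backup: Superblock,
                        group_free_blocks: list,
                        group_free_inodes: list) -> Superblock:
    """Combine a backup's static data with freshly audited free counts."""
    return replace(
        backup,
        free_blocks=sum(group_free_blocks),
        free_inodes=sum(group_free_inodes),
    )


backup = Superblock(block_size=8192, frag_size=1024, fs_size_blocks=100_000)
# Per-cylinder-group counts, as an audit of the filesystem might report them.
rebuilt = rebuild_from_backup(backup, [10, 20, 30], [5, 5, 5])
print(rebuilt)
```

The point of the model is that a backup never needs to be kept in step with the primary: everything that changes during normal use can be derived again from the filesystem itself.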

Prior to Solaris 8, the maximum filesystem block size was equal to the memory page size, but from Solaris 8 onwards this restriction was lifted. The block size represents the unit of transfer between disk and memory.

The UFS type is split into cylinder groups and each cylinder group contains a backup superblock. The backup superblock is positioned at a different offset from the beginning of each cylinder group.

The map of a typical UFS-type filesystem is as follows:

Cylinder Group 0
    Boot Block
    Superblock
    Cylinder Group Map
    Inodes
    Data Storage

Cylinder Group 1
    Data Storage
    Superblock Copy
    Cylinder Group Map
    Inodes
    Data Storage

Cylinder Group 2 (etc.)
    Data Storage
    Superblock Copy
    Cylinder Group Map
    Inodes
    File Data

In addition to the superblock copy, each cylinder group contains a cylinder group map. This contains information about which data blocks are free along with information about the fragments that will help prevent fragmentation from affecting disk performance.

When data is written to a UFS filesystem, the filesystem is normally in time optimization mode. This means it tries to be as quick as possible and doesn't care too much if it wastes a bit of space here or there. However, once the filesystem gets close to filling up, the optimization changes to space. Now the filesystem tries to write data so that as little space as possible is wasted, possibly at the cost of performance. Once usage falls back below the threshold, optimization returns to time.

TMPFS Type

The TMPFS type was devised as a means of speeding up programs that create temporary files. On a Solaris system, TMPFS is the default type for the /tmp filesystem. Once mounted, it looks like any other filesystem and can be used in exactly the same way; however, when it is unmounted all files stored in it are lost.

The speed improvement comes from the fact that this filesystem type is memory-based. When a file is created under /tmp it is always created in memory, never directly on disk, so there is no overhead associated with creating the file on a hard disk. TMPFS also makes use of the swap area of the disk: if many temporary files are created and memory resources become constrained, the temporary files are moved to swap. Since swap is there for the purpose of storing memory pages (see Chapter 7, "Swap Space"), nothing very different is happening here.

In normal circumstances this arrangement works very well, but problems can occur. They are likely to be that /tmp has become full or that there is not enough swap space available. Either memory and swap are being overused, leaving no room for temporary files, or some very large files in /tmp have been moved to the swap area and are not leaving enough free swap for the system. The solution is either to remove the large files from /tmp or to increase swap. Increasing or adding swap space is covered in Chapter 7, "Swap Space."
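A simple way to keep an eye on /tmp is to check how full it is before it causes problems. The sketch below uses only the Python standard library; the 90 percent warning level and the function name usage_percent are arbitrary choices for this example, not Solaris recommendations.

```python
import shutil


def usage_percent(path: str = "/tmp") -> float:
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    WARN_AT = 90.0  # hypothetical warning level
    percent = usage_percent("/tmp")
    print(f"/tmp is {percent:.1f}% full")
    if percent >= WARN_AT:
        print("Consider removing large files from /tmp or adding swap.")
```

Because /tmp shares its space with swap, a high figure here is a hint to look at both the files in /tmp and the overall memory and swap usage of the system.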

PROCFS Type

Each Solaris system has one filesystem of the virtual type PROCFS, mounted at /proc. The filesystem is created automatically by Solaris and, as it is virtual, is memory-based rather than stored on disk. It contains an entry for each running process, with a name equal to the process ID. Killing processes by removing their /proc entries is not recommended, and there should never be any need to perform housekeeping on this filesystem.
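Because each entry under /proc is named after a process ID, listing the running processes amounts to listing the numeric names in that directory. The following is a minimal sketch; the function name list_pids is made up for this example, and it only reads the directory without touching any process.

```python
import os


def list_pids(proc_root: str = "/proc") -> list:
    """Return the numeric entries under proc_root as sorted process IDs."""
    return sorted(
        int(name) for name in os.listdir(proc_root) if name.isdecimal()
    )


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        pids = list_pids()
        print(f"{len(pids)} processes, lowest PID {pids[0] if pids else 'n/a'}")
    else:
        print("No /proc filesystem on this system")
```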

File Descriptor Filesystem (FDFS) Type

A common misconception is that the FDFS type is for floppy disks and that FDFS stands for "Floppy Disk File System." This, however, is not the case. FDFS stands for "File Descriptor File System" and it is actually used by Solaris to allocate file descriptors to setuid and setgid shell scripts. It is a common mistake for administrators to comment the /dev/fd filesystem out of the vfstab file, thinking they do not need it since they do not use floppy disks with their system. If they don't have any setuid or setgid shell scripts they will not see any problems; but if they ever do create any, they will find they won't work and get an error message along the lines of "/dev/fd/3: cannot open."
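The effect of /dev/fd can be seen by opening a file, then reading the same open file again through its /dev/fd name. This is a small sketch of the naming scheme only, not of how Solaris runs setuid scripts; the helper names read_via_dev_fd and demo are made up for this example, and it assumes /dev/fd is mounted.

```python
import os
import tempfile


def read_via_dev_fd(fd: int) -> bytes:
    """Read an already-open descriptor through its /dev/fd/N name."""
    with open(f"/dev/fd/{fd}", "rb") as alias:
        return alias.read()


def demo():
    """Return the bytes read via /dev/fd, or None if /dev/fd is unavailable."""
    if not os.path.isdir("/dev/fd"):
        return None
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello from fd\n")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            return read_via_dev_fd(fd)
        finally:
            os.close(fd)


if __name__ == "__main__":
    print(demo())
```

If /dev/fd has been commented out of vfstab, the open of /dev/fd/N fails in the same way as the "cannot open" error described above.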


    Solaris Operating Environment Boot Camp
    ISBN: 0130342874
    Year: 2002
    Pages: 301
