7.8 Disk definitions


In this section, we show you how to define disks with the NSD server (see 4.2.2, "GPFS Network Shared Disk considerations" on page 81 for information on the disk access models in a GPFS cluster).


GPFS does not support using different disk access models within the same nodeset.

7.8.1 GPFS nodeset with NSD network attached servers

A nodeset with NSD network attached servers means that all access to the disks and all replication go through one or two storage attached servers (also known as storage nodes). If your cluster has an internal network segment, this segment will be used for this purpose.

As mentioned in 4.2.2, "GPFS Network Shared Disk considerations" on page 81, NSD network attached disks are connected to one or two storage attached servers only. If a disk is defined with only one storage attached server and that server fails, the disk becomes unavailable to GPFS. If the disk is defined with two NSD network attached servers and the primary server fails, GPFS automatically transfers the I/O requests to the backup server.

Creating Network Shared Disks (NSDs)

You will need to create a descriptor file before creating your NSDs. This file contains one line for each disk that will become an NSD. Each line has five colon-separated fields, in the following order:

Device name

The real device name of the external storage partition (such as /dev/sde1).

Primary server

The host name of the server that the disk is attached to. Remember that you must always use the node names defined in the cluster definitions.

Secondary server

The server where the secondary disk attachment is connected.

Disk usage

The kind of information that should be stored on this disk. The valid values are data, metadata, and dataAndMetadata (default).

Failure group

An integer value (0 to 4000) that identifies the failure group to which this disk belongs. All disks with a common point of failure must belong to the same failure group. The value -1 indicates that the disk has no common point of failure with any other disk in the file system. GPFS uses the failure group information to ensure that no two replicas of data or metadata are placed in the same group and thereby become unavailable due to a single failure. When this field is not specified, GPFS automatically assigns a failure group (higher than 4000) to each disk.

Example 7-25 shows a descriptor file named /tmp/descfile, which contains NSD information defined in our cluster.

Example 7-25: /tmp/descfile file

start example
[root@storage001 root]# cat /tmp/descfile
/dev/sdd1:storage001-myri0.cluster.com::dataAndMetadata:1
/dev/sde1:storage001-myri0.cluster.com::dataAndMetadata:1
/dev/sdf1:storage001-myri0.cluster.com::dataAndMetadata:1
[root@storage001 root]#
end example
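Because mmcrnsd reads this file directly, it can be worth checking the layout first. The following is a minimal sketch, not a GPFS tool: check_descfile is a hypothetical helper name, and the rules it applies (five colon-separated fields, the three disk usage values, a failure group between -1 and 4000) are taken only from the field descriptions above.

```shell
# check_descfile FILE
# Hypothetical helper (not part of GPFS): reports descriptor lines that do
# not match the five-field layout described in this section.
check_descfile() {
    awk -F: '
        # Skip comment lines and blank lines.
        /^[ \t]*#/ || /^[ \t]*$/ { next }
        NF != 5 {
            printf "line %d: expected 5 fields, got %d\n", NR, NF
            bad = 1
            next
        }
        $1 == "" {
            printf "line %d: missing device name\n", NR
            bad = 1
        }
        $4 != "" && $4 != "data" && $4 != "metadata" && $4 != "dataAndMetadata" {
            printf "line %d: invalid disk usage \"%s\"\n", NR, $4
            bad = 1
        }
        $5 != "" && ($5 !~ /^-?[0-9]+$/ || ($5 + 0) < -1 || ($5 + 0) > 4000) {
            printf "line %d: failure group \"%s\" outside -1..4000\n", NR, $5
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

Running check_descfile /tmp/descfile prints one message per problem line and returns a non-zero exit status if any problem was found.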

Now we can create the Network Shared Disks by using the mmcrnsd command, as shown in Example 7-26.

Example 7-26: mmcrnsd command

start example
[root@storage001 root]# mmcrnsd -F /tmp/descfile
mmcrnsd: Propagating the changes to all affected nodes.
This is an asynchronous process.
[root@storage001 root]#
end example

After successfully creating the NSDs for the GPFS cluster, mmcrnsd comments out each original disk device line and puts the GPFS-assigned global name for that disk on the following line. Example 7-27 shows the modification that was made by the mmcrnsd command.

Example 7-27: /tmp/descfile (modified)

start example
[root@storage001 root]# cat /tmp/descfile
# /dev/sdd1:storage001-myri0.cluster.com::dataAndMetadata:1
gpfs1nsd:::dataAndMetadata:1
# /dev/sde1:storage001-myri0.cluster.com::dataAndMetadata:1
gpfs2nsd:::dataAndMetadata:1
# /dev/sdf1:storage001-myri0.cluster.com::dataAndMetadata:1
gpfs3nsd:::dataAndMetadata:1
[root@storage001 root]#
end example
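If you want to script against the rewritten descriptor file, the assigned NSD names are simply the first field of every line that is not commented out. A minimal sketch, assuming the layout shown in Example 7-27 (nsd_names is a hypothetical helper name, not a GPFS command):

```shell
# nsd_names FILE
# Hypothetical helper: prints the first field of each active (uncommented,
# non-blank) line of a descriptor file that mmcrnsd has already rewritten.
nsd_names() {
    awk -F: '
        /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
        { print $1 }
    ' "$1"
}
```

For the file in Example 7-27, nsd_names /tmp/descfile would list gpfs1nsd, gpfs2nsd, and gpfs3nsd, one per line.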

Sometimes the disk you use to create a new NSD previously held an NSD that has since been removed. In this situation, mmcrnsd may complain that the disk is already an NSD. Example 7-28 shows an example of the error message.

Example 7-28: Error message in mmcrnsd output

start example
[root@node001 root]# mmcrnsd -F /tmp/descfile
mmcrnsd: Disk descriptor /dev/sde1:node001::dataAndMetadata:1 refers to an existing NSD
[root@storage001 root]#
end example

In this case, if you are sure that the disk is not an in-use NSD, you can override the check by using the -v no option. The default value is -v yes, which verifies all devices. See Example 7-29 for details.

Example 7-29: -v no option

start example
[root@node001 root]# mmcrnsd -F /tmp/descfile -v no
mmcrnsd: Propagating the changes to all affected nodes.
This is an asynchronous process.
[root@storage001 root]#
end example

You can see the new device names by using the mmlsnsd command. Example 7-30 shows the output of the mmlsnsd command.

Example 7-30: mmlsnsd command

start example
[root@storage001 root]# mmlsnsd
 File system   NSD name     Primary node                   Backup node
---------------------------------------------------------------------------
 (free disk)   gpfs1nsd     storage001-myri0.cluster.com
 (free disk)   gpfs2nsd     storage001-myri0.cluster.com
 (free disk)   gpfs3nsd     storage001-myri0.cluster.com
[root@storage001 root]#
end example

You can also use the -m or -M parameter in the mmlsnsd command to see the mapping between the specified global NSD disk names and their local device names. Example 7-31 shows the output of the mmlsnsd -m command.

Example 7-31: mmlsnsd -m output

start example
[root@storage001 root]# mmlsnsd -m
 NSD name     PVID               Device      Node name                      Remarks
-------------------------------------------------------------------------------------
 gpfs1nsd     0A00038D3DB70FE0   /dev/sdd1   storage001-myri0.cluster.com   primary node
 gpfs2nsd     0A00038D3DB70FE1   /dev/sde1   storage001-myri0.cluster.com   primary node
 gpfs3nsd     0A00038D3DB70FE2   /dev/sdf1   storage001-myri0.cluster.com   primary node
[root@storage001 root]#
end example

Creating the GPFS file system

Once you have your NSDs ready, you can create the GPFS file system. In order to create the file system, you will use the mmcrfs command, where you must define the following attributes in this order:

  1. The mount point.

  2. The name of the device for the file system.

  3. The descriptor file (-F).

  4. The name of the nodeset the file system will reside on (-C) if you defined a nodeset name when creating the cluster.

The mmcrfs command will format the NSDs and get them ready to mount the file system, as well as adding an entry in the /etc/fstab file for the new file system and its mounting point.

Some of the optional parameters are:

-A [yes|no]

Auto-mount the file system. The default value is yes.


-B BlockSize

Block size for the file system. The default value is 256 KB, and it can be changed to 16 KB, 64 KB, 512 KB, or 1024 KB. If you plan to have a file system with a block size of 512 KB or 1024 KB, you must also set the value of the maxblocksize nodeset parameter using the mmchconfig command.

-M MaxMetadataReplicas

Maximum metadata replicas (maxMetadataReplicas). The default value is 1 and might be changed to 2.

-r DefaultDataReplicas

Default data replicas. The default value is 1 and valid values are 1 and 2. This factor cannot be larger than maxDataReplicas.

-R MaxDataReplicas

Maximum data replicas (maxDataReplicas). The default value is 1 and another valid value is 2.

-m DefaultMetadataReplicas

Default metadata replicas. The default value is 1 and another valid value is 2. This factor cannot be larger than maxMetadataReplicas.

-n NumNodes

Estimated number of nodes that will mount the file system. The default value is 32, and it is used to estimate the size of the data structures for the file system.

Some of the information above must be defined during the creation of the file system and cannot be changed later. These parameters are:

  • Block size

  • maxDataReplicas

  • maxMetadataReplicas

  • NumNodes

The rest of the file system parameters can be changed with the mmchfs command.
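Since block size and the maximum replica counts cannot be changed after creation, it may help to sanity-check the planned values before running mmcrfs. The sketch below only encodes the constraints stated in this section; check_fs_options is a hypothetical helper name and is not part of GPFS.

```shell
# check_fs_options BLOCK_KB DEF_DATA MAX_DATA DEF_META MAX_META
# Hypothetical helper: checks planned mmcrfs values against the rules
# described in this section. Returns 0 if the values are acceptable.
check_fs_options() {
    block_kb=$1
    def_data=$2
    max_data=$3
    def_meta=$4
    max_meta=$5

    case $block_kb in
        16|64|256) ;;
        512|1024)
            echo "note: set maxblocksize with mmchconfig for a ${block_kb} KB block size" ;;
        *)
            echo "error: unsupported block size ${block_kb} KB"
            return 1 ;;
    esac

    # Every replica value must be 1 or 2.
    for v in "$def_data" "$max_data" "$def_meta" "$max_meta"; do
        case $v in
            1|2) ;;
            *) echo "error: replica values must be 1 or 2"; return 1 ;;
        esac
    done

    # Defaults cannot be larger than the corresponding maximums.
    [ "$def_data" -le "$max_data" ] || {
        echo "error: default data replicas exceed maxDataReplicas"
        return 1
    }
    [ "$def_meta" -le "$max_meta" ] || {
        echo "error: default metadata replicas exceed maxMetadataReplicas"
        return 1
    }
    return 0
}
```

For example, check_fs_options 256 1 1 1 1 corresponds to the defaults and prints nothing, while a block size of 512 or 1024 produces a reminder about maxblocksize.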

When creating our GPFS file systems in our lab environment, we used the default settings for block size, number of nodes to mount the file system on, maxDataReplicas, and maxMetadataReplicas (both replica values default to 1), as shown in Example 7-32.

Example 7-32: Create NSD file system

start example
[root@storage001 root]# mmcrfs /gpfs gpfs0 -F /tmp/descfile -A yes
The following disks of gpfs0 will be formatted on node storage001.cluster.com:
    gpfs1nsd: size 71007268 KB
    gpfs2nsd: size 71007268 KB
    gpfs3nsd: size 71007268 KB
Formatting file system ...
Creating Inode File
  19 % complete on Wed Oct 23 16:24:14 2002
  39 % complete on Wed Oct 23 16:24:19 2002
  59 % complete on Wed Oct 23 16:24:24 2002
  78 % complete on Wed Oct 23 16:24:29 2002
  98 % complete on Wed Oct 23 16:24:34 2002
 100 % complete on Wed Oct 23 16:24:35 2002
Creating Allocation Maps
Clearing Inode Allocation Map
Clearing Block Allocation Map
Flushing Allocation Maps
Completed creation of file system /dev/gpfs0.
mmcrfs: Propagating the changes to all affected nodes.
This is an asynchronous process.
[root@storage001 root]#
end example

Sometimes you may receive an error message when creating a file system using NSDs that were in use for other file systems, as shown in Example 7-33. If you are sure that the disks are not in use anymore, you can override the verification by issuing the mmcrfs command with the -v no option. The output would be the same as the previous example.

Example 7-33: Error message with mmcrfs command

start example
[root@storage001 root]# mmcrfs /gpfs gpfs0 -F /tmp/descfile -A yes
mmcrfs: There is already an existing file system using gpfs0
[root@storage001 root]#
end example

After creating the file system, you can run the mmlsfs command to display your file system attributes. Example 7-34 on page 218 shows the output of the mmlsfs command.

Example 7-34: mmlsfs command output

start example
[root@storage001 root]# mmlsfs gpfs0
flag value          description
---- -------------- -----------------------------------------------------
 -s  roundRobin     Stripe method
 -f  8192           Minimum fragment size in bytes
 -i  512            Inode size in bytes
 -I  16384          Indirect block size in bytes
 -m  1              Default number of metadata replicas
 -M  1              Maximum number of metadata replicas
 -r  1              Default number of data replicas
 -R  1              Maximum number of data replicas
 -D  posix          Do DENY WRITE/ALL locks block NFS writes(cifs) or not(posix)?
 -a  1048576        Estimated average file size
 -n  32             Estimated number of nodes that will mount file system
 -B  262144         Block size
 -Q  none           Quotas enforced
     none           Default quotas enabled
 -F  104448         Maximum number of inodes
 -V  6.00           File system version. Highest supported version: 6.00
 -d  gpfs1nsd       Disks in file system
 -A  yes            Automatic mount option
 -C  1              GPFS nodeset identifier
 -E  no             Exact mtime default mount option
 -S  no             Suppress atime default mount option
 -o  none           Additional mount options
[root@storage001 root]#
end example

You may also run mmlsdisk to display the current configuration and state of the disks in a file system. Example 7-35 shows the output of the mmlsdisk command.

Example 7-35: mmlsdisk command

start example
[root@storage001 root]# mmlsdisk gpfs0
disk         driver   sector failure holds    holds
name         type       size   group metadata data  status        availability
------------ -------- ------ ------- -------- ----- ------------- ------------
gpfs1nsd     nsd         512       1 yes      yes   ready         up
gpfs2nsd     nsd         512       1 yes      yes   ready         up
gpfs3nsd     nsd         512       1 yes      yes   ready         up
[root@storage001 root]#
end example
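With more than a handful of disks, it is easier to filter this output than to read it. The following sketch assumes the eight-column layout and three header lines shown in Example 7-35; disks_not_up is a hypothetical helper name, and the column positions are an assumption that may not hold for other GPFS releases.

```shell
# disks_not_up
# Hypothetical helper: reads mmlsdisk output on standard input and prints
# the name of every disk whose status is not "ready" or whose availability
# is not "up". Exits non-zero if at least one such disk is found.
disks_not_up() {
    awk '
        # Skip the three header lines, then check columns 7 and 8.
        NR > 3 && NF >= 8 && ($7 != "ready" || $8 != "up") {
            print $1
            found = 1
        }
        END { exit found }
    '
}
```

A typical use would be to pipe the command into the helper, as in mmlsdisk gpfs0 | disks_not_up, and act on the exit status.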

After creating the file system, GPFS will add an entry for the new file system in /etc/fstab, as shown in Example 7-36 on page 219.

Example 7-36: /etc/fstab file

start example
[root@storage001 root]# less /etc/fstab
...
/dev/gpfs0              /gpfs                      gpfs dev=/dev/gpfs0,autostart 0 0
...
[root@storage001 root]#
end example
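To confirm the entry without paging through the whole file, you can select the lines whose file system type is gpfs. This is a minimal sketch that relies only on the standard fstab column order; gpfs_fstab_entries is a hypothetical helper name.

```shell
# gpfs_fstab_entries [FILE]
# Hypothetical helper: prints "device mountpoint" for each uncommented
# entry of type gpfs. FILE defaults to /etc/fstab.
gpfs_fstab_entries() {
    awk '$1 !~ /^#/ && $3 == "gpfs" { print $1, $2 }' "${1:-/etc/fstab}"
}
```

For the entry in Example 7-36, this would print /dev/gpfs0 /gpfs.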

Mount GPFS file system

The newly created GPFS file system is not mounted automatically right after you install the GPFS cluster. To mount the GPFS file system on all nodes after creating the GPFS cluster, go to the management node and run:

 # dsh -av mount /gpfs 

Unless you use the -A no parameter with the mmcrfs command, your GPFS file system will be mounted automatically every time you start GPFS.


When trying to mount the GPFS file system in our ITSO lab environment using the mount /dev/gpfs /gpfs command, we received several kernel error messages. This may have been caused by the fact that the system does not know the file system type that GPFS uses (GPFS file system).

7.8.2 GPFS nodeset with direct attached disks

The creation of the disks in an environment with direct attached disks is quite similar to the steps described for the environment with NSD servers in 7.8.1, "GPFS nodeset with NSD network attached servers" on page 213. The differences relate to how the disks will be accessed.

Defining disks

In this case, the disks will not be attached to one server only, but all disks will have a direct connection to all of the nodes through the Fibre Channel switch. Therefore, there will be no need to specify primary or secondary servers for any of the disks, and the disk description file will have the second, third, and fifth fields specified in a different way.

The primary and secondary server fields must be left null, and the last field must indicate that there is no common point of failure with any other disk in the nodeset. This can be done by specifying a failure group of -1, as in Example 7-37 on page 220.

Example 7-37: Failure group of -1

start example
[root@storage /tmp]# cat disk_def
/dev/sda:::dataAndMetadata:-1
/dev/sdb:::dataAndMetadata:-1
/dev/sdc:::dataAndMetadata:-1
/dev/sdd:::dataAndMetadata:-1
/dev/sde:::dataAndMetadata:-1
[root@storage /tmp]#
end example
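Because every line in a direct attached descriptor file follows the same pattern, the file can be generated from a list of device names. A minimal sketch, assuming all disks hold both data and metadata as in Example 7-37 (direct_descfile is a hypothetical helper name):

```shell
# direct_descfile DEVICE...
# Hypothetical helper: prints one descriptor line per device, with empty
# primary and secondary server fields and a failure group of -1.
direct_descfile() {
    for dev in "$@"; do
        printf '%s:::dataAndMetadata:-1\n' "$dev"
    done
}
```

For example, direct_descfile /dev/sda /dev/sdb > disk_def would write the first two lines of Example 7-37.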

After defining the disks, you can verify them using the mmlsnsd command with the -M option, which shows all the disks for all the nodes, as in Example 7-38.

Example 7-38: mmlsnsd command

start example
[root@node1 /root]# mmlsnsd -M
 NSD name     PVID               Device           Node name          Remarks
---------------------------------------------------------------------------------------
 gpfs1nsd     C0A800E93BE1DAF6   /dev/sda         node1              directly attached
 gpfs1nsd     C0A800E93BE1DAF6   /dev/sdb         node2              directly attached
 gpfs1nsd     C0A800E93BE1DAF6   /dev/sdb         node3              directly attached
 gpfs2nsd     C0A800E93BE1DAF7   /dev/sdb         node1              directly attached
 gpfs2nsd     C0A800E93BE1DAF7   /dev/sdc         node2              directly attached
 gpfs2nsd     C0A800E93BE1DAF7   /dev/sdc         node3              directly attached
 gpfs3nsd     C0A800E93BE1DAF8   /dev/sdc         node1              directly attached
 gpfs3nsd     C0A800E93BE1DAF8   /dev/sdd         node2              directly attached
 gpfs3nsd     C0A800E93BE1DAF8   /dev/sdd         node3              directly attached
 gpfs4nsd     C0A800E93BFA7D86   /dev/sdd         node1              directly attached
 gpfs4nsd     C0A800E93BFA7D86   /dev/sde         node2              directly attached
 gpfs4nsd     C0A800E93BFA7D86   /dev/sde         node3              directly attached
 gpfs5nsd     C0A800E93BFA7D87   /dev/sde         node1              directly attached
 gpfs5nsd     C0A800E93BFA7D87   /dev/sdf         node2              directly attached
 gpfs5nsd     C0A800E93BFA7D87   /dev/sdf         node3              directly attached
[root@node1 /root]#
end example

It is very important to note that it is not mandatory for the servers to have the same disk structure or number of internal disks. The names of the disks can be different for each server. For example, in Example 7-38, you can verify that the first disk, with disk ID C0A800E93BE1DAF6, is named /dev/sda on node1, while on node2 its name is /dev/sdb.

GPFS refers to the disks using the disk ID, so you do not have to worry about the /dev/ names for the disks being different among the GPFS nodes.
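To see which local device name a given disk ID maps to on each node, you can filter the mmlsnsd -M output by its PVID column. The sketch below assumes the column layout of Example 7-38; devices_for_pvid is a hypothetical helper name, not a GPFS command.

```shell
# devices_for_pvid PVID
# Hypothetical helper: reads mmlsnsd -M output on standard input and prints
# "node device" for every row whose PVID column matches the argument.
devices_for_pvid() {
    awk -v pvid="$1" '
        # Data rows have the PVID in column 2 and a /dev/ path in column 3.
        $2 == pvid && $3 ~ /^\/dev\// { print $4, $3 }
    '
}
```

For instance, mmlsnsd -M | devices_for_pvid C0A800E93BE1DAF6 would list one line per node with the device name that node uses for that disk.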


Linux Clustering with CSM and GPFS
ISBN: 073849870X
Year: 2003
Pages: 123
Authors: IBM Redbooks
