8.1 Adding and removing disks from GPFS


Unlike many traditional file systems, GPFS allows disks to be added to and removed from a file system, even while it is mounted. In this section, we outline the procedures for working with disks.

8.1.1 Adding a new disk to an existing GPFS file system

For this task, we will use a command that has not been discussed in this redbook before: mmadddisk. This command adds a new disk to a file system and optionally re-balances data onto the new disk.

Before adding the physical disk to the file system it must first be defined as an NSD. We will use mmcrnsd for this, as in 7.8.1, "GPFS nodeset with NSD network attached servers" on page 213.

In Example 8-1, we create an NSD using the second disk from node001.

Example 8-1: Creating an additional NSD with mmcrnsd

 [root@storage001 root]# cat > newdisk.dsc
 /dev/sdb1:node001-myri0.cluster.com::dataAndMetadata:-1
 ^D
 [root@storage001 root]# mmcrnsd -F newdisk.dsc
 mmcrnsd: Propagating the changes to all affected nodes.
 This is an asynchronous process.
 [root@storage001 root]#
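Rather than typing the descriptor by hand, the line can be generated. The following is a minimal sketch, not part of the original procedure: the function name make_nsd_desc is hypothetical, and the meaning of the five colon-separated fields (device, primary server, backup server, disk usage, failure group) is an assumption inferred from the descriptor in Example 8-1.

```shell
#!/bin/sh
# Build one NSD descriptor line in the five-field layout seen in Example 8-1.
# Field meanings are assumptions inferred from that example:
#   device : primary server : backup server : disk usage : failure group
make_nsd_desc() {
    device=$1
    primary=$2
    backup=$3                      # may be empty, as in Example 8-1
    usage=${4:-dataAndMetadata}    # default taken from Example 8-1
    failure_group=${5:--1}         # default taken from Example 8-1
    printf '%s:%s:%s:%s:%s\n' \
        "$device" "$primary" "$backup" "$usage" "$failure_group"
}

# Reproduce the descriptor from Example 8-1 on standard output.
make_nsd_desc /dev/sdb1 node001-myri0.cluster.com ""
```

Redirecting the output to newdisk.dsc would give the same file that Example 8-1 passes to mmcrnsd -F.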

If the disk (/dev/sdb1 in this case) already contains an NSD descriptor, the mmcrnsd command will fail. In that case, first verify that the disk is not currently being used as an NSD by issuing the mmlsnsd -m command. If it is not in use, you can define the new disk by adding the -v no option to the mmcrnsd command line, which disables disk verification.

Example 8-2 shows the use of the mmlsnsd command to verify that the NSD was created correctly. The newly created NSD should show up as (free disk), since it has not yet been assigned to a file system.

Example 8-2: Verifying correct NSD creation with mmlsnsd

 [root@storage001 root]# mmlsnsd
  File system    NSD name     Primary node                   Backup node
 ---------------------------------------------------------------------------
  gpfs0          gpfs2nsd     storage001-myri0.cluster.com
  (free disk)    gpfs3nsd     node001-myri0.cluster.com
 [root@storage001 root]#
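The check for free NSDs can also be scripted. The sketch below is illustrative only: the function name free_nsds is hypothetical, and it assumes that the mmlsnsd output keeps the column layout shown in Example 8-2, with unassigned NSDs marked by the literal text (free disk). It reads the listing on standard input so that it can be tried without a GPFS cluster.

```shell
#!/bin/sh
# Print the names of NSDs that an mmlsnsd-style listing marks as "(free disk)".
# Reads the listing on standard input; the layout is assumed to match
# Example 8-2, where the NSD name is the third whitespace-separated field
# on such a line ("(free", "disk)", then the name).
free_nsds() {
    awk '/^[ \t]*\(free disk\)/ { print $3 }'
}

# Sample input copied from Example 8-2.
free_nsds <<'EOF'
 File system    NSD name     Primary node             Backup node
---------------------------------------------------------------------------
 gpfs0         gpfs2nsd     storage001-myri0.cluster.com
 (free disk)    gpfs3nsd     node001-myri0.cluster.com
EOF
```

On a live system the same function could be fed from a pipe, for example mmlsnsd | free_nsds.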

Once the NSD has been successfully defined, we can use it to enlarge our GPFS file system with mmadddisk. Because GPFS parallelizes read and write operations across its disks, simply appending the disk to the file system is inefficient: existing data will not be balanced across all the disks, so GPFS will be unable to make optimal use of the new disk. The mmadddisk command can automatically re-balance the data across all the disks when given the -r switch. In large file systems, the re-balance can take a long time, so we also supply the asynchronous switch (-a). This causes the mmadddisk command to return while the re-balance continues in the background.

Note

Although you can still access the file system while it is being re-balanced, certain GPFS metadata commands, including mmdf, cannot be run until the re-balance has completed.

Example 8-3 shows the output of the mmadddisk command.

Example 8-3: Adding a disk to a GPFS file system with mmadddisk

 [root@storage001 root]# mmadddisk gpfs0 -F newdisk.dsc -r -a
 GPFS: 6027-531 The following disks of gpfs0 will be formatted on node storage001.cluster.com:
     gpfs3nsd: size 17767858 KB
 Extending Allocation Map
 GPFS: 6027-1503 Completed adding disks to file system gpfs0.
 mmadddisk: Propagating the changes to all affected nodes.
 This is an asynchronous process.
 [root@storage001 root]#

Tip

Although disks can be re-balanced while the file system is running, re-balancing affects performance. Consider adding the new disk(s) during a period of low activity, or re-balancing the disks at a later time with the mmrestripefs -b command.
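The two approaches (re-balance immediately, or add now and re-balance later) can be compared side by side. This is a sketch under stated assumptions: the run wrapper and all three function names are hypothetical helpers, and by default the script only prints the commands it would issue, so it is safe to try on a machine without GPFS.

```shell
#!/bin/sh
# Print each command instead of running it, unless DRY_RUN=0 is set.
# This wrapper and the function names below are hypothetical helpers.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Option 1: add the disk and re-balance immediately, in the background.
add_and_rebalance() {
    run mmadddisk "$1" -F "$2" -r -a
}

# Option 2: add the disk now ...
add_only() {
    run mmadddisk "$1" -F "$2"
}

# ... and re-balance later, during a period of low activity.
rebalance_later() {
    run mmrestripefs "$1" -b
}

# Show the commands for option 2 without executing them.
add_only gpfs0 newdisk.dsc
rebalance_later gpfs0
```

Review the printed commands first; setting DRY_RUN=0 makes the same functions execute them.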

Using the mmlsdisk and mmlsnsd commands, you can verify that the NSD is now a member of our GPFS file system, as in Example 8-4.

Example 8-4: Verifying the new NSD was added with mmlsdisk and mmlsnsd

 [root@storage001 root]# mmlsdisk gpfs0
 disk         driver   sector failure holds    holds
 name         type       size   group metadata data  status        availability
 ------------ -------- ------ ------- -------- ----- ------------- ------------
 gpfs2nsd     nsd         512      -1 yes      yes   ready         up
 gpfs3nsd     nsd         512      -1 yes      yes   ready         up
 [root@storage001 root]# mmlsnsd
  File system   NSD name     Primary node                   Backup node
 ---------------------------------------------------------------------------
  gpfs0         gpfs2nsd     storage001-myri0.cluster.com
  gpfs0         gpfs3nsd     node001-myri0.cluster.com
 [root@storage001 root]#

You can also verify the capacity of your file system using mmdf, as shown in Example 8-5. Remember that mmdf does not take replication factors into account; if you are using full replication, you will need to divide the reported file system size and free space by two.

Example 8-5: Inspecting file system capacity with mmdf

 [root@storage001 root]# mmdf gpfs0
 disk            disk size  failure holds    holds         free KB         free KB
 name                in KB    group metadata data   in full blocks    in fragments
 --------------- --------- -------- -------- ----- --------------- ---------------
 gpfs2nsd        106518944       -1 yes      yes   106454016 (100%)      1112 ( 0%)
 gpfs3nsd         17767856       -1 yes      yes    17733632 (100%)       656 ( 0%)
                 ---------                          -------------- --------------
 (total)         124286800                         124187648 (100%)      1768 ( 0%)
 Inode Information
 ------------------
 Total number of inodes: 104448
 Total number of free inodes: 104431
 [root@storage001 root]#
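The replication adjustment mentioned before Example 8-5 is simple arithmetic on the (total) line. The sketch below is an illustration only: the function name usable_capacity is hypothetical, it assumes the column layout of Example 8-5 (total size in the second field, free KB in full blocks in the third), and it ignores free space in fragments.

```shell
#!/bin/sh
# Scale the "(total)" line of an mmdf-style listing by a replication factor.
# Reads the listing on standard input and prints
#   "<usable size in KB> <usable free KB in full blocks>".
# Assumes the column layout of Example 8-5 and an integer factor.
usable_capacity() {
    factor=${1:-1}
    awk '$1 == "(total)" { print $2, $3 }' | (
        # Fail if the listing had no "(total)" line.
        read -r total free_kb || exit 1
        echo "$(( total / factor )) $(( free_kb / factor ))"
    )
}

# The totals from Example 8-5, assuming full (two-way) replication.
usable_capacity 2 <<'EOF'
(total)         124286800                         124187648 (100%)      1768 ( 0%)
EOF
```

With the totals from Example 8-5 and a factor of two, this gives 62143400 KB of usable size and 62093824 KB of usable free space.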

8.1.2 Deleting a disk in an active GPFS file system

Although this sounds like a scary thing to do, it is actually perfectly safe; GPFS handles this easily when the system utilization is low. Under load, it may take a significant amount of time.

Removal is accomplished with the GPFS mmdeldisk command, passing the file system name and the disk (NSD) you want to delete. As with mmadddisk, you can specify the -r option to re-stripe the file system and the -a option to perform the re-stripe in the background.

Important:

The disk to be deleted by the mmdeldisk command must be up and running for this command to succeed; you can verify this by using the mmlsdisk command. If you need to delete a damaged disk, you must use the -p option so that mmdeldisk can delete a stopped disk.
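The availability check described above can be automated before calling mmdeldisk. This is a minimal sketch, assuming the eight-column mmlsdisk layout shown in Example 8-4 (status in the seventh column, availability in the eighth); the function name disk_is_up is hypothetical.

```shell
#!/bin/sh
# Succeed only if the named disk is listed with status "ready" and
# availability "up". Reads an mmlsdisk-style listing on standard input;
# the column positions are assumed from Example 8-4.
disk_is_up() {
    awk -v name="$1" '
        $1 == name && $7 == "ready" && $8 == "up" { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Sample rows copied from Example 8-4.
if disk_is_up gpfs3nsd <<'EOF'
gpfs2nsd     nsd         512      -1 yes      yes   ready         up
gpfs3nsd     nsd         512      -1 yes      yes   ready         up
EOF
then
    echo "gpfs3nsd is up: mmdeldisk can proceed"
else
    echo "gpfs3nsd is not up: consider the -p option"
fi
```

On a live system the listing would come from a pipe, for example mmlsdisk gpfs0 | disk_is_up gpfs3nsd.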

Example 8-6 shows the removal of gpfs3nsd that we just added to our file system.

Example 8-6: Removing a disk from GPFS with mmdeldisk

 [root@storage001 root]# mmdeldisk gpfs0 gpfs3nsd -r -a
 Deleting disks ...
 GPFS: 6027-589 Scanning file system metadata, phase 1 ...
   31 % complete on Thu Nov 24 17:12:55 2002
   62 % complete on Thu Nov 24 17:12:58 2002
   93 % complete on Thu Nov 24 17:13:01 2002
  100 % complete on Thu Nov 24 17:13:02 2002
 GPFS: 6027-552 Scan completed successfully.
 GPFS: 6027-589 Scanning file system metadata, phase 2 ...
 GPFS: 6027-552 Scan completed successfully.
 GPFS: 6027-589 Scanning file system metadata, phase 3 ...
 GPFS: 6027-552 Scan completed successfully.
 GPFS: 6027-565 Scanning user file metadata ...
   84 % complete on Thu Nov 24 17:13:08 2002
  100 % complete on Thu Nov 24 17:13:08 2002
 GPFS: 6027-552 Scan completed successfully.
 GPFS: 6027-370 tsdeldisk completed.
 mmdeldisk: Propagating the changes to all affected nodes.
 This is an asynchronous process.
 [root@storage001 root]#

Again, we can use the mmlsdisk and mmlsnsd commands to verify the successful removal of the disk, as in Example 8-7.

Example 8-7: Verifying successful disk removal with mmlsdisk and mmlsnsd

 [root@storage001 root]# mmlsdisk gpfs0
 disk         driver   sector failure holds    holds
 name         type       size   group metadata data  status        availability
 ------------ -------- ------ ------- -------- ----- ------------- ------------
 gpfs2nsd     nsd         512      -1 yes      yes   ready         up
 # mmlsnsd
  File system   NSD name    Primary node                   Backup node
 ---------------------------------------------------------------------------
  gpfs0         gpfs2nsd    storage001-myri0.cluster.com
  (free disk)   gpfs3nsd    node001-myri0.cluster.com
 [root@storage001 root]#

Example 8-8 on page 230 shows how we could also have used mmlsnsd -F to show only free NSDs in our nodeset.

Example 8-8: Listing free NSDs with mmlsnsd -F

 [root@storage001 root]# mmlsnsd -F
  File system   NSD name     Primary node                Backup node
 ---------------------------------------------------------------------------
  (free disk)   gpfs3nsd     node001-myri0.cluster.com
 [root@storage001 root]#

8.1.3 Replacing a failing disk in an existing GPFS file system

GPFS allows a failing disk to be replaced while the file system is up and running by using the mmrpldisk command. Although replacing a disk with one of a different size is supported, complications can arise, and it should be avoided whenever possible. It is further recommended that you do not change the disk usage (data/metadata) or failure group if possible.

Important:

You cannot use the mmrpldisk command to replace disks that have actually failed; they should be removed with the mmdeldisk -p command. Verify disks are available and up with the mmlsdisk command before attempting this procedure.

As when adding a new disk, the replacement disk must first be defined as an NSD. Example 8-9 shows the definition of the disk with the mmcrnsd command.

Example 8-9: Defining a replacement disk with mmcrnsd

 [root@storage001 tmp]# cat > rpldisk.dsc
 /dev/sdb1:node002-myri0.cluster.com::dataAndMetadata:-1
 ^D
 [root@storage001 tmp]# mmcrnsd -F rpldisk.dsc -v no
 mmcrnsd: Propagating the changes to all affected nodes.
 This is an asynchronous process.
 [root@storage001 tmp]#

Now we can run the mmrpldisk command to actually perform the replacement, as shown in Example 8-10. We replace the failing disk gpfs3nsd with the newly created NSD.

Example 8-10: Replacing a disk with mmrpldisk

 [root@storage001 tmp]# mmrpldisk gpfs0 gpfs3nsd -F rpldisk.dsc
 Replacing gpfs3nsd ...
 GPFS: 6027-531 The following disks of gpfs0 will be formatted on node storage001.cluster.com:
     gpfs5nsd: size 17767858 KB
 Extending Allocation Map
 GPFS: 6027-1503 Completed adding disks to file system gpfs0.
 GPFS: 6027-589 Scanning file system metadata, phase 1 ...
   66 % complete on Thu Nov 24 17:42:44 2002
  100 % complete on Thu Nov 24 17:42:45 2002
 GPFS: 6027-552 Scan completed successfully.
 GPFS: 6027-589 Scanning file system metadata, phase 2 ...
 GPFS: 6027-552 Scan completed successfully.
 GPFS: 6027-589 Scanning file system metadata, phase 3 ...
 GPFS: 6027-552 Scan completed successfully.
 GPFS: 6027-565 Scanning user file metadata ...
 GPFS: 6027-552 Scan completed successfully.
 Done
 mmrpldisk: Propagating the changes to all affected nodes.
 This is an asynchronous process.
 [root@storage001 tmp]#

Note

If you are replacing the failing disk with an identical disk (same size, usage, and failure group), no re-balance is required. Otherwise, you may want to run the mmrestripefs -b command at a time when the system is not heavily loaded.
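The comparison in the note above (size, usage, and failure group) can be sketched as a small helper. Everything here is an assumption made for illustration: the function name replacement_diff is hypothetical, and the size:usage:failure_group triple is a convenience format for this sketch, not something GPFS produces.

```shell
#!/bin/sh
# Compare an old and a new disk, each given as "size:usage:failure_group",
# and print the names of the attributes that differ, separated by spaces.
# Empty output means the disks are identical in all three respects.
# The triple format is a hypothetical convenience for this sketch.
replacement_diff() {
    old=$1
    new=$2
    diffs=""
    i=1
    for field in size usage failure_group; do
        a=$(printf '%s\n' "$old" | cut -d: -f"$i")
        b=$(printf '%s\n' "$new" | cut -d: -f"$i")
        [ "$a" = "$b" ] || diffs="$diffs $field"
        i=$((i + 1))
    done
    echo "${diffs# }"
}

# Identical replacement: prints an empty line, so no re-balance is required.
replacement_diff 17767858:dataAndMetadata:-1 17767858:dataAndMetadata:-1
# Larger replacement disk: prints "size", so a later re-balance may be useful.
replacement_diff 17767858:dataAndMetadata:-1 35535716:dataAndMetadata:-1
```

Any non-empty result suggests scheduling mmrestripefs -b for a quiet period, as the note describes.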

We can now verify that the failing disk has been removed from the file system, as shown in Example 8-11.

Example 8-11: Using mmlsdisk and mmlsnsd to ensure a disk has been replaced

 [root@storage001 tmp]# mmlsdisk gpfs0
 disk         driver   sector failure holds    holds
 name         type       size   group metadata data  status        availability
 ------------ -------- ------ ------- -------- ----- ------------- ------------
 gpfs2nsd     nsd         512      -1 yes      yes   ready         up
 gpfs4nsd     nsd         512      -1 yes      yes   ready         up
 # mmlsnsd -F
  File system   NSD name     Primary node                Backup node
 ---------------------------------------------------------------------------
  (free disk)   gpfs3nsd     node001-myri0.cluster.com
 [root@storage001 tmp]#
