Fragmentation Adversely Affects Systems Stability


Windows has had a reputation for crashing more often than other operating systems (although this is far less true of Windows 2000 and XP). Part of the problem may well have been user ignorance of fragmentation. While it is broadly accepted that fragmentation degrades system performance, it is less well understood that fragmentation also exacts a severe toll in unreliability and downtime. Simply put, a system is never in a more stable state than when all files are contiguous. The moment a file is broken into pieces, the door opens to a host of stability issues. A badly fragmented system will not only slow down considerably; it will also be slow to reboot (or, in extreme cases, will not reboot at all), will be slow to back up, and will be prone to file corruption, data loss, crashes and hangs, program errors, memory issues, and even hard drive failures.

These consequences of fragmentation have been substantiated by hundreds of IT users of defragmentation products who have been interviewed over the past decade. About half or more have commented that their systems now rarely crash compared with before the defragmentation utility was installed. They attribute this improvement to well-maintained files and file systems, which in turn influence system stability positively.

Numerous documents in the Microsoft Knowledge Base (KB) address how fragmentation severely and insidiously affects system uptime and stability. Failure to understand the source of these problems forces IT staff to troubleshoot in the wrong areas, which often leads to reinstalling software, re-imaging hard drives, and replacing hardware at considerable expense, as well as to an overworked help desk. These problems cut into company profits and lead to unacceptable levels of downtime by forcing system administrators to work reactively rather than proactively. At the root of many of these problems lies fragmentation. Fragmentation erodes stability and reliability in seven main areas, as follows.

Fragmentation and Boot Time

As mentioned earlier, fragmentation is a major factor in slow boot times. Many cases are on record of machines that previously booted in a minute or two taking twenty or thirty minutes to reboot. The condition can deteriorate to the point where a machine will not boot at all, and it affects not only Windows NT but also Windows 2000 and Windows XP. According to Microsoft, this can occur when the NTFS bootsector code contained in logical sector zero of an NTFS volume is unable to locate and load NTLDR[1] into memory because the Master File Table (MFT) is highly fragmented (see Microsoft KB article Q228734, http://support.microsoft.com). Why does this occur? To locate and load NTLDR, the bootsector code must read the volume's MFT to obtain the root directory. When the MFT is highly fragmented, pieces of the MFT and other metadata that must be read in order to locate NTLDR may fall outside the areas of the disk that the INT 13 BIOS routine can read. Thus, the system fails to boot.

Other Microsoft Knowledge Base articles discuss additional manifestations of similar problems. Microsoft KB article Q155892 discusses the situation in which the allocation for NTLDR's $DATA attribute becomes so fragmented that the whole $DATA attribute is no longer in the base FRS (file record segment). Microsoft KB article Q176968 explains that when you attempt to boot from an NTFS system partition, the computer may hang after the power-on self test (POST), and you may receive an error message stating that a kernel file is missing. This can occur if the NTFS disk structure data contained in the MFT is fragmented (as described above), preventing boot up. This bug was thought to have been eradicated in Windows 2000 and XP: updated bootsector code and an updated NTLDR were made available for Windows 2000 with the intention of removing its susceptibility to this situation, and Windows XP "prefetches" boot files and automatically defragments them to accelerate startup. However, neither fix has eliminated the problem of systems periodically failing to reboot.[1]

Fragmentation and Slow Backup Times/Aborted Backups

Backup windows these days are shrinking. While IT used to have twelve or more hours available for backup and maintenance tasks, or even all weekend, they are now expected to perform such tasks in a shorter period. Yet, at the same time, the amount of data to be backed up is growing exponentially. This combination of circumstances leads to two problems. First, system administrators report that lengthy backups leave them no time for other routine maintenance. Second, backups have to be aborted when they take too long and threaten to encroach on the work day.

What does this have to do with fragmentation? Today's systems take a pounding that leaves most hard disks horribly fragmented within a very short period of time. Programs create and delete large numbers of temporary files, documents are written to disk and deleted continually, and, as a result, drives rapidly become fragmented. Documents are commonly found splintered into hundreds and even thousands of pieces. This adds to system overhead as it takes much longer for the computer to read and write fragmented files than contiguous ones. If fragmentation is not quickly addressed, the condition of the drive deteriorates rapidly (see National Software Testing Labs' white paper, System Performance and File Fragmentation in Windows NT, http://www.execsoft.com).
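To make "splintered into hundreds of pieces" concrete, the following minimal sketch counts fragments from a file's extent list. It assumes a simplified representation in which each extent is a (starting cluster, length) pair; the `Extent` type and the `count_fragments` function are hypothetical names for illustration and do not correspond to any Windows API.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (starting cluster, length in clusters).
Extent = Tuple[int, int]


def count_fragments(extents: List[Extent]) -> int:
    """Count the physically separate pieces of a file.

    Extents that sit back to back on disk are treated as one fragment.
    """
    fragments = 0
    prev_end: Optional[int] = None  # cluster just past the previous extent
    for start, length in extents:
        if start != prev_end:
            fragments += 1  # the head must reposition to reach this extent
        prev_end = start + length
    return fragments


contiguous = [(100, 50), (150, 50)]           # second extent follows the first
scattered = [(100, 10), (900, 10), (40, 10)]  # three separate locations

print(count_fragments(contiguous))  # 1
print(count_fragments(scattered))   # 3
```

A contiguous file yields a count of one regardless of how many extents describe it; every additional fragment represents another repositioning of the disk head.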

In addition to slowing systems to a crawl, however, fragmentation multiplies the amount of time required for backup. If all files existed in a contiguous state, backup would occur relatively swiftly; instead, the head must thrash around gathering together numerous fragments before they can be consolidated into one piece and then backed up. The situation is compounded by the fact that most backup software creates a large number of temporary files. This results in even more fragmentation, further slowing the backup process.
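The cost of head thrashing can be approximated with a simple model: one head repositioning per fragment, plus the time to transfer the data itself. The sketch below is a rough illustration only. The seek time, rotational latency, and transfer rate are assumed values chosen for the example, not measurements, and `read_time_ms` is a hypothetical helper.

```python
def read_time_ms(size_mb: float, fragments: int,
                 seek_ms: float = 9.0, rotational_ms: float = 4.0,
                 transfer_mb_per_s: float = 40.0) -> float:
    """Estimate the time to read a file from a mechanical disk.

    Model (assumed): each fragment costs one seek plus one rotational
    delay; the data itself streams at a fixed transfer rate.
    """
    if fragments < 1:
        raise ValueError("a file occupies at least one fragment")
    positioning = fragments * (seek_ms + rotational_ms)
    transfer = size_mb / transfer_mb_per_s * 1000.0
    return positioning + transfer


# A 10-MB file read as one piece versus as 1,000 pieces.
print(read_time_ms(10, 1))     # 263.0
print(read_time_ms(10, 1000))  # 13250.0
```

Under these assumed figures the transfer time is identical in both cases; the entire difference comes from repositioning the head, which is why backup of a heavily fragmented volume takes so much longer.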

This is why many users of enterprise-class defragmenters report backup times shrinking by several hours per night after instituting daily defragmentation of every server and workstation. Because files are consolidated into one piece before they are backed up, a much shorter backup window is required, which frees system administrators to schedule other important maintenance tasks.

The IT analyst group IDC believes that, to maintain optimal system performance, companies need to schedule disk defragmentation on a regular basis for all their servers and workstations; otherwise, files can take 10 to 15 times longer to access, boot time can be tripled, and nightly backups can take hours longer. Similarly, if nightly backups are taking twelve hours (in a fragmented system) instead of four, not enough time may be available to complete a backup. Under these circumstances, IT would be forced to abort the backup before commencement of the work day. Fortunately, regular defragmentation eliminates these backup concerns.
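The twelve-hour versus four-hour comparison can be checked against a backup window with simple date arithmetic. In the sketch below, the 10:00 p.m. start and 8:00 a.m. start of the work day are assumptions made for illustration, and `backup_finishes_in_time` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def backup_finishes_in_time(start: datetime, duration_hours: float,
                            workday_start: datetime) -> bool:
    """Return True if a backup started at `start` ends by `workday_start`."""
    return start + timedelta(hours=duration_hours) <= workday_start


# Assumed schedule: backup starts at 10:00 p.m., work resumes at 8:00 a.m.
start = datetime(2003, 1, 6, 22, 0)
workday = datetime(2003, 1, 7, 8, 0)

print(backup_finishes_in_time(start, 4, workday))   # True
print(backup_finishes_in_time(start, 12, workday))  # False
```

With a ten-hour window, the four-hour backup leaves six hours for other maintenance, while the twelve-hour backup would have to be aborted two hours short of completion.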

Fragmentation and File Corruption/Data Loss

File corruption and data loss are two factors immediately traceable to fragmentation, as verified during recent tests conducted at the research labs of Executive Software. Both Windows 2000 and Windows XP were tested as follows. Technicians ran a utility designed to fragment an NTFS volume. Even though the drive was only 40 percent full, the files themselves were severely fragmented, resulting in the need for many more MFT records than usual. When they attempted to move a contiguous 72-MB file onto that disk, the result was the corruption of everything on the disk, including the 72-MB file. This occurred on Windows 2000 and on Windows XP.

Why does this occur? The presence of excessive file fragments on a disk makes it difficult for the operating system to function efficiently. When a large file is added, large-scale data corruption results. While nothing currently exists within the Microsoft Knowledge Base regarding this specific subject, documentation is available concerning other fragmentation-related corruption/data loss issues.

Receiving this message is not uncommon:

  Windows NT could not start because the following file is missing or corrupt:
  <Winnt_root>\System32\Ntoskrnl.exe.
  Please re-install a copy of the above file.

This form of corruption/data loss led to an inability to boot because some key files needed for booting the operating system were situated beyond cylinder 1023 on the volume. Given the CHS (cylinder/head/sector) setup on the machine, the boot sequence could see only the first 7.68 GB of the volume during the initial boot phase, so the needed file lay beyond the reach of the INT 13 BIOS interface. Deleting the original file and replacing it meant that the new copy fell within the first 7.68 GB. Regular defragmentation would keep system files from becoming too spread around the volume, preventing this situation from recurring. (This issue is covered in more detail in Microsoft KB article Q224526, http://support.microsoft.com.)
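The ceiling imposed by CHS addressing follows directly from the drive geometry. The sketch below assumes the commonly cited maximum geometry for the INT 13 interface (1,024 cylinders, 255 heads, 63 sectors per track, 512 bytes per sector); the geometry of the machine described above is not given, so its limit may differ from the figure computed here. The function names are hypothetical.

```python
def chs_limit_bytes(cylinders: int = 1024, heads: int = 255,
                    sectors_per_track: int = 63,
                    bytes_per_sector: int = 512) -> int:
    """Bytes addressable through CHS with the given (assumed) geometry."""
    return cylinders * heads * sectors_per_track * bytes_per_sector


def within_chs_reach(byte_offset: int) -> bool:
    """True if data at this offset from the start of the disk can be read."""
    return 0 <= byte_offset < chs_limit_bytes()


limit = chs_limit_bytes()
print(limit)                    # 8422686720
print(round(limit / 2**30, 2))  # 7.84

# A boot file relocated 9 GiB into the disk is invisible during early boot.
print(within_chs_reach(9 * 2**30))  # False
```

Cylinders are numbered from 0 to 1023, which is why files situated "beyond cylinder 1023" cannot be found during the initial boot phase.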

Fragmentation and Crashes

Many documented cases of errors and crashes on Windows caused by fragmentation can be found. Types of errors include system hangs, timeouts, failures to load, failures to save data, and blue screens. In one scenario, for instance, a crash takes place when attempting to run CHKDSK on a highly fragmented drive. According to Microsoft, when you attempt to run CHKDSK /F on a drive that is heavily fragmented or that contains bad clusters, Windows NT version 4.0 may halt with a kernel mode trap screen STOP 0x00000024 in Ntfs.sys (see Microsoft KB article Q160451). Another Microsoft KB article (Q165456) explains one cause of a system freeze: the NTFS file system driver attempts to perform I/O to a fragmented file and does not correctly clear a required field, causing either a STOP 0xA or a deadlock condition, and the process stops responding. What this means is that fragmentation can slow down I/O to the point where programs and processes cease to function entirely. When files are in many pieces, they are unavailable to the system when needed and a crash takes place.

Fragmentation and Errors in Programs

Reports of errors in large applications such as Microsoft Outlook or Microsoft Word are also fairly frequent when the applications and/or associated databases are substantially fragmented. As in the previous section, this is related to the sheer size of such applications and the time it takes to physically gather all the parts in order to load properly. In some cases, fragmentation slows down the loading of applications, sometimes significantly, but in other cases the application times out or freezes. In Microsoft Word 2000, for example, an error message may appear stating that there are too many edits in a document. Users are advised to save their work or face data loss (see Microsoft KB article Q224029). This situation is caused by insufficient disk space on the hard disk containing the Windows Temp folder as well as by fragmented or cross-linked files.

Additionally, Windows 2000 can sometimes hang, either during or after startup. During startup, the hang is related to the system hive file becoming too large due to fragmentation. According to Microsoft KB article Q265509, the system hive file is usually the biggest file that is loaded and is likely to be fragmented because it is modified often. If the system hive file is too fragmented, it is not loaded from an NTFS volume and the computer hangs. After startup, Windows 2000 can also hang because the server service is unable to keep up with the demand for network work items that are queued by the network layer of the I/O stream; that is, due to fragmentation, the server service cannot process the requested network I/O items to the hard disk quickly enough and runs out of resources.

Compact disc (CD) writers and other media devices also experience problems caused by fragmentation. Such devices require data to be supplied sequentially in a steady stream. If the associated files are fragmented, this data stream is interrupted as the system struggles to gather together various fragments. This interferes with the quality of video playback and leads to a CD write aborting. Regular defragmentation heightens the reliability of such devices. According to Microsoft KB article Q306524 (How To Copy Information to a CD in Windows XP), CD recording may fail intermittently. The document provides several ways to resolve this issue, and the primary step is to defragment the hard disk containing the data destined for the CD.
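The need for a steady data stream can be illustrated with a toy buffer model: the writer drains a buffer at a constant rate while the disk refills it one chunk at a time, and each chunk may be delayed by the time spent gathering its fragments. This is a deliberately simplified sketch, not a description of how any particular recording software works; every rate and size below is an assumed value, and `write_succeeds` is a hypothetical name.

```python
from typing import Iterable


def write_succeeds(chunk_stalls_s: Iterable[float], chunk_mb: float = 1.0,
                   buffer_mb: float = 2.0, read_mb_per_s: float = 20.0,
                   burn_mb_per_s: float = 1.2) -> bool:
    """Simulate feeding a CD writer; return False on a buffer underrun.

    chunk_stalls_s holds, for each chunk, the extra seconds the disk
    spends seeking between fragments before the chunk is delivered.
    """
    level = buffer_mb  # assume the buffer starts full
    for stall in chunk_stalls_s:
        fetch_time = stall + chunk_mb / read_mb_per_s
        level -= burn_mb_per_s * fetch_time  # the writer drains as we fetch
        if level < 0:
            return False  # the data stream was interrupted
        level = min(buffer_mb, level + chunk_mb)
    return True


print(write_succeeds([0.0] * 50))  # True: contiguous source files
print(write_succeeds([1.0] * 50))  # False: one-second stall per chunk
```

In the contiguous case the disk easily outpaces the writer. Once each chunk incurs a stall, the buffer empties faster than it refills and the write aborts after a few chunks.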

Fragmentation and RAM Use/Cache Problems

Files often become so fragmented that they seem to take an eternity to be read into the cache. In addition to long delays, this situation can lead to system hangs. Similarly, a fragmented paging file creates system stability challenges. For example, an "out of virtual memory" error message can appear on a primary domain controller, with resulting data loss. According to Microsoft KB article Q215859, this occurs when the pagefile.sys file is either not large enough or severely fragmented. It may also cause users to experience problems when they attempt to change their password or gain access to the network. As noted earlier, such memory issues are rooted in the excessive overhead required to assemble files that are scattered around a disk in many pieces. Keeping files consolidated eliminates these memory problems.

Fragmentation and Hard Drive Failure

Fragmentation hastens the onset of hard drive failure by increasing the amount of head movement; consequently, regular defragmentation extends drive longevity. Defragmentation also helps in a second way. When a defragmentation program is run, it attempts to move files, and if it uncovers bad areas on the disk, it directs the user to run CHKDSK. If no defragmentation utility is run, these bad sectors may not receive the attention they deserve. Over time, the number of bad sectors snowballs, leading to corruption of the entire drive.

[1] NTLDR = NT loader: a program loaded from the boot sector that displays the startup menu and helps the OS load.




Server Disk Management in a Windows Environment
Year: 2003
Pages: 197
