The Natural Progression of Troubleshooting




First Order of Business: What Problem Are You Troubleshooting?

First things first: We need a problem to troubleshoot. How do you determine what the problem is? You may see a variety of symptoms, but understanding the root problem is key to troubleshooting. Symptoms include the following:

  • Performance is slow.

  • Backups fail.

  • Restores fail.

  • You cannot start a backup on client A.

  • Backups start out fast, but decline.

    And a favorite of every technical support person:

  • Nothing has changed, and all of a sudden, all of your backup jobs won't start.

So are these symptoms or the actual problem? To find out, drill down into the issue, going as deep as is reasonable. If it is a performance problem, is it tape drive performance, client performance, network performance, or disk performance? How do we determine where the problem lies? What about failed backups? Could the cause be bad media or a bad tape drive? In this chapter, we present an outline for troubleshooting that we have found very helpful in our years of assisting clients.
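The drill-down from symptom to candidate component can be sketched as a simple lookup. This is a minimal illustration, not a diagnostic tool: the candidate lists come from the questions in the paragraph above, while the names `DRILL_DOWN` and `next_questions` are hypothetical and chosen only for this example.

```python
# Hypothetical sketch: map an observed symptom to the narrower questions
# the text suggests asking. The candidate components come from the
# paragraph above; the identifiers are illustrative assumptions.
DRILL_DOWN = {
    "slow performance": ["tape drive", "client", "network", "disk"],
    "failed backup": ["media", "tape drive"],
}


def next_questions(symptom):
    """Return one narrowing question per candidate component."""
    key = symptom.lower().strip()
    candidates = DRILL_DOWN.get(key, [])
    if not candidates:
        # Unknown symptom: fall back to the question that opens the chapter.
        return ["What exactly is the problem?"]
    return [f"Is the {c} the source of the {key}?" for c in candidates]


if __name__ == "__main__":
    for question in next_questions("Slow performance"):
        print(question)
```

Each answer either rules a component out or becomes the next symptom to drill into, which is the sense in which a symptom is narrowed toward a root problem.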

As mentioned, the key is to determine exactly what the problem is. Here is an example from an actual client that had a couple of issues. First, the client saw that its tape usage had increased considerably, although neither the amount of data being backed up nor the backup policies had changed. The assumption was that the tape drives were at fault: either they were not writing to the end of the tape, or they had firmware problems that caused data to be written improperly. To add fuel to the fire, one of the tape manufacturers whose tapes the client was using announced that some of its tapes were flawed because of a servo problem at its plant. With an affected tape, the backup job would either fail with bad media or write successfully until the tape was full or failed with media errors. This led the client to the quick conclusion that the tape cartridges themselves, possibly along with the drives, were causing the problems.
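The client's first symptom, more tapes consumed for the same amount of data, can be expressed as a drop in the average data written per tape. The sketch below only quantifies that symptom; it says nothing about the cause. The function names, the 10 percent tolerance, and all figures are made-up assumptions for illustration and do not come from the client in the example.

```python
def data_per_tape(total_gb_written, tapes_used):
    """Average amount of data landing on each tape, in GB."""
    if tapes_used <= 0:
        raise ValueError("tapes_used must be positive")
    return total_gb_written / tapes_used


def fill_dropped(before, after, tolerance=0.10):
    """True if the per-tape average fell by more than `tolerance` (a fraction)."""
    return after < before * (1 - tolerance)


# Made-up numbers for illustration only: same data volume, more tapes.
before = data_per_tape(4000, 20)
after = data_per_tape(4000, 32)

if __name__ == "__main__":
    print(f"before: {before} GB/tape, after: {after} GB/tape")
    print("per-tape fill dropped:", fill_dropped(before, after))
```

A drop like this confirms that something is consuming tape faster than before, but it cannot by itself distinguish bad media, a drive fault, or a configuration change.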

Upon arriving on-site to have a look at the problem, we asked the administrators several questions, such as 'What has changed in the last three months on the system?' Naturally, the answer was, 'Nothing has changed. It has been fairly static.' The key words are 'fairly static,' which means that something had changed. Because we didn't know exactly what, we began assessing the problem.






Implementing Backup and Recovery: The Readiness Guide for the Enterprise
ISBN: 0471227145
Year: 2005
Pages: 176
