Establishing Maintenance Schedules for SharePoint


Maintaining SharePoint systems isn't an easy task for administrators. They must find time amid their firefighting efforts to plan and perform maintenance on the server systems. When maintenance tasks are routine in an environment, they can eliminate many of the problems that lead to firefighting in the first place.

The processes and procedures for maintaining the Windows Server 2003 systems that host SharePoint can be grouped by how often a particular aspect of the environment needs attention. Some maintenance procedures require daily attention, whereas others may require only yearly checkups. The maintenance processes and procedures that an organization follows depend strictly on the organization; however, the categories described in the following sections and their corresponding procedures are best practices for organizations of all sizes and varying IT infrastructures.

NOTE

These tasks are recommended in addition to those examined earlier in this chapter.


Outlining Daily Maintenance Tasks

Certain maintenance procedures require more attention than others. The procedures that require the most attention are categorized as daily procedures. It is recommended that an administrator take on these procedures each day to ensure system reliability, availability, performance, and security. These procedures are examined in the following three sections.

Checking Overall SharePoint Server Functionality

Although checking the overall server health and functionality may seem redundant or elementary, this procedure is critical to keeping the system environment and users working productively.

Some questions that should be addressed during the checking and verification process are the following:

  • Can users access data in SharePoint document libraries?

  • Are remote users able to access SharePoint via SSL, if configured?

  • Is there an exceptionally long wait to access the portal (that is, longer than normal)?

  • Do SMTP alerts function properly?

  • Are searches properly locating newly created or modified content?
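Some of these checks can be scripted so that they run at the same time every morning. The following is a minimal sketch, not a definitive implementation: it requests a single page and compares the load time to a baseline. The function names, the baseline, and the slow_factor threshold are illustrative assumptions. Authentication is not handled, so a portal that requires Windows credentials is reported as unreachable unless the script is extended.

```python
import time
import urllib.request


def classify_response(elapsed_seconds, baseline_seconds, slow_factor=2.0):
    """Return 'slow' if the load time exceeds baseline * slow_factor, else 'ok'."""
    if elapsed_seconds > baseline_seconds * slow_factor:
        return "slow"
    return "ok"


def check_portal(url, baseline_seconds, timeout=30):
    """Fetch one page and report its HTTP status and load time."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            status = response.status
    except OSError as exc:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return {"url": url, "result": "unreachable", "detail": str(exc)}
    elapsed = time.monotonic() - start
    return {
        "url": url,
        "status": status,
        "seconds": round(elapsed, 2),
        "result": classify_response(elapsed, baseline_seconds),
    }
```

Calling check_portal with the portal's home page URL and a baseline measured on a normal day gives a quick answer to the "longer than normal" question. Using an https address for the externally published URL covers the SSL check as well.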

Verifying That Backups Are Successful

To provide a secure and fault-tolerant organization, it is imperative that a successful backup be performed every night. In the event of a server failure, the administrator may be required to perform a restore from tape. Without a backup each night, the IT organization is forced to rely on rebuilding the SharePoint server without the data. Therefore, the administrator should always back up servers so that the IT organization can restore them with minimum downtime in the event of a disaster. Because of the importance of the tape backups, the first priority of the administrator each day needs to be verifying and maintaining the backup sets.

If disaster ever strikes, the administrators want to be confident that a system or entire farm can be recovered as quickly as possible. Successful backup mechanisms are imperative to the recovery operation; recoveries are only as good as the most recent backups.

Although the backup programs included with Windows Server 2003 and SharePoint do not offer alerting mechanisms for bringing attention to unsuccessful backups, many third-party programs do. Many of these third-party backup programs can also send emails or pages reporting whether backups succeeded or failed. Despite this additional functionality, these utilities do not currently offer document-level restore capability. Future iterations of backup software will be able to perform these functions, however.
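Where no alerting tool is available, a small script can at least confirm that a fresh backup file exists. The following sketch assumes that backups are written to a disk folder as .bkf files (the format used by the Windows Server 2003 Backup utility) before they go to tape. The folder layout, function names, and 24-hour limit are illustrative assumptions, and a recent file does not prove that the backup can be restored.

```python
import time
from pathlib import Path


def newest_backup(backup_dir, pattern="*.bkf"):
    """Return the most recently modified file matching pattern, or None."""
    files = [p for p in Path(backup_dir).glob(pattern) if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def backup_is_current(backup_dir, max_age_hours=24, pattern="*.bkf", now=None):
    """True if the newest matching backup file is non-empty and recent enough."""
    latest = newest_backup(backup_dir, pattern)
    if latest is None or latest.stat().st_size == 0:
        return False
    now = time.time() if now is None else now
    age_hours = (now - latest.stat().st_mtime) / 3600
    return age_hours <= max_age_hours
```

A False result is the cue to open the backup program's own log and find out why the nightly job did not complete.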

Monitoring the Event Viewer

The Event Viewer, shown in Figure 18.18, is used to check the system, security, application, and other logs on a local or remote system. These logs are an invaluable source of information regarding the system. The following event logs are present for SharePoint servers running on Windows Server 2003:

  • Security: Captures all security-related events being audited on a system. Auditing is turned on by default to record success and failure of security events.

  • Application: Stores specific application information. This information includes services and any applications running on the server.

  • System: Stores Windows Server 2003-specific information.

Figure 18.18. Using the Event Viewing utility.


All Event Viewer events are categorized as informational, warning, or error. Logs show events of the types shown in Figure 18.19.

Figure 18.19. Displaying log types.


NOTE

Checking these logs often helps to understand them. Some events constantly appear but aren't significant. Events will begin to look familiar, so it will be noticeable when something is new or amiss in event logs. It is for this reason that an intelligent log filter such as Microsoft Operations Manager (MOM) 2005 is a welcome addition to a SharePoint environment.


Some best practices for monitoring event logs include

  • Understanding the events being reported.

  • Setting up a database for archived event logs.

  • Archiving event logs frequently.

  • Using an automatic log parsing and alerting tool such as Microsoft Operations Manager.

To simplify monitoring hundreds or thousands of generated events each day, the administrator should use the filtering mechanism provided in the Event Viewer. Although warnings and errors should take priority, the informational events should be reviewed to track what was happening before the problem occurred. After the administrator reviews the informational events, she can filter out the informational events and view only the warnings and errors.

To filter events, do the following:

1. Start the Event Viewer by choosing Start, All Programs, Administrative Tools, Event Viewer.

2. Select the log from which you want to filter events.

3. Right-click the log and select View, Filter.

4. In the Security Properties window, as shown in Figure 18.19, select the types of events to filter.

5. Optionally, select the time frame in which the events occurred, the event source, category, event ID, or other options that will narrow down the search. Click OK when finished.

Some warnings and errors are normal because of bandwidth constraints or other environmental issues. The more often the logs are monitored, the more familiar the administrator becomes with the messages, and the more likely she is to spot a problem before it affects the user community.
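The same kind of filtering can be applied outside the Event Viewer to a log that has been saved as comma-delimited text. The sketch below is illustrative only. The column order in COLUMNS is an assumption and must be adjusted to match the actual export, and the function names are hypothetical.

```python
import csv
import io

# Assumed column layout of the exported log; adjust to match the real file.
COLUMNS = ["date", "time", "type", "source", "category", "event_id", "user", "computer"]


def load_events(text):
    """Parse comma-delimited log text into a list of dicts keyed by COLUMNS."""
    return list(csv.DictReader(io.StringIO(text), fieldnames=COLUMNS))


def filter_events(events, types=("Warning", "Error"), source=None):
    """Keep events of the given types and, optionally, from one event source."""
    wanted = {t.lower() for t in types}
    return [
        e for e in events
        if (e["type"] or "").lower() in wanted
        and (source is None or (e["source"] or "").lower() == source.lower())
    ]
```

By default only warnings and errors are kept. Passing types=("Information",) instead shows the informational events that preceded a problem.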

TIP

You may need to increase the size of the log files in the Event Viewer to accommodate an increase in logging activity.


Performing Weekly SharePoint Maintenance

Maintenance procedures that require slightly less attention than daily checking are categorized in a weekly routine and are examined in the following sections.

Checking Disk Space

Disk space is a precious commodity. Although the disk capacity of a Windows Server 2003 system can seem virtually endless, the amount of free space on all drives should be checked weekly. Serious problems can occur if there isn't enough disk space.

One of the most common disk space problems occurs on database drives where all SQL SharePoint data is held. Other volumes such as the system drive and partitions with logging data can also quickly fill up.

As mentioned earlier, lack of free disk space can cause a multitude of problems including, but not limited to, the following:

  • SharePoint application failures

  • System crashes

  • Unsuccessful backup jobs

  • Service failures

  • The inability to audit

  • Degradation of performance

To prevent these problems from occurring, administrators should keep at least 25% of the space on each volume free.
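The 25% guideline is easy to check with a script. The following is a minimal sketch using Python's standard shutil.disk_usage call. The function names and the list of volumes passed in are illustrative assumptions.

```python
import shutil


def free_percent(total_bytes, free_bytes):
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def low_volumes(paths, minimum_percent=25.0):
    """Return (path, percent_free) for every volume below the minimum."""
    low = []
    for path in paths:
        usage = shutil.disk_usage(path)
        percent = free_percent(usage.total, usage.free)
        if percent < minimum_percent:
            low.append((path, round(percent, 1)))
    return low
```

On a SharePoint server, the list of paths would normally include the system drive, the SQL database volume, and any volume that holds log files.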

CAUTION

If disk space needs to be freed, move or delete files and folders with caution. System files are automatically protected by Windows Server 2003, but data files are not.


Verifying SharePoint Hardware Components

Hardware components supported by Windows Server 2003 are reliable, but this doesn't mean that they'll always run continuously without failure. Hardware availability is measured in terms of mean time between failures (MTBF) and mean time to repair (MTTR). This includes downtime for both planned and unplanned events. These measurements provided by the manufacturer are good guidelines to follow; however, mechanical parts are bound to fail at one time or another. As a result, hardware should be monitored weekly to ensure efficient operation.
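The two measurements are commonly combined into a single availability figure: MTBF divided by the sum of MTBF and MTTR. The sketch below applies that standard formula. The function names are illustrative, and the inputs are whatever figures the manufacturer or the organization's own records supply.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: MTBF / (MTBF + MTTR), a value from 0 to 1."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def expected_downtime_hours(mtbf_hours, mttr_hours, period_hours=24 * 365):
    """Downtime to expect over a period (one year by default)."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * period_hours
```

Because MTTR appears in the denominator, shortening repair time raises availability even when the failure rate stays the same.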

Hardware can be monitored in many different ways. For example, server systems may have internal checks and logging functionality to warn against possible failure, Windows Server 2003's System Monitor may bring a hardware failure to light, and a physical hardware check can help to determine whether the system is about to experience a problem with the hardware.

If a failure occurs or is about to occur on a SharePoint server, having an inventory of spare hardware can significantly improve the chances and speed of recovery.

Checking system hardware on a weekly basis provides the opportunity to correct the issue before it becomes a problem.

Archiving Event Logs

The three event logs on all servers can be archived manually, or a script can be written to automate the task. You should archive the event logs to a central location for ease of management and retrieval.

The specific amount of time to keep archived log files varies on a per-organization basis. For example, banks or other high-security organizations may be required to keep event logs for up to a few years. As a best practice, organizations should keep event logs for at least three months.
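An archiving script can be quite short. The following sketch assumes that the logs have already been saved as .evt files in a local folder. It copies them to a central location, organized by server name and date, and lists archives that are older than the retention period. The folder layout, the function names, and the 90-day default (standing in for three months) are illustrative assumptions. The second function only reports expired files and does not delete anything.

```python
import shutil
import time
from pathlib import Path


def archive_logs(source_dir, archive_root, server_name, stamp=None):
    """Copy saved .evt files into archive_root/server_name/stamp/."""
    stamp = stamp or time.strftime("%Y-%m-%d")
    target = Path(archive_root) / server_name / stamp
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for log_file in sorted(Path(source_dir).glob("*.evt")):
        shutil.copy2(log_file, target / log_file.name)
        copied.append(target / log_file.name)
    return copied


def expired_archives(archive_root, retention_days=90, now=None):
    """List archived .evt files older than the retention period."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    return sorted(
        p for p in Path(archive_root).rglob("*.evt")
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

Organizations with longer retention requirements can raise retention_days rather than change the script.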

TIP

Organizations that deploy Microsoft Operations Manager with SharePoint can take advantage of MOM's capability to automatically archive event log information, which significantly improves monitoring and reporting of SharePoint.


Performing Monthly Maintenance Tasks

After the maintenance required for SharePoint is understood, it is vital to formalize the procedures into documented steps. The maintenance plan itself can specify which tasks to perform at which intervals. It is recommended to perform the tasks examined in the following sections on a monthly basis.

Maintaining File System Integrity

CHKDSK scans for file system integrity and can check for lost clusters, cross-linked files, and more. If Windows Server 2003 senses a problem, it runs CHKDSK automatically at startup.

Administrators can maintain FAT, FAT32, and NTFS file system integrity by running CHKDSK once a month. To run CHKDSK, do the following:

1. At the command prompt, change to the partition that you want to check.

2. Type CHKDSK without any parameters to check only for file system errors.

3. If any errors are found, run the CHKDSK utility with the /f parameter to attempt to correct the errors found.
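The steps above can also be wrapped in a script for a scheduled monthly check. The sketch below builds the CHKDSK command line and names the drive explicitly instead of changing to the partition first. The function names are hypothetical, and run_chkdsk works only on Windows.

```python
import subprocess
import sys


def chkdsk_command(drive, fix=False):
    """Build the CHKDSK argument list for a drive such as 'D:'."""
    drive = drive.strip().upper()
    if len(drive) != 2 or not drive[0].isalpha() or drive[1] != ":":
        raise ValueError("drive must look like 'D:'")
    command = ["chkdsk", drive]
    if fix:
        command.append("/f")  # attempt to correct the errors found
    return command


def run_chkdsk(drive, fix=False):
    """Run CHKDSK and return its exit code (Windows only)."""
    if not sys.platform.startswith("win"):
        raise RuntimeError("CHKDSK is only available on Windows")
    return subprocess.run(chkdsk_command(drive, fix)).returncode
```

The read-only check is the part best suited to scheduling. On a volume that is in use, CHKDSK /f typically asks whether to schedule the check for the next restart, so the repair pass is better run interactively.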

Testing the UPS Battery

An uninterruptible power supply (UPS) can be used to protect the system or group of systems from power problems (such as spikes, surges, and outages) and to keep the system running long enough after a power outage for an administrator to shut it down gracefully. It is recommended that a SharePoint administrator perform the UPS maintenance checks specified by the manufacturer at least once a month. Battery tests should also be scheduled monthly.

Validating Backups

Once a month, an administrator should validate backups by restoring the backups to a server located in a lab environment. This is in addition to verifying that backups were successful from log files or the backup program's management interface. A restore gives the administrator the opportunity to verify the backups and to practice the restore procedures that would be used when recovering the server during a real disaster. In addition, this procedure tests the state of the backup media to ensure that they are in working order and builds administrator confidence for recovering from a true disaster.

Updating Documentation

An integral part of managing and maintaining any IT environment is to document the network infrastructure and procedures. The following are just a few of the documents you should consider having on hand:

  • SharePoint Server build guides

  • Disaster recovery guides and procedures

  • Maintenance checklists

  • Configuration settings

  • Change control logs

  • Historical performance data

  • Special user rights assignments

  • SharePoint site configuration settings

  • Special application settings

As systems and services are built and procedures are ascertained, document these facts to shorten learning curves and reduce administration and maintenance effort.

Documenting the IT environment adequately is important, but keeping those documents up to date is often even more important. Otherwise, documents quickly become outdated as the environment, processes, and procedures change along with the business.

Performing Quarterly Maintenance Tasks

As the name implies, quarterly maintenance is performed four times a year. Areas to maintain and manage on a quarterly basis are typically fairly self-sufficient and self-sustaining, so only infrequent maintenance is required to keep the system healthy. This doesn't mean, however, that the tasks are simple or that they are less critical than tasks that require more frequent maintenance.

Checking Storage Limits

Storage capacity on all volumes should be checked to ensure that all volumes have ample free space. Keep approximately 25% free space on all volumes.

Running low on disk space, or running out completely, creates unnecessary risk for any system. Services can fail, applications can stop responding, and systems can even crash if there isn't enough disk space.

SQL database disk space consumption can be kept to a minimum by limiting document library versioning, implementing site quotas, or both.

Changing Administrator Passwords

Administrator passwords should, at a minimum, be changed every quarter (90 days). Changing these passwords strengthens security measures so that systems can't easily be compromised. In addition to changing passwords, other password requirements such as password age, history, length, and strength should be reviewed.
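A simple record of when each administrative password was last changed makes the 90-day rule easy to audit. The sketch below works on such a record kept as a dictionary. The account names and dates in the usage example are hypothetical, and the script does not query Active Directory or change any passwords.

```python
from datetime import date


def password_age_days(last_changed, today=None):
    """Number of days since the password was last changed."""
    today = today or date.today()
    return (today - last_changed).days


def overdue_accounts(last_changed_by_account, max_age_days=90, today=None):
    """Return the accounts whose passwords are max_age_days old or older."""
    today = today or date.today()
    return sorted(
        name
        for name, changed in last_changed_by_account.items()
        if password_age_days(changed, today) >= max_age_days
    )
```

For example, overdue_accounts({"spadmin": date(2006, 1, 1), "sqlsvc": date(2006, 3, 1)}, today=date(2006, 4, 1)) returns ["spadmin"], because only that password has reached 90 days.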

Summary of Maintenance Tasks and Recommendations

Table 18.1 summarizes some of the maintenance tasks and recommendations examined in this chapter.

Table 18.1. SharePoint Portal Server 2003 Maintenance Tasks

Daily   Weekly   Monthly   Quarterly   Tasks and Servers Accessed for Task Completion
-----   ------   -------   ---------   ----------------------------------------------
X                                      Check overall server functionality (SharePoint access, document check-in, and so on).
X                                      Verify that backups are successful.
X                                      Monitor Event Viewer.
        X                              Check disk space.
        X                              Verify hardware.
        X                              Archive event logs.
        X                              Check SharePoint diagnostic logs.
        X                              Test the UPS.
                 X                     Run the SQL Maintenance Plan Wizard.
                 X                     Run CHKDSK.
                 X                     Validate backups and restores.
                 X                     Update documentation.
                           X           Check disk space.
                           X           Change administrator passwords.





Microsoft SharePoint 2003 Unleashed
Microsoft SharePoint 2003 Unleashed (2nd Edition) (Unleashed)
ISBN: 0672328038
Year: 2005
Pages: 288