Basic Data Challenges

                 

 
Special Edition Using Microsoft SharePoint Portal Server
By Robert  Ferguson

Chapter 22. Example Scenario 3: Enterprise-Wide Solution


While much of the discussion in this chapter has centered on data and access to data, certain inherent constraints shape enterprise-wide SharePoint deployments in one way or another. These constraints include the following:

  • Sources of data, including sensitive internal documents, subscription service-based data, stock quote feeds, and so on, drive multi-dashboard management requirements.

  • The impact of proxy servers on accessibility and security drives client configuration and support costs.

  • The raw scope of data servicing different needs of the enterprise organization can overwhelm unprepared workspace Coordinators.

  • The variety of data formats, like Word documents, Excel and other spreadsheets, PowerPoint presentations, Adobe PDFs, CorelDraw documents, and more can also overwhelm the workspace Coordinators.

  • Language requirements, and how these drive data stores, search capabilities, and so on, must be addressed.

  • Backing up and safeguarding all of this data!

Sources of Data

Mapping the standard Web Parts, or creating custom Web Parts linking various sources of data, is perhaps the most common method of "integrating" various sources of data into a cohesive portal whole. Four Web Parts ship with the SharePoint Portal by default. These are

  • News

  • Announcements

  • Subscription Summary

  • Quick Links

At Global, the News Web Part displays links or stories of general interest. For example, company-related news articles and stories "hot off the wire" are featured here. Other news items, like press releases, are also displayed here.

The Announcements Web Part is used for company-wide announcements, business unit events, and so on. At Global, the Announcements Web Part for Marketing included a reminder regarding a deadline for the introduction of new-product literature into the sales channel.

The Subscription Summary Web Part provides a summary of the end user's various subscriptions. The inclusion of a "Subscription Notifications" Web Part makes great sense at high-level functional or perhaps corporate-wide Portal implementations: if a set of search results is found useful, the end user can easily subscribe to the related content, relying on notifications from SharePoint Portal Server about changes or new relevant documents.

The Quick Links Web Part displays links to other areas of interest, much like a "My Favorites" approach. At Global, the Quick Links Web Part in the dashboard site of the Marketing group contained links to just-announced and new-product road maps and associated collateral.

More on Web Parts

Web Parts may be written to support collaboration tools like Microsoft NetMeeting, or to provide high-level services such as directions and maps to various company locations. As mentioned previously, Global Corporation created a custom Web Part that highlighted often-used or especially large documents, including their metadata properties. A special Search Web Part was also added to Global's enterprise SharePoint Portal dashboard to maximize search capability across the organization.

In any case, the point here is that a huge variety of data sources and document types demands a great deal of management. The opportunity to share enterprise-wide data is both real and compelling, but it brings a risk of information overload, and poorly maintained dashboards may even push out stale and inaccurate information.

Creating Custom Web Parts Supporting Enterprise Deployment

To create Web Parts supporting multiple environments, such as those inherent to complex enterprise implementations, the following approach is generally recommended:

  • Use relative URLs in all ASP, HTML, and VBScript files. This makes the transition from Development to Test to Production much easier, not to mention more consistent.

  • For any links within your Web Part, generate your own relative links rather than coding absolute URLs to a NetBIOS name. Relative links also work with fully qualified domain names, both internally and externally.
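To see why relative links survive the move between environments, consider the following minimal sketch. It is written in Python purely for illustration (a Web Part of this era would use ASP or VBScript), and the host names and the document path are hypothetical. The same relative link resolves correctly against whichever base URL the dashboard is served from, while the absolute link stays pinned to the development server.

```python
from urllib.parse import urljoin

# Hypothetical base URLs for the same dashboard in different environments.
# None of these host names come from the Global deployment.
BASES = {
    "development": "http://devportal/Marketing/",
    "production (NetBIOS)": "http://portal/Marketing/",
    "production (FQDN)": "http://portal.global.example/Marketing/",
}

RELATIVE_LINK = "Documents/roadmap.doc"
ABSOLUTE_LINK = "http://devportal/Marketing/Documents/roadmap.doc"


def resolve(base_url: str, link: str) -> str:
    """Resolve a link the way a browser would, relative to the page's base URL."""
    return urljoin(base_url, link)


if __name__ == "__main__":
    for name, base in BASES.items():
        # The relative link follows the environment; the absolute one does not.
        print(name, "->", resolve(base, RELATIVE_LINK))
        print(name, "->", resolve(base, ABSOLUTE_LINK))
```

Run against the three bases, the relative link yields a different, correct URL each time, whereas the absolute link always points back at the development host.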

Security of Data and the Proxy Server

As stated previously, the impact that proxy servers have on the enterprise in terms of accessibility and security drives client configuration and support costs. A proxy server enhances the security of your intranet by preventing unauthorized access by someone on the Internet. A proxy server may also enhance SharePoint Portal Server performance by caching recently accessed Web pages, thereby minimizing network traffic and download time, but it also adds complexity. The proxy server must now be analyzed under the same constraints and high-availability requirements as those that drove the Portal configuration in the first place.

Proxy server technical ramifications are many:

  • Where the proxy server resides in relation to the other SharePoint servers must be considered. A proxy can break index propagation if it sits between the server responsible for creating/updating the index and the dedicated search servers (although the proxy server may be configured to allow Windows file share access to get around this).

  • Proxy servers must be configured specifically to support and pass a few critical verbs: SharePoint Portal Server uses HTTP, DAV, and INVOKE, the last being a custom SharePoint Portal Server verb.

  • The proxy setting for Internet Explorer on the desktop or laptop client determines how HTTP-based communication between the client and the dashboard site occurs, if at all.

  • Per Microsoft, the dashboard site uses a server-side object called ServerXMLHTTP to make HTTP requests, and this object maintains its own proxy settings. Thus, if the dashboard site is behind a proxy server, the proxy settings for the ServerXMLHTTP object must be specifically configured (via the proxycfg.exe utility).
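The verb requirement in the list above lends itself to a simple pre-deployment check. The sketch below is an illustration only: the verb list is an assumption made up of common HTTP methods, the WebDAV methods defined in RFC 2518, and the INVOKE verb named above, and it may not match the exact set a given SharePoint Portal Server installation needs. The function name and the proxy's allow-list are hypothetical.

```python
# Assumed verb set: common HTTP methods, the WebDAV methods from RFC 2518,
# and the custom INVOKE verb. Confirm the real list against product
# documentation before relying on it.
BASIC_HTTP_VERBS = {"GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS"}
WEBDAV_VERBS = {"PROPFIND", "PROPPATCH", "MKCOL", "COPY", "MOVE", "LOCK", "UNLOCK"}
REQUIRED_VERBS = BASIC_HTTP_VERBS | WEBDAV_VERBS | {"INVOKE"}


def missing_verbs(allowed_by_proxy):
    """Return the required verbs that the proxy's allow-list does not pass."""
    allowed = {verb.strip().upper() for verb in allowed_by_proxy}
    return sorted(REQUIRED_VERBS - allowed)


if __name__ == "__main__":
    # Hypothetical allow-list copied from a proxy that only passes plain HTTP.
    proxy_allow_list = ["get", "head", "post", "put", "delete", "options"]
    print("Verbs the proxy would block:", missing_verbs(proxy_allow_list))
```

A proxy that passes only the basic HTTP methods would be reported as blocking every WebDAV method plus INVOKE, which is the kind of gap that is far cheaper to find on paper than in production.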

For more information on proxy servers, refer to "Proxy Server Options," p. 198.

Note that while proxy servers may be deployed in both very small and enterprise-wide implementations, they are more typical in the latter. The improvements in performance and the security benefits of proxies usually outweigh the implementation and support costs in these larger deployments.

The Scope of Enterprise-Wide Data

The raw scope of data found in enterprise-wide SharePoint Portal Server deployments can overwhelm unprepared workspace Coordinators. Other challenges include managing documents that share the same name across multiple servers or locations, the impact of many files on the size and propagation of indexes, the sheer number of links and pointers that might be required from the dashboard or workspace, and so on. Good access to data that is distributed across many servers and updated frequently requires regular crawling. In many cases, dedicated crawl servers are deployed eventually, if not initially.

Firewalls and proxy servers may also present a unique challenge to crawling. For example, if you wish to crawl only internal sites but want to avoid creating lots of rules (such as excluding anything that ends in ".com", ".net", ".edu", and so on), the proxy server setting on the server hosting the index workspaces (if one is configured) may need to be disabled.
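The trade-off between many exclusion rules and a single inclusion rule can be sketched as follows. This is not SharePoint Portal Server's own rule syntax. It is a hedged Python illustration of the allow-list idea, and the internal domain suffix is hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical internal DNS suffix; replace with the organization's own.
INTERNAL_SUFFIXES = ("corp.global.example",)


def is_internal(url, internal_suffixes=INTERNAL_SUFFIXES):
    """Return True if the URL's host looks internal.

    A host counts as internal when it is a single-label (NetBIOS-style)
    name or falls under one of the internal DNS suffixes. Everything
    else is treated as external, so no per-TLD exclusion rules are needed.
    """
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False
    if "." not in host:
        return True  # single-label name such as "fileserver01"
    return any(host == s or host.endswith("." + s) for s in internal_suffixes)


if __name__ == "__main__":
    for url in (
        "http://fileserver01/docs/",
        "http://hr.corp.global.example/policies/",
        "http://www.example.com/",
    ):
        print(url, "->", "crawl" if is_internal(url) else "skip")
```

One inclusion test replaces an open-ended list of ".com", ".net", and ".edu" exclusions, which is the same simplification the paragraph above is after.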

The Variety of Data Formats

Yet another enterprise-specific challenge to deployment lies in the number and types of data formats. Documents created in Word, Excel, PowerPoint, and so on are quite common, while documents in Adobe PDF or CorelDraw formats may be less so. The key here is not so much the absolute variety as identifying the formats in the first place.

For example, at Global Corporation, it was determined early on that nearly all documents to be managed under the MANX Pilot's cross-functional general workspace consisted of the standard Microsoft Office document types. No one envisioned that any other format was in use. As it turned out, Lotus Notes was the package of choice for maintaining email and drawing documents in one of the larger high-visibility organizations. This after-the-fact observation underscored the importance of determining all data types up front. Had this been more than a pilot, valuable collaboration opportunities as well as a huge source of raw data would have been overlooked. And the political ramifications of "missing" an organization's key data repository would have reduced the credibility of the entire project, not to mention that of the project sponsors.

One last point: Identifying the varieties of document formats also allows inclusion of the appropriate IFilters. An IFilter is the component that lets the indexing engine read a particular file format: it filters the document into its innate text while also identifying its properties, so that both can be added to the index and searched. Without the matching IFilter, documents in that format cannot be fully indexed, even though users can still open them in the native desktop application.

Thus, determining document formats up front clarifies where information is stored, and how the data will ultimately be used. This in turn helps crystallize deployment goals, both promoting cleaner enterprise integration and improving collaboration opportunities across the company.
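One inexpensive way to avoid the surprise Global encountered is to inventory file extensions on the candidate content sources before the workspace is designed. The following sketch assumes the content is reachable as a file system path. The expected-format list and the function names are illustrative assumptions, not part of SharePoint Portal Server.

```python
from collections import Counter
from pathlib import Path

# Assumed "expected" formats: the classic Microsoft Office extensions.
EXPECTED = {".doc", ".xls", ".ppt"}


def inventory_formats(root):
    """Count files under root by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def unexpected_formats(counts, expected=EXPECTED):
    """Return the extensions found that were not planned for."""
    return sorted(ext for ext in counts if ext not in expected)


if __name__ == "__main__":
    counts = inventory_formats(".")  # point this at a real content share
    for ext, n in counts.most_common():
        print(f"{ext:10} {n}")
    print("Formats to investigate:", unexpected_formats(counts))
```

Any extension in the "investigate" list, such as a Lotus Notes .nsf database, is a prompt to ask which group owns the data and whether an IFilter is available for it.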

Data and Language Challenges

We previously touched upon some of the language issues inherent to enterprise-wide SharePoint Portal Server deployments. The facts that not all documents residing in a particular geography may be written in the same language, and that search/best bet functionality may degrade as a result, are among the more obvious issues. Another is that subscription notifications are generated only in the workspace language; there is no support for separate client languages.

Less obvious, though, are the benefits, such as the following:

  • SharePoint Portal Server, by virtue of its support for six client-component languages (English, Japanese, German, French, Spanish, and Italian) provides great opportunities for collaborating across enterprise-wide deployments.

  • Support for content in any language (except bidirectional languages like Arabic) offers huge corporate search potential. For example, SharePoint Portal Server provides noise word files and thesaurus files for languages as diverse as Chinese-Simplified, Chinese-Traditional, Dutch, English-International, English-US, French, German, Italian, Japanese, Korean, Spanish, Swedish, and Thai.

  • SharePoint natively supports creating, say, a German version of SharePoint Portal Server on top of an English operating system installation.

  • Furthermore, in your German workspace you may create and maintain Japanese folders.

  • Japanese and other non-German content may be added to the dashboard site, too!

  • To really make things easy, Microsoft also made it possible to access the workspace using the client components of any of the six languages above. Thus, Americans can access the German workspace by using the English client components.

  • And finally, any user with the appropriate role can add folders, categories, document profiles, and other content in any content language.

So what's the problem? Simple: a site can quickly become so language-neutral that it presents ongoing management and maintenance challenges. It is therefore recommended that a site employ only as many languages as are needed to facilitate good collaboration and effective searching.

Backing Up and Safeguarding Your Data

A number of years ago, after designing and discussing a very large database implementation with a potential customer, my client remarked, "Nice design. But how do I back up those multiple terabytes of data?" Excellent question! He understood the essence of the good news/bad news issue at once: plentiful data may be a great asset to your company, but without the facilities to protect it, back it up, and restore it if necessary, the data will begin to look more and more like a liability.

The amount of data that may be generated and managed under the guise of a "portal" is staggering. Think about the revisions of all of the documents being managed under the umbrella of version control, for example. And then there is all of the data mapped into multiple languages. And the copies of large data sets sitting in separate staging arrays to facilitate rapid retrieval. And the copies of documents downloaded to various client devices. In the end, all of this should be backed up and safeguarded.

Little Room for Error

One may argue that not all versions of a document with a long version history need be backed up again and again. Or one could make a case that operating system drive backups may be "skipped" once committed to tape. One day, though, when you need to fall back to a known valid release of a critical operating procedure, or realize that a service-packed or patched OS was never actually backed up again and that the primary drive corrupted the mirror when it died, you will begin to understand the importance of backing up everything regularly.

You will not have the time to realize this until much later, of course, as you scramble to keep your job and re-create your precious data in some other manner, but there will be hope.

Full Backups on the Rise

In any case, both technology and falling prices are driving the decision to back up data disks more frequently, and to do so more often in "full" rather than incremental modes (see Figure 22.8). The cost of tape drives, disk drives, and tape cartridges continues to drop. And the speed of tape drives has picked up considerably over the last 12 months as well.

Figure 22.8. Global adopted a 28-day backup cycle at each of its three large data centers, and leveraged best practices like rotating tapes offsite and performing regular backups of all disk partitions: OS, SPS executables, logs, and data.


Today, the questions of what to back up, how often, and in what manner are still ROI-based, as always. In the case of our fictional company, Global Corporation, backups of all data partitions occur as a natural function of data center operations: each server is completely backed up to tape at least once a week, and all production resources undergo a full backup nightly. This diligence was the result of an easy ROI exercise. It was determined that adding the SharePoint Portal Server production servers to the data center's standard backup solution (a SAN-based enterprise tape library running Veritas NetBackup, in this case) cost less than losing 72 hours' worth of work. That is, the combined cost over the next three years of a Fibre Channel host bus adapter for each production server, a port in the switched fabric, SuperDLT tapes to cover all SharePoint data resources, software licenses, and administrative overhead was less than the average cost of the productivity lost over a 72-hour period. The business units were simply unwilling to lose 72 hours of work, and funding was secured at that point to cover the backup solution.
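The ROI exercise described above comes down to comparing two sums. The sketch below shows the shape of that comparison. Every figure in it is a made-up placeholder, since the chapter does not give Global's actual costs, and the function names are hypothetical.

```python
def backup_solution_cost(servers, hba_per_server, fabric_port_per_server,
                         tape_media, software_licenses, admin_per_year, years=3):
    """Total cost of adding the servers to the backup solution over `years`."""
    per_server = hba_per_server + fabric_port_per_server
    return (servers * per_server + tape_media + software_licenses
            + admin_per_year * years)


def lost_work_cost(hours_lost, affected_staff, loaded_hourly_rate):
    """Cost of the productivity lost when `hours_lost` of work must be redone."""
    return hours_lost * affected_staff * loaded_hourly_rate


def backup_is_justified(solution_cost, outage_cost):
    """The backup spend is justified when it is cheaper than the loss."""
    return solution_cost < outage_cost


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    solution = backup_solution_cost(
        servers=4, hba_per_server=1500, fabric_port_per_server=1000,
        tape_media=8000, software_licenses=12000, admin_per_year=10000)
    outage = lost_work_cost(hours_lost=72, affected_staff=50,
                            loaded_hourly_rate=40)
    print("Three-year backup cost:", solution)
    print("Cost of 72 lost hours: ", outage)
    print("Justified:", backup_is_justified(solution, outage))
```

With real numbers in place of the placeholders, the same three functions give the business units a one-line answer to whether the backup spend pays for itself.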

Duplicating Servers for Disaster Tolerance

Global did not have to go to a lot of trouble to safeguard their complete SharePoint Portal Server solution, though. Because SharePoint Portal Server installs backup and restore scripts automatically during installation, performing these key functions took very little effort. In the end, they automated these scripts so as to create duplicates of each server on another available disk partition within the data center. They also tested the ability to duplicate their master SharePoint Portal Server across their network to one of the other data centers, and found the process flawless though slow. Until budget money is available to perform this server duplication process more often, Global plans to perform and test remote duplicate copies for all three data centers at least monthly. The process is straightforward:

  • From Data Center A, back up the master server to a remote disk partition located in Data Center B.

  • From Data Center B, restore from the backup image just created to a standby server (sometimes called their DR, or Disaster Recovery, server), also residing at Data Center B.

The same two steps are then performed for Data Centers B and C, and again for C and A. The standby server at each site is actually a server in the Development environment for the particular SharePoint Portal Server deployment/system landscape, having been previously "super-sized" to a certain extent in terms of RAM and CPU, so as to be capable of supporting the Production-level load.
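The round-robin pattern just described (A to B, B to C, C to A) can be generated rather than maintained by hand, which helps when a data center is added later. The sketch below only produces the plan as text. It does not invoke the SharePoint Portal Server backup and restore scripts, and the function names are hypothetical.

```python
def duplication_pairs(data_centers):
    """Pair each data center with the next one, wrapping around at the end."""
    n = len(data_centers)
    return [(data_centers[i], data_centers[(i + 1) % n]) for i in range(n)]


def duplication_plan(data_centers):
    """Expand each (source, target) pair into the two steps described above."""
    steps = []
    for source, target in duplication_pairs(data_centers):
        steps.append(f"From Data Center {source}, back up the master server "
                     f"to a remote disk partition in Data Center {target}.")
        steps.append(f"From Data Center {target}, restore that backup image "
                     f"to the standby (DR) server in Data Center {target}.")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(duplication_plan(["A", "B", "C"]), start=1):
        print(f"{number}. {step}")
```

Each data center appears exactly once as a source and once as a target, so every master server ends up with a restored standby copy at a different site.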

During the month, Global also rotates and tests the ability to restore tapes across the three data centers, thereby providing yet another level of recoverability should one of the data centers be lost to a natural or man-made disaster. They also test the escalation process that defines when, how, and by whose authority the decision is made to move SPS from its production infrastructure to an "unplanned downtime" status, and finally to a new environment (the DR site/solution).

Global understands that disaster tolerance is only as good as its last DR test, and therefore strives to prove on a scheduled basis that these processes and procedures are indeed effective. They simulate an actual disaster (the loss of a complete data center) each quarter. And to really push the limits of their DR plan, they randomly "kill off" a key member of the team responsible for failing the data center operations over to another site. In this way, Global ensures that their staff is cross-trained in DR and truly prepared in an emergency to fail over (and afterward fail back!) at a moment's notice.


                 


Special Edition Using Microsoft SharePoint Portal Server
ISBN: 0789725703
Year: 2002
Pages: 286
