Scaling Up Your Search Solution

The most sophisticated search functionality in the world is no good if it cannot support the number of users and the intensity of the search and indexing activity that you throw at it. Microsoft has addressed these scalability concerns with the latest generation of its search products in Windows Server 2003, Content Management Server, and SharePoint Portal Server version 2.

One of the design goals of SharePoint version 2 was to support one million users of a portal. For searching, the goals were to complete 95% of all queries in less than two seconds and to index up to 20 million documents (up from five million in version 1). This performance requires fast loading of thesaurus files and rapid crawling and indexing. SharePoint version 2 achieves at least double the indexing rate of version 1, measured in documents per second. High availability and scalability are best achieved by using several servers to perform different roles in the portal. Indeed, you may need multiple servers for the search function alone.
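The query-time goal above can be checked against measured latencies. The following is a minimal sketch in Python, assuming you have already collected query times in seconds; the function names and the sample data are hypothetical and are not part of SharePoint.

```python
def fraction_within(latencies_s, threshold_s=2.0):
    """Return the fraction of query latencies (in seconds) below threshold_s."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    fast = sum(1 for t in latencies_s if t < threshold_s)
    return fast / len(latencies_s)


def meets_search_goal(latencies_s, threshold_s=2.0, target=0.95):
    """True if at least `target` of the queries finished in under threshold_s."""
    return fraction_within(latencies_s, threshold_s) >= target


# Hypothetical sample: 19 fast queries and one slow one.
samples = [0.4] * 19 + [3.1]
print(fraction_within(samples), meets_search_goal(samples))
```

Both the two-second threshold and the 95% target come from the design goal stated in the text; they are left as parameters so that a different service level can be tested.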

SharePoint lets you assign multiple servers to roles as web servers, search servers, and index servers. For instance, if you had a small number of users but a vast amount of data to index, you would deploy many index servers but need only a few search servers, or perhaps just one. Figure 14.3 shows the notional architecture for SharePoint search.

Figure 14.3. SharePoint Search Architecture (Source: Microsoft Corporation, Steve Tullis, "Enterprise Search With SharePoint Portal Server V2.")


A search starts with a user request that hits one of the portal web servers. The web servers balance search requests across the search servers, so each request is directed to an appropriate search server. Through index propagation, every search server holds an identical copy of each index.

Each indexing server is devoted to crawling and indexing up to four content indexes. For instance, you could devote an indexing server to competitive company web sites, and another to file shares within your organization. Separating the indexing from the search service means that the performance of searches is not hindered by indexing activity, or vice versa.
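The division of roles described above can be modeled in a few lines. This is only an illustrative sketch in Python, not SharePoint code: the server and index names are hypothetical, and round-robin is assumed as the balancing policy because the text says only that requests are balanced.

```python
from itertools import cycle

MAX_INDEXES_PER_INDEX_SERVER = 4  # limit stated in the text


def assign_indexes(content_indexes, index_servers):
    """Spread content indexes over index servers, at most four per server."""
    if not index_servers:
        raise ValueError("need at least one index server")
    capacity = len(index_servers) * MAX_INDEXES_PER_INDEX_SERVER
    if len(content_indexes) > capacity:
        raise ValueError(f"{len(content_indexes)} indexes exceed capacity of {capacity}")
    assignment = {server: [] for server in index_servers}
    for position, index_name in enumerate(content_indexes):
        server = index_servers[position % len(index_servers)]
        assignment[server].append(index_name)
    return assignment


def propagate(assignment, search_servers):
    """Give every search server an identical copy of the full index list."""
    all_indexes = sorted(name for names in assignment.values() for name in names)
    return {server: list(all_indexes) for server in search_servers}


def route_queries(queries, search_servers):
    """Assumed policy: hand queries to search servers in round-robin order."""
    rotation = cycle(search_servers)
    return [(query, next(rotation)) for query in queries]


assignment = assign_indexes(
    ["competitor-sites", "file-shares", "intranet"], ["index-1", "index-2"]
)
print(assignment)
print(propagate(assignment, ["search-1", "search-2"]))
print(route_queries(["q1", "q2", "q3"], ["search-1", "search-2"]))
```

Because crawling happens only on the index servers and queries touch only the propagated copies, the model shows why indexing load and query load do not compete for the same machine.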

BizTalk Server

Sometimes scalability presents obvious solutions to the analyst. Web servers, for instance, are generally scaled out to spread communications bandwidth and I/O (input-output) across a broader surface, while database servers are generally scaled up and maintained centrally. BizTalk presents no such pat answer, as its components can be scaled out and scaled up, depending on the pressure points that affect your system. Only by understanding the components of BizTalk Server and the business problems that can be created by potential bottlenecks and latency can you achieve an optimal solution for your budget. Therefore, we look at both scale-up and scale-out options in this section.

One of the best sources on BizTalk scalability is the MSDN article "Enhancing Performance and Scalability" at msdn.microsoft.com/library/default.asp?url=/library/en-us/bts_2002/htm/lat_perfscale_intro_fakz.asp.

BizTalk contains many components that can become bottlenecks in the system. In addition to the BizTalk server load and the multiple SQL Server databases required to track the messages and transactions, BizTalk depends on four transport services: HTTP/HTTPS, File, SMTP, and Message Queuing. Each of these can be scaled if necessary to support the desired transaction volume. The good news about this granularity of BizTalk configuration is that it offers so many options for scaling a solution, and it provides employment to analysts and network engineers as well as selling new hardware and software.

If you are starting with a single dedicated BizTalk Server, you can scale up by adding processors and memory. This approach keeps site management simple, but may be more costly than scaling a system horizontally or improving software architecture. Once the existing server reaches its maximum capacity, you must look elsewhere for further performance improvements.

The following steps are recommended to scale up a BizTalk Server:

  1. Upgrade to a faster processor (such as the Pentium III and its Xeon derivatives with large Level 2 caches).

  2. Use symmetric multiprocessing (SMP) servers that accommodate up to eight CPUs.

  3. Use a faster disk system.

  4. Decrease file I/O and network bottlenecks.

As of late 2003, the following specific hardware and configurations are recommended for BizTalk Server:

  • A multiprocessor Pentium III (PIII) Xeon system with the highest clock speed (MHz) available for maximum performance, capable of being upgraded to eight CPUs.

  • A 1- to 2-MB L2 processor cache (increases parsing performance).

  • 1 GB of RAM (more if an organization is processing multiple megabyte documents).

  • Multiple 100-Mbps (megabits per second), or greater, network cards connected to 100-Mbps switch ports to increase network I/O throughput.

  • Multiple disks and controllers for message queuing and distributed transaction coordinator (DTC) file and log operations. Write DTC log operations to a central remote server to offload file I/O contention on the local BizTalk Server.

  • RAID 0 and RAID 1 disk configuration for better performance on the Shared Queue database and message queuing.

  • Multihomed network interface cards (NICs) in the BizTalk Servers to separate HTTP processes from the dedicated Microsoft SQL Server processes of the Shared Queue and BizTalk Messaging Management databases. Also, consider using a switched network to reduce the traffic reaching each network card.

These recommendations assume that BizTalk Server is running on a dedicated server. If the BizTalk Services are sharing the server with other application services, additional hardware is recommended, or you may want to move BizTalk to a dedicated server.

Scaling up is often the preferred approach for the BizTalk database server. See the section on scaling SQL Server in this chapter for general database performance information. Microsoft recommends that the BizTalk Messaging Management, Shared Queue, Tracking, and Orchestration Persistence databases be on separate disk channels, as this improves access to each of the databases. Consider the following as a minimum for acceptable performance of BizTalk Server databases: a multiprocessor Pentium III (PIII) Xeon system with the highest clock speed (MHz) available for maximum performance, capable of being upgraded to eight CPUs, with 1 GB of RAM (more if an organization is processing multiple megabyte documents).

Optimize the underlying Microsoft SQL Server databases and logs based on standard database best practices. If you initially plan to complete only a few transactions, you can install the databases on the same disk I/O channel. As the transaction volume grows, add disks and/or controllers to a server and move the databases to these new disk I/O channels. Additionally, an individual database can be moved to a new server.

To optimize the BizTalk Messaging Management database, install each of the databases on its own server or on its own disk channel. This yields the greatest benefit because it prevents the Messaging Management database from hindering the performance of the Shared Queue or Tracking databases.

The Tracking database should have more physical disks and more disk space than the Shared Queue database, because the Tracking database ends up with multiple copies of each message in the queue. To size the Tracking database, estimate the average document size for a single transaction and multiply it by the number of times the document will be logged to the Tracking database. Multiply this per-document storage value by the estimated throughput requirement to determine the amount of space needed for the Tracking database.
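The sizing rule above is simple arithmetic, sketched here in Python. The text speaks only of a throughput requirement; expressing it as documents per day multiplied by a retention period in days is an assumption made for this example, and the function name and all sample figures are hypothetical.

```python
def tracking_db_bytes(avg_doc_bytes, copies_logged, docs_per_day, retention_days):
    """Estimate Tracking database size in bytes.

    avg_doc_bytes   average document size for a single transaction
    copies_logged   number of times each document is logged to the Tracking database
    docs_per_day    assumed throughput (documents processed per day)
    retention_days  assumed number of days that tracking data is kept
    """
    per_document = avg_doc_bytes * copies_logged  # storage used by one document
    return per_document * docs_per_day * retention_days


# Hypothetical workload: 50-KB documents logged three times each,
# 10,000 documents per day, tracking data kept for 30 days.
estimate = tracking_db_bytes(50 * 1024, 3, 10_000, 30)
print(f"{estimate / 1024**3:.1f} GiB")
```

The result covers document storage only, so allow extra headroom for indexes, logs, and growth when you provision the disks.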

Install the Tracking database on its own disk I/O channel due to the high volume of data that is written to it. A separate disk I/O channel is particularly important in heavy transaction environments. Follow similar optimization techniques for the Orchestration Persistence database.

Scaling BizTalk Server Horizontally

A BizTalk Server 2002 implementation can benefit greatly from scaling out. While scaling up simplifies maintenance by minimizing the number of servers required, scaling horizontally provides the following benefits:

  • Cost effectiveness. At current prices, multiple inexpensive servers can produce higher return on investment for application performance.

  • Server fault-tolerance. When multiple servers share the workload, if one server fails, the other servers in the group can pick up the load.

  • Separation and optimization of the different components. Performance of BizTalk Services, the databases, and the transport services can be increased, and the administrator is allowed more control over how each of the services is configured.

  • Hardware optimization for each server and service. Servers can be optimized for the services they are running, so memory is allocated to servers where it is most needed, and I/O is optimized throughout the server farm.

The downside of scaling out is some growth in the management and administrative burden, although Windows Server and related products greatly assist in this effort. You must also allow for server room space, air conditioning, and power to support the proliferation of servers, and include the new servers in your backup and disaster recovery planning.


Building Portals, Intranets, and Corporate Web Sites Using Microsoft Servers
ISBN: 0321159632
Year: 2004
Pages: 164
