Information Storage

In Exchange Server 2003, a service named the Information Store is responsible for data storage and management. It supports access by MAPI clients and by numerous Internet protocols via Internet Information Services (IIS). It also supports access through application programming interfaces (APIs) such as Collaboration Data Objects (CDO), ActiveX Data Objects (ADO), and Active Directory Service Interfaces (ADSI). As a result, the Exchange Information Store has become much more than a place where messages and data are stored. It has become a single repository in which an entire network of users and applications can store and manage information of just about any type. Because it holds all types of data and provides such varied access methods, Microsoft describes the Information Store in Exchange Server 2003 as the Web Store.

With this new version of Exchange, the support and management of protocols have been passed from the Exchange software itself to Internet Information Services (IIS). Separating the protocols from the storage system and providing other features, such as an Installable File System, front-end/back-end servers, and clustering support, have allowed Exchange Server 2003 to become much more robust and scalable than previous versions of Exchange.

Web Storage System

The Exchange Server 2003 Web Store combines features of the Web, the file system, and Exchange Server 2003 into a single, unified system for storing and accessing information. The Web Store serves as the sole repository for managing diverse types of information within a single infrastructure. In addition, almost every resource in the Web Store is now addressable through a single Uniform Resource Identifier (URI) location, commonly referred to as a Uniform Resource Locator (URL).

It is important to understand that the Web Store is not so much a specific entity or technology as it is a concept of how Exchange information is stored and used. As in previous versions of Exchange, information is still stored in databases and still managed by a service named the Information Store. Sometimes the storage system as a whole is called the Information Store, sometimes the Web Store. Both of these terms refer to the same system, but you may find them used in different situations based on context. For example, in the product documentation, Microsoft likes to call it the Web Store when pointing out new, web-related features. New features, such as support for multiple databases per server that can be grouped into storage groups, make Exchange all the more powerful. The Web Store moniker is really just a way to get across the idea that the information databases of Exchange can be used for more than just storing e-mail messages. They can be used to store almost any kind of information or document, and they can be accessed not only by e-mail clients but by web browsers and custom applications as well.

Exchange Databases

An Exchange Server 2003 database is actually a logical entity that represents two physical database files, a rich-text (EDB) file and a streaming media (STM) file. For example, a single mailbox database might consist of the files priv1.edb and priv1.stm. Each database incorporates both files, and Exchange Server 2003 treats them as a single unit. Furthermore, the reported Information Store size will be the combination of both the rich-text store and the native content store along with the transaction logs, which have the extension .log. Both types of data are stored in an Extensible Storage Engine (ESE) database format.
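The size accounting described above can be illustrated with a short Python sketch. It is not an Exchange tool: the function name `reported_store_size` is hypothetical, and the sketch assumes, purely for illustration, that the EDB file, the STM file, and the transaction logs all sit in one directory.

```python
from pathlib import Path


def reported_store_size(directory, database_name):
    """Sum the EDB file, the STM file, and every .log file in one directory.

    Assumption for illustration only: the database files and their
    transaction logs are stored together in `directory`.
    """
    directory = Path(directory)
    edb = directory / f"{database_name}.edb"   # rich-text file
    stm = directory / f"{database_name}.stm"   # streaming media file
    logs = directory.glob("*.log")             # transaction logs
    files = [edb, stm, *logs]
    # Skip anything that does not exist so a missing STM file is not an error.
    return sum(f.stat().st_size for f in files if f.is_file())
```

Calling `reported_store_size(some_directory, "priv1")` would add together priv1.edb, priv1.stm, and any .log files found alongside them.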

The rich-text file holds messages and works much like the database files in previous versions of Exchange Server. The streaming media file has been added to provide native support for many types of streaming media, including voice, audio, video, and others. To do this, the streaming media file is designed to store files as Multipurpose Internet Mail Extensions (MIME) content, a specification for formatting non-ASCII messages so that they can be sent over the Internet. This means that multimedia content can be delivered to the Exchange server using non-MAPI protocols in the media's native format, stored, and then passed along to clients without ever having to be converted into a MAPI-acceptable format. This minimizes the time needed to deliver the files to the client, helps to reduce network traffic, and eliminates the risk of introducing errors into the media during a conversion process.

Multiple Databases and Storage Groups

Exchange Server 2003 provides support for multiple databases and storage groups on a single server. As outlined earlier in Chapter 1, "Introduction to Microsoft Exchange," Exchange Server 2003 Enterprise Edition allows up to five databases per storage group and up to four production storage groups per server. Each database must exist inside a storage group.
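As a planning aid, the two Enterprise Edition limits quoted above can be checked with a few lines of Python. This is only a sketch: the function name `validate_layout` and the storage group and database names in the sample plan are hypothetical.

```python
MAX_STORAGE_GROUPS = 4        # production storage groups per server
MAX_DATABASES_PER_GROUP = 5   # databases per storage group


def validate_layout(layout):
    """Return a list of problems found in a planned storage layout.

    `layout` maps a storage group name to a list of database names.
    """
    problems = []
    if len(layout) > MAX_STORAGE_GROUPS:
        problems.append(
            f"{len(layout)} storage groups exceeds the limit of {MAX_STORAGE_GROUPS}"
        )
    for group, databases in layout.items():
        if len(databases) > MAX_DATABASES_PER_GROUP:
            problems.append(
                f"{group}: {len(databases)} databases exceeds the limit of "
                f"{MAX_DATABASES_PER_GROUP}"
            )
    return problems


# Hypothetical plan: two storage groups, well inside both limits.
plan = {
    "First Storage Group": ["Mailbox Store", "Public Folder Store"],
    "Executives": ["Executive Mailboxes"],
}
print(validate_layout(plan))  # prints [] because no limit is exceeded
```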

Although each instance of a database runs under the same Web Storage System process, you can mount or dismount individual databases on the fly. This means that you can take one database down for maintenance while others continue to service client requests. Also, each database is checked for consistency when the Web Storage System process starts. Should one database be unable to mount, other databases remain unaffected and will mount normally.

Each storage group is represented by a single instance of the ESE and shares a single set of transaction log files. Whenever a transaction occurs on an Exchange server, the responsible service first records the transaction in a transaction log. Using transaction logs allows for faster completion of the transaction than if the service had to immediately commit the transaction to a database, because the transaction log structure is much simpler than the database structure. Data is written to these log files sequentially as transactions occur. Regular database maintenance routines commit changes in the logs to the actual databases later, when system processes are idle. Consequently, the most current state of an Exchange service is represented by the EDB database and STM database, plus the current log files.

The checkpoint files are used to keep track of transactions that are committed to the database from a transaction log. Using checkpoint files ensures that transactions cannot be committed to a database more than once. Checkpoint files are named edb.chk and reside in the same directories as their log files and databases. Those transaction logs that have been committed to the database are cleared during a database backup (discussed further in Chapter 14, "Backup and Recovery") or by circular logging if configured (discussed further in Chapter 9, "Configuring the Information Store").
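The relationship between the log, the checkpoint, and the databases can be modeled with a toy Python class. This is a heavily simplified sketch and not how ESE is implemented: the class and method names are hypothetical, the "log" is an in-memory list rather than a set of files, and committed entries are purged one by one rather than as whole log files.

```python
class StorageGroup:
    """Toy model of one storage group: shared log, checkpoint, databases."""

    def __init__(self, database_names):
        self.databases = {name: {} for name in database_names}
        self.log = []        # sequential transaction log shared by the group
        self.checkpoint = 0  # index of the first log entry not yet committed

    def record(self, database, key, value):
        """Append a transaction to the log; the database is not touched yet."""
        self.log.append((database, key, value))

    def commit_pending(self):
        """Apply entries past the checkpoint, then advance the checkpoint.

        Because only entries after the checkpoint are applied, calling this
        twice never commits the same transaction twice.
        """
        for database, key, value in self.log[self.checkpoint:]:
            self.databases[database][key] = value
        self.checkpoint = len(self.log)

    def current_state(self, database):
        """Database contents plus transactions that exist only in the log."""
        state = dict(self.databases[database])
        for db, key, value in self.log[self.checkpoint:]:
            if db == database:
                state[key] = value
        return state

    def purge_committed(self):
        """Discard committed log entries, as a backup would (simplified)."""
        self.log = self.log[self.checkpoint:]
        self.checkpoint = 0
```

After `record` is called, the new data is visible through `current_state` even though the database dictionary itself is unchanged, which mirrors the point that the most current state is the databases plus the current logs.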

The use of multiple databases and storage groups allows you to plan your organization's data storage by classifying various types of data or assigning separate databases to more important users. You can learn more about using multiple databases and storage groups in Chapter 9.

Public Folders

Public folders provide centralized storage of just about any type of data that is meant to be accessed by multiple users in an organization. The primary use of public folders is to serve as a sort of discussion forum, allowing users to post and reply to messages in a setting where conversations are threaded by subject. However, public folders can also be used for much more, including the storage of Microsoft Office documents, administrative messages generated by Exchange Server, and even as the basis for advanced workflow applications.

Like other databases in Exchange, a public folder is actually composed of two database files: a rich-text file and a streaming content file. The addition of the streaming content file means that websites can actually be hosted from within a public folder. The HTML, Active Server Pages (ASP), or ASP.NET files reside in the streaming file of the public folder store and are accessible from any web browser using simple URLs. Also, because Exchange stores the websites, pages in the sites can make use of Exchange-specific functionality such as calendars and messaging.

Also like other databases, Exchange Server 2003 supports the storage of multiple public folder stores on a single Exchange server. In addition, Exchange Server 2003 supports multiple public folder trees in an organization.

In versions of Exchange Server prior to Exchange 2000 Server, it was only possible to have one public folder tree, a hierarchy that forms the boundaries of the entire set of public folders available in the organization. Now, you can create multiple public folder trees and thus multiple sets of public folders. There is one caveat, however. When Exchange Server 2003 is installed, a default public folder tree, named All Public Folders, is created. This tree is accessible by all MAPI, IMAP4, NNTP (Network News Transfer Protocol), and web clients. Additional public folder trees will be available only to NNTP and HTTP clients. Additional trees are not accessible by any MAPI clients such as Office Outlook 2003. Additional trees such as these are intended for use as file repositories for groups or projects.
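The access rule for public folder trees can be summarized as a small lookup. This Python sketch is illustrative only: the function name `can_access` is hypothetical, and it assumes that the "web clients" mentioned for the default tree are the same thing as HTTP clients.

```python
DEFAULT_TREE = "All Public Folders"

# Client types that can reach each kind of tree, per the text above.
# Assumption: "web clients" is treated as equivalent to HTTP clients.
DEFAULT_TREE_CLIENTS = {"MAPI", "IMAP4", "NNTP", "HTTP"}
ADDITIONAL_TREE_CLIENTS = {"NNTP", "HTTP"}


def can_access(tree_name, client_protocol):
    """Return True if a client of the given type can reach the tree."""
    if tree_name == DEFAULT_TREE:
        allowed = DEFAULT_TREE_CLIENTS
    else:
        allowed = ADDITIONAL_TREE_CLIENTS
    return client_protocol.upper() in allowed
```

For example, an Outlook (MAPI) user can open the default tree but not an additional tree created for a project team.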

Learn more about the structure, creation, and management of public folders in Chapter 6, "Using Public Folders."

Internet Information Services

One of the great strengths of Exchange Server 2003 lies in the way it supports standard Internet protocols for message transfer. In previous versions of Exchange, the Exchange Server software itself provided and managed the Internet protocols. Now, the responsibility of managing protocol support has been passed entirely to Internet Information Services (IIS), a built-in component of Windows Server 2003. All Exchange Server 2003 protocols are hosted within the IIS process. When Exchange Server 2003 is installed, it enhances the SMTP service built into IIS with a more robust version capable of handling the demanding Exchange routing environment.

Exchange Server 2003 subsystems, such as protocols and storage, can now be placed on separate servers to improve scalability. For this to work, a fast, reliable method of exchanging information between IIS and the Exchange storage system, the Web Store, is needed. This need is met by a component named the Exchange Interprocess Communication Layer (ExIPC). ExIPC is basically a high-performance queue that allows IIS and the Web Store to exchange data. Figure 2.3 illustrates the basic Exchange architecture.

Figure 2.3: Exchange Server 2003 architecture

The Information Store service (a process named store.exe) manages the Information Store on an Exchange server. A single store.exe process runs on each server and hosts one ESE instance for each storage group. store.exe manages processes such as store replication; maintains the ESE databases; and provides protocol stubs, interfaces that allow ExIPC to transfer data between IIS (a process named inetinfo.exe) and the Information Store. As you can see in Figure 2.3, a protocol stub exists for each protocol handled by IIS. The queuing process used by ExIPC is asynchronous, meaning that Exchange is able to allocate memory immediately after one portion of a process finishes.

Installable File System

The Installable File System (IFS) permits normal network client redirectors, such as Exchange, to share folders and items. This is a means of exposing the Exchange Information Store to users and applications on the network. Because your local computer can assign, or map, a drive letter to these resources, standard applications such as Windows Explorer and the Office 2003 suite can access resources in the Exchange Store. A user could, for example, map a drive letter to their mailbox or open a public folder from within Microsoft Word. The primary benefit of the IFS is that it allows clients to access Exchange data with no special software other than standard operating system components.

Note

In Exchange 2000 Server, installation of the Exchange server created an M: drive that served as the portal into the Exchange Store for Windows applications. By default, the M: drive was shared using the share name BackOfficeStorage. This is no longer the case in clean or upgrade installations of Exchange Server 2003. This change was brought about to prevent file-level corruption through direct file access from virus scanning and backup/restoration operations. The Exchange Information Store can still be connected to at \BackOfficeStorage.

Front-End/Back-End Servers

Because Exchange Server 2003 separates its databases from the client access protocols (now managed by IIS), there is now a distinction between store management and protocol management. Exchange now allows administrators to configure front-end servers that handle client access and back-end servers that handle the databases themselves. The front-end server becomes the point of contact for all client applications.

MAPI clients must connect directly to a back-end server and cannot use a front-end server, but other types of clients (POP3, IMAP4, etc.) can. Clients that can connect to a front-end server do so using the following process:

1. The client connects to the front-end server and makes a request using a particular protocol.
2. The front-end server relays the request to the back-end server using the same protocol used by the client.
3. The back-end server returns the requested data.
4. The front-end server returns the data to the client.

This arrangement provides load balancing for servers and also creates a unified namespace for clients.
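The relay process described above can be sketched in Python as two cooperating objects. This is a conceptual model, not Exchange code: the class names are hypothetical, and the dictionary that maps users to back-end servers is an assumption, since the text does not say how a front-end server locates the correct back-end server.

```python
class BackEndServer:
    """Holds the databases; modeled here as a dict of user -> messages."""

    def __init__(self, name, mailboxes):
        self.name = name
        self.mailboxes = mailboxes

    def handle(self, protocol, user):
        # The back-end server returns the requested data.
        return list(self.mailboxes.get(user, []))


class FrontEndServer:
    """Single point of contact that relays requests to back-end servers."""

    def __init__(self, user_to_back_end):
        # Hypothetical lookup table (an assumption of this sketch).
        self.user_to_back_end = user_to_back_end

    def request(self, protocol, user):
        # The client connects and makes a request using some protocol.
        if protocol == "MAPI":
            raise ValueError(
                "MAPI clients must connect directly to a back-end server"
            )
        back_end = self.user_to_back_end[user]
        # The request is relayed using the same protocol the client used.
        data = back_end.handle(protocol, user)
        # The data is handed back to the client.
        return data
```

Because every non-MAPI client talks only to the front-end server, clients see one server name regardless of which back-end server actually holds their mailbox.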

Clustering

An Exchange Server 2003 cluster consists of between two and eight connected computers referred to as nodes. These nodes share a common storage device, such as a RAID-5 array. Exchange Server 2003 can operate in either active/active clustering or active/passive clustering.

This provides a redundant hardware solution, since clients can connect to any node in the cluster rather than to just one computer. Clustering also provides fault tolerance. Should one node in the cluster fail, the Microsoft Cluster Service (MSCS) restarts or moves the services on the failed node to a functional node in the cluster. During scheduled maintenance of a node, an administrator can also manually move services to other nodes, thus reducing or eliminating any client downtime.

When active/active clustering is used, one instance of the clustered resource runs on each of the nodes in the cluster. Exchange Server 2003 supports active/active clustering using two nodes only. If one of the nodes fails, the instance of the clustered resource is transferred to the other node. Although it might seem that using active/active clustering is economical in that it allows you to always use your available servers, it has the disadvantage of not being nearly as reliable or scalable as active/passive clustering.

The preferred type of clustering is active/passive clustering, in which one or more nodes of the cluster are online providing service to clients and one or more nodes of the cluster are online and available to pick up any resources from failed active servers. When one of the active nodes fails, the resources that were running on that node are failed over to the passive node. The passive node then changes its state to active and begins to service client requests. The downside to this clustering model is that the passive nodes may not be used for any other purpose during normal operations because they must remain absolutely available should a failover situation occur. In addition, all of the nodes must be configured identically to ensure that when failover occurs, no performance loss is experienced.

Note

The number of nodes in your cluster depends on the operating system on which the Exchange Server 2003 computer is installed. Active/passive clustering in Exchange Server 2003 is limited to two nodes when Exchange is installed on Windows 2000 Advanced Server Service Pack 4 (or later), four nodes when Exchange is installed on Windows 2000 Datacenter Server Service Pack 4 (or later), and eight nodes when Exchange is installed on Windows Server 2003 Enterprise Edition or Windows Server 2003 Datacenter Edition.
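The node limits in this note, together with the two-node limit for active/active clustering stated earlier, fit naturally into a lookup table. The following Python sketch is only a convenience for planning; the function name `max_nodes` and the exact dictionary keys are hypothetical labels chosen for this example.

```python
# Active/passive node limits by operating system, as listed in the note.
MAX_ACTIVE_PASSIVE_NODES = {
    "Windows 2000 Advanced Server SP4": 2,
    "Windows 2000 Datacenter Server SP4": 4,
    "Windows Server 2003 Enterprise Edition": 8,
    "Windows Server 2003 Datacenter Edition": 8,
}

# Active/active clustering is supported with two nodes only.
MAX_ACTIVE_ACTIVE_NODES = 2


def max_nodes(operating_system, mode):
    """Return the largest supported cluster size for an OS and cluster mode."""
    if mode == "active/active":
        return MAX_ACTIVE_ACTIVE_NODES
    if mode == "active/passive":
        return MAX_ACTIVE_PASSIVE_NODES[operating_system]
    raise ValueError(f"unknown clustering mode: {mode}")
```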

Full-Text Indexing

The Information Store creates and manages indexes for common fields using the Microsoft Search Service. In previous versions of Exchange, searches were conducted on every item in every folder, resulting in long search times for larger databases. With full-text indexing, every word in all mailboxes and public folders is indexed, making searches much faster and more accurate. The service can index all messages, attachments, Microsoft Office documents, HTML files, text files, and even PDF files. Users can also search on document properties of many types of data, including properties such as author, file size, and modification dates.

All searches are passed through Exchange Server 2003, which is responsible for handling security. If users do not have permissions to access particular objects, they are not allowed to bypass this using the Search Service.

An indexed database usually requires around 20 percent more available drive space than a nonindexed database, so you should allow for this when planning your Exchange server hardware. As an example, a 10-GB database that has full-text indexing configured will require an additional 2 GB or so of disk space for the index. You should also be aware that indexing large databases can be quite time-consuming. Because indexes are created for each database, creating multiple databases can often make the indexing process easier.
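The 20 percent rule of thumb is easy to apply during capacity planning. The Python sketch below simply restates that arithmetic; the function names are hypothetical, and the 20 percent figure is the approximate estimate given above, not an exact value.

```python
def index_space_gb(database_gb, overhead=0.20):
    """Estimate the extra disk space (in GB) needed for a full-text index."""
    return database_gb * overhead


def total_space_gb(database_gb, overhead=0.20):
    """Estimate the database size plus its full-text index (in GB)."""
    return database_gb + index_space_gb(database_gb, overhead)


# The 10-GB example from the text: about 2 GB extra, about 12 GB in total.
print(index_space_gb(10))
print(total_space_gb(10))
```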

You'll learn more about how to configure indexing in Chapter 9.