Building a DCAR System


So, how do you go about building a DCAR system? The first steps were outlined already: find out who needs to be involved and what you have to comply with. The technical aspects of DCAR are often relatively easy when compared to those two tasks.

120 Days to Better DCAR

The first step in actually building the DCAR system is usually to set up a baseline. I like 120 days as an initial baseline time period, but you can make it shorter or longer if necessary. During those 120 days, your organization needs to identify who, when, and how you'll design a system to meet your requirements in five areas: retention, deletion, disclosure, supervision, and discovery. This might seem like a daunting task, especially for large companies. However, you don't have to target the entire organization at first.

You should start by identifying the highest risk candidates by evaluating the probability and impact of compliance for those individuals. Your corporate governance and risk management teams can help identify the people whose message traffic is most likely to be subject to these requirements. During this time period, you and your team should be evaluating products and technologies to ease your DCAR deployment.
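The probability-and-impact evaluation described above can be made concrete with a simple scoring pass. The following is a minimal sketch only: the 1-5 scales, the example roles, and the multiplication rule are assumptions for illustration, and your risk management team might weight the factors quite differently.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str          # hypothetical mailbox owner or role
    probability: int   # assumed 1-5 scale: likelihood the mail is subject to a requirement
    impact: int        # assumed 1-5 scale: cost of a compliance failure

    @property
    def risk(self) -> int:
        # Simple probability x impact score (an assumption, not a standard).
        return self.probability * self.impact


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest risk score."""
    return sorted(candidates, key=lambda c: c.risk, reverse=True)


candidates = [
    Candidate("Help desk analyst", probability=1, impact=2),
    Candidate("Chief financial officer", probability=5, impact=5),
    Candidate("Broker-dealer desk", probability=4, impact=5),
]

for c in rank_candidates(candidates):
    print(f"{c.risk:>2}  {c.name}")
```

The people at the top of the list are the ones to bring into the DCAR system during the baseline period; the rest can follow later.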

Archiving Mailboxes And Message Journaling

Back in Chapter 9, Content Control, Monitoring, and Filtering, we talked about journaling messages. Exchange offers some limited journaling features; you can journal mail sent to or received by all mailboxes in a storage group, but this is a pretty broad spectrum that might not meet your actual requirements. Archiving a mailbox is significantly different from journaling e-mail; these two approaches can work together to your benefit.

Archiving services are usually defined as systems that automatically clean content out of a user's mailbox. The cleaning might be triggered by the passage of a specified interval, exceeding a preset storage quota limit, or accumulating a predetermined mass of messages. The best archiving systems also add more flexible rules, like tools for archiving based on message size, subject, or date. Whereas journaling captures all mail all the time, archiving allows users some amount of time (whatever interval elapses between message arrival and archiving) to delete some or all of the content they deem unnecessary. In some cases, users can move content out to a personal folder store (PST) file. In either case, the archiving service won't see the messages, so it won't have anything to store long term. This is an obvious drawback if your goal is to capture all mail, which is why archiving user e-mail is only one part of a complete DCAR system. You can't count on users to choose what e-mail to keep and what to delete; journaling will be required to keep you in compliance with whatever regulations, statutes, and case law apply to you. When you decide to journal e-mail, you'll probably use a combination of Exchange journaling and a service from your DCAR vendor.
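To make the three archiving triggers (interval, quota, and message count) concrete, here is a minimal sketch of how an archiving pass might pick messages. It is not how any particular product works; the class names, thresholds, and the oldest-first rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    received: datetime
    size_bytes: int


@dataclass
class ArchivePolicy:
    # All thresholds are illustrative assumptions, not vendor defaults.
    max_age: timedelta = timedelta(days=90)
    mailbox_quota_bytes: int = 200 * 1024 * 1024
    max_message_count: int = 5000


def select_for_archive(messages, policy, now):
    """Return the messages an archiving pass would move out of the mailbox."""
    # Trigger 1: anything older than the interval is always archived.
    cutoff = now - policy.max_age
    selected = [m for m in messages if m.received < cutoff]
    remaining = sorted(
        (m for m in messages if m.received >= cutoff), key=lambda m: m.received
    )
    # Triggers 2 and 3: while the mailbox is over quota or over the item
    # limit, archive the oldest remaining message.
    while remaining and (
        sum(m.size_bytes for m in remaining) > policy.mailbox_quota_bytes
        or len(remaining) > policy.max_message_count
    ):
        selected.append(remaining.pop(0))
    return selected


now = datetime(2004, 6, 1)
mailbox = [
    Message(datetime(2004, 1, 15), 50_000),  # older than 90 days
    Message(datetime(2004, 5, 20), 80_000),
    Message(datetime(2004, 5, 30), 40_000),
]
policy = ArchivePolicy(mailbox_quota_bytes=100_000)
archived = select_for_archive(mailbox, policy, now)
print(len(archived))
```

Note what the sketch cannot do: a message the user deleted before the pass ran never appears in `messages` at all, which is exactly the gap that journaling closes.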

Note  

To ensure your compliance archive is not being tampered with, you should transfer the contents of the journal mailboxes to write-once read-many (WORM) storage in near real time.

Caution  

Be sure to get the latest information from Microsoft and your DCAR vendor regarding BCC support by both systems, as this has been changing over the releases. Reviewing BCC information allows you to determine when an unscrupulous employee is subverting the open communication process.

Capturing Existing Messages

From a technical standpoint, it's simple to archive and retain your existing messages for compliance purposes. Once you have identified the users for whom e-mail must be retained for compliance purposes, consider putting them into a storage group or store that has journaling enabled. In this way, you aren't journaling e-mail on every server, storage group, and store within your Exchange environment. That's only the start, though: you might have to be able to show that your existing messages weren't tampered with or altered after being created.
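One general technique for showing that stored messages have not changed is to record a cryptographic digest of each message at capture time and compare it later. The sketch below illustrates the idea with SHA-256; it is not a description of what Exchange or any DCAR product does internally, the message IDs are hypothetical, and the manifest itself would need to live on trusted (ideally WORM) storage to carry any weight.

```python
import hashlib


def message_digest(raw_message: bytes) -> str:
    """Return the SHA-256 hex digest of a message's raw bytes."""
    return hashlib.sha256(raw_message).hexdigest()


def build_manifest(messages: dict[str, bytes]) -> dict[str, str]:
    """Map each (hypothetical) message ID to the digest recorded at capture time."""
    return {msg_id: message_digest(raw) for msg_id, raw in messages.items()}


def find_altered(messages: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return IDs whose current content no longer matches the recorded digest."""
    return [
        msg_id
        for msg_id, raw in messages.items()
        if manifest.get(msg_id) != message_digest(raw)
    ]


captured = {
    "msg-001": b"Subject: Q3 results\r\n\r\nNumbers attached.",
    "msg-002": b"Subject: Lunch\r\n\r\nNoon?",
}
manifest = build_manifest(captured)

# Simulate a later alteration of one stored message.
captured["msg-002"] = b"Subject: Lunch\r\n\r\n1 pm?"
print(find_altered(captured, manifest))
```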

There are four distinct components that have to be addressed to ensure records stored within your archiving system will carry the appropriate weight to be considered evidence within the legal system: your Microsoft Windows installation, your Exchange installation, your storage hardware, and your DCAR software. We will discuss storage and DCAR software here; the remainder of this book is dedicated to helping you harden your Exchange servers so that they're trustworthy, so we don't need to talk about that here.

Choosing Storage Hardware

Choosing the kind of storage you use for your DCAR system can be a tough decision. You might be required to implement a WORM system, in which archived data cannot be physically rewritten or modified, for compliance. Alternatively, you might just need to store the data on normal read-write media. In either case you'll want to consider a few different options; some possibilities are listed in Table 17-2.

Table 17-2: Storage Types and Trade-Offs

| Storage | Pro | Con |
| --- | --- | --- |
| Storage Area Network/Fibre Channel NTFS drives | Easy to find and buy; already implemented | Not WORM; expensive |
| Direct Attached Storage/SCSI NTFS drives | Relatively cheap | Not very expandable; not WORM |
| EMC Centera/Compliance Edition | Single-instance storage between compliance and user archives; can be used with WORM; highly scalable; very reliable | Expensive; proprietary |
| Network-Attached Storage (NetApp, Windows Storage Server, and so on) | Can be used with WORM; high performance; highly scalable; low per-GB storage costs | Generally not usable for Exchange storage; might require additional management tools and overhead; might not support your preferred archive solution |
| Magnetic tape | Common and well-understood; scalable | High archiving costs; relatively low performance; not WORM; physically bulky |
| Magneto-optical media | WORM meets most archiving requirements | Expensive; relatively low performance; high per-GB costs; physically bulky |

Notice that even though I listed magnetic tape, the traditional Exchange backup medium, it wouldn't be anywhere near my first choice for a DCAR system. Many organizations have years of content backed up to tape; typically daily and monthly backups of Exchange are kept around for several years. In these cases, it's clear that a subpoena for evidence would prevent a company from legitimately destroying any of those tapes. That means if you know you have compliance requirements, by using tape for your archive medium you're signing your company up for an extremely expensive process of tracking, shipping, archiving, and protecting those tapes, usually in an offsite storage facility. If your company has such a mechanism for keeping tape backups onsite or offsite, you should be prepared for the inevitable: recovering tapes into your DCAR system for legal discovery purposes. You will probably find it more cost-effective to use another storage medium unless you have very long time horizons.

Evaluating DCAR Products

As you begin to evaluate your software choices for a DCAR system, you should carefully consider what you need the software to do. If you're only interested in journaling every message in or out, Exchange's built-in functionality might suffice, but most sites with compliance requirements find that they cannot stop there. First, begin with some basic questions that revolve around the types of supported storage methods; these come first because in many environments, your storage hardware requirements will be dictated by your compliance requirements, and your hardware choice will influence what software you end up with.

  • Does the software support plain old disk volumes, either locally or on storage area networks (SANs)?

  • Does the software support the use of WORM or optical media? (This involves things like not trying to store a changing catalog on a volume type that can't be changed once written.)

  • What underlying database does it use to manage the DCAR system, and where can that database be stored (for example, can you have a centralized database server that all DCAR-enabled servers in your organization can share)?

  • Does the system support native Windows security for applying access controls to archived material?

  • Can stored data be compressed? Can you turn off any compression during storage?

  • Once data has been archived, can you move it from one storage type to another without having to rearchive it? For example, it's handy to be able to archive to direct-attached storage (fast, but expensive) then move it to tape or a Network-Attached Storage (NAS) filer later.

Once you've satisfied yourself that your potential solutions will support the storage systems you need to use, here are some additional questions to consider:

  • What indexing methods are available with the software? Can you create full-text or keyword indexes, or are you limited to searching by message header fields like recipient, sender, or subject line?

  • Does the system support Exchange migration? Can it support mixed-mode Exchange organizations with multiple versions of Exchange active at once?

  • What versions of Microsoft Windows Server are supported?

  • Does the software support double-byte character systems like Japanese and Korean?

Next, consider how you can choose what gets archived. This is important, because in many environments you don't actually need to archive every message, just some of them. Archiving things you don't need costs you money now, but it costs even more money over time. Check to find out if:

  • The archive service targets objects based on received time, age, quota, message size, or other useful criteria (the more criteria, the more flexible archiving will be).

  • The software can filter out specified message types, like read receipts or appointment requests.

  • It's possible to leave shortcuts in place of archived items, and to remove them once the message is expunged from the archive.

  • You can define retention policies that control when content is deleted. This is critical, because without flexible retention policies your archive is doomed to grow forever, taking your IT budget with it.

  • Can users selectively remove items from their mailbox archive? If so, does this removal leave an audit trail?

  • Can access to mailbox archives be delegated, like mailboxes are delegated in Exchange?

Of course, a real DCAR policy has to include all of those PST files that are probably squirreled away on your client workstations and file servers. Corralling, collecting, and importing the plethora of PST files on your company's network is a daunting task. However, it's very important to your legal, compliance, and audit groups that these storage mechanisms are removed from service because PST files can contain information that is well outside the scope of your company's retention policy, like a five-year-old message that should have been shredded two years ago.

Note  

Make sure your vendor provides a way to automatically disable PST use after you've migrated. Optionally, you can disable it yourself by following the steps in Microsoft Knowledge Base article 258277.

If you're using PST files, you should consider the following questions:

  • Can server-based PSTs be imported into the archive? (Better yet, is there a way to scan file servers for PSTs and import them automatically?)

  • Can PST contents be automatically copied to individual users' archives?

  • Can PSTs be manually mapped to users' archives?

  • What happens to PSTs after they are migrated into the archive: are they compressed or deleted?

  • What methods can the users leverage to locate content that was in their PST files? Can they search the archive? Can they browse messages? Can shortcuts be used?
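Before you can import PST files, you have to find them. If your vendor's tools don't scan file servers for you, a rough inventory is easy to script. The sketch below is an illustration only: it walks a directory tree you point it at (the temporary directory here stands in for a hypothetical file share) and lists every PST file with its size. On a real share you would also need to handle permission errors and very large trees.

```python
import tempfile
from pathlib import Path


def find_pst_files(root: Path) -> list[tuple[Path, int]]:
    """Return (path, size in bytes) for every .pst file under root, largest first."""
    found = []
    for path in root.rglob("*"):
        # Match the extension case-insensitively; both .pst and .PST occur in practice.
        if path.is_file() and path.suffix.lower() == ".pst":
            found.append((path, path.stat().st_size))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Build a small throwaway tree that stands in for a file share.
with tempfile.TemporaryDirectory() as tmp:
    share = Path(tmp)
    (share / "alice").mkdir()
    (share / "alice" / "archive.pst").write_bytes(b"x" * 2048)
    (share / "bob.PST").write_bytes(b"x" * 512)
    (share / "notes.txt").write_text("not a PST")
    results = [(p.name, size) for p, size in find_pst_files(share)]

print(results)
```

Sorting by size puts the largest files first, which is usually where the oldest and riskiest content lives.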

After you've identified the criteria for moving content into your DCAR system, you should consider the operational impacts of having all that content moving around. Select a system with the right reporting and inter- and intradepartment billing capabilities by asking these questions:

  • Does the system provide a report mode or evaluation mode before first archiving? This is important because it shows you what would happen before it actually takes place, giving you a quick way to see what the effects will be before you start an archive job.

  • Does the system provide reports for the overall archive size and content?

  • Can you create reports based only on selected Exchange servers, recipients, groups, domains, or organizational units (OUs)?

  • Do the reports contain volumes and capacity?

  • Can you get trend reports for given periods to gauge the growth of the DCAR system for the next fiscal year?

  • Can you query the database or export report information into comma-separated value (CSV) or tab-separated value (TSV) file formats?

  • Can you measure store performance using native Windows performance management tools? Are third-party monitoring and management tools supported? If so, which ones?
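If the product can export archive sizes over time, even a very simple trend line helps with next year's budget. The following sketch fits a straight line to past monthly sizes with ordinary least squares and extrapolates forward. The sample figures are invented, and linear growth is an assumption; real archives often grow faster than a straight line suggests.

```python
def project_growth(monthly_sizes_gb: list[float], months_ahead: int) -> float:
    """Fit a straight line to past archive sizes and extrapolate past the last sample."""
    n = len(monthly_sizes_gb)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(monthly_sizes_gb) / n
    # Ordinary least-squares slope and intercept.
    slope = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_sizes_gb)
    ) / sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + months_ahead)


# Hypothetical archive sizes (GB) at the end of each of the last four months.
history = [100.0, 120.0, 140.0, 160.0]
print(project_growth(history, 12))
```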

Once you've gotten through the server features, you'll need to evaluate the impact of your proposed DCAR solutions on the clients. Any time you have to touch user desktops, the cost of maintaining your solution goes up, and the more you have to fiddle with these machines the more expensive the fiddling becomes. You'll want to know the following:

  • Is any client software required?

  • If a client is required, how can it be deployed? Can it be deployed using Active Directory group policy? Systems Management Server? Exchange forms deployment?

  • Can items be viewed from the archive in the existing Microsoft Outlook client, or do users need a special viewer?

  • Can users restore archived items back into their mailboxes, or is this only available to administrators?

  • Can items from public folders be reviewed and retrieved?

  • Can users view and reply to messages that are stored in the archive?

  • When searching the archive, do searches work within Outlook? Is there a separate client or Web interface that users have to use? If so, does the client or browser interface allow users to reply to messages, open attachments, and perform other basic tasks?

  • Do users of Microsoft Outlook Web Access have access to the same features and functionality as regular Outlook users do?

  • Does the client support Outlook's offline folder (OST) files? What about Cached Exchange Mode in Microsoft Office Outlook 2003?

start sidebar
A Sample Case: Woodgrove Bank

In the last few years, financial services companies have come under scrutiny for unscrupulous behavior. The court of public opinion and the U.S. legal system have both tried and successfully convicted numerous officers and directors of the largest U.S. firms. In many cases these firms must deploy some method of discovery system to expedite the legal production process. However, for companies that have the last five years of e-mail on backup tape, this can be an ongoing nightmare of mammoth proportions.

In the case of Woodgrove Bank, the company had accumulated about four years of backup tapes that needed to be searched for discovery purposes. The bank called on the services and software of KVS, Inc. to produce a solution design and implement software that would ease the burden of discovering and producing legal evidence.

The restored Exchange environment consisted of a separate Exchange restore domain. There were 10 restore servers running Exchange used in the legal restore project that were merged with KVS Enterprise Vault servers to create a new Active Directory forest. The Exchange servers were configured as follows:

  • Server with dual 1 GHz processors, 512 MB of RAM with 18 GB mirrored system disks.

  • Restore storage is direct attached storage (DAS).

    To facilitate this restore project, the KVS Enterprise Vault systems are put into a restore mode, allowing them to archive items newer than a particular date. This ensures that all items restored with a date later than that specified will be immediately archived.

    The company chose a five-server solution (five Building Block servers) for KVS Enterprise Vault to run on. The company provided the following hardware to be used in their Enterprise Vault legal restore project implementation:

  • Server with quad 2.0 GHz Xeon processors, 4 GB of RAM with 18 GB mirrored system disks.

  • SAN storage solution, 60 TB of storage, partitioned into 1TB units.

  • Microsoft SQL Server located on a separate system; the back-end storage solution was SAN.

    Data management is critical in a multiyear restore process; therefore, flexibility is paramount. In this case, once a volume was full it was closed and made read only. Archived data was then targeted at the next allocated volume. Each server had a maximum of 11 TB of storage to manage. In addition, to reduce processing overhead, a special file management feature of KVS Enterprise Vault called collections was used to collect many small files together, so that discovery, business continuance, and disaster recovery (BC/DR) would be easier. In this company's case, all dumpster items (deleted items recovery) were also recovered into the DCAR system.

    The process is still underway and is proving to be a successful attempt to build a DCAR system using legacy tape data that was not content indexed or readily accessible to legal staff or regulators. Looking back, they chose a KVS solution because of its proven scalability in the enterprise and services expertise around this type of restore. Moreover, Enterprise Vault offers the industry's highest level of flexibility in archiving and continues to be the leading e-mail archiving vendor (Gartner's Email Active-Archiving Magic Quadrant, Gartner Group, November 2003). Even though the bank chose the best software and solution, the project is not without problems. Issues around work hours to complete the recovery, missing BCC information, and unknown distribution list memberships have caused problems along the way. Specifically, with BCCs, when a user receives a message as a BCC, there is no easy way to determine that, except by checking the sender's Sent Items folder. However, if the sender didn't save a copy, then the recipient has to be assumed to be the user whose inbox is currently holding the message. Needless to say, proving that can be difficult in a court of law.

end sidebar
 

Finally, managing the solution is straightforward once the content is in the Enterprise Vault. The process of getting it there includes setting separate Exchange systems within a domain, normal tape restore of the databases, and configuring the Enterprise Vault for archive recovery mode. Archive recovery allows the system to archive items newer than the specified date; this ensures items that predate any retention policies are not recovered into the DCAR system.

Finishing Touches

After you've decided on your software solution and hardware storage solution, you need to consider the ramifications of having such a system. There are four challenge areas around the ongoing management of DCAR: indexing, expiry, compliance, and discovery.

Indexing

There's no way around it: indexing can consume lots of CPU cycles and storage space. However, not indexing makes it very difficult for users to find the material they're looking for. Some software maintains a single index for the entire user community and discovery and compliance archive. In these cases managing permissions within the index can be cumbersome, and the size of the index file could become a problem (imagine the size of a catalog or index for 5 TB of mail data!). A single large index is much more vulnerable to corruption, too, and because a damaged index impacts the entire organization's compliance posture, this is something to avoid if possible.

Per-user indexes are better for most applications; in addition, having a separate index for the discovery and compliance archives is valuable. In this case, you're trading file count (and size) for speed and flexibility. For example, when indexing a single document stored in two separate mailboxes, you might index it twice. This should be considered against the risks of having a monolithic index become corrupted, or the potential that someone will be able to exploit loose permissions to see something he or she shouldn't. Besides keeping each index small, per-user indexes mean that if you are forced to rebuild an index for a user's mailbox, you don't have the entire company offline while you are rebuilding it.

Another factor that influences the size of your index files is the amount of data that's indexed. In the case of KVS Enterprise Vault, there are three separate index levels:

  • Brief: The index created by Enterprise Vault enables searches on the following attributes of each item: author, subject, created date, expiry date, file extension, retention category, and original location. If a search matches an attachment to an item, the search result contains both the main item and the attachment. This index is approximately 3 percent of the size of the data that is indexed (3 percent of 5 TB is 150 GB).

  • Medium: At the medium level, the index includes everything from the brief level; in addition, keyword (but not phrase) searches are enabled on the content of each item. This index is approximately 8 percent of the total discovery and compliance archive.

  • Full: Full indexing contains all of the items from the medium level, plus it adds the ability to do phrase searching. This index is approximately 12 percent of the total discovery and compliance archive, or four times the size of a brief index.
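The percentages above can be turned into a quick capacity estimate. The sketch below simply multiplies an archive size by the approximate overhead for each level. The function and constant names are made up for this example, and the use of decimal units (1 TB = 1,000 GB) is an assumption.

```python
# Approximate index overhead per level, as a fraction of the archived data
# (the 3, 8, and 12 percent figures quoted for the three index levels).
INDEX_OVERHEAD = {"brief": 0.03, "medium": 0.08, "full": 0.12}


def estimate_index_size_gb(archive_size_gb: float, level: str) -> float:
    """Estimate index size in GB for a given archive size and indexing level."""
    return archive_size_gb * INDEX_OVERHEAD[level]


archive_gb = 5 * 1000  # 5 TB expressed in GB (decimal units assumed)
for level in INDEX_OVERHEAD:
    print(f"{level:<6} ~{estimate_index_size_gb(archive_gb, level):,.0f} GB")
```

Even at the brief level, the index for a multi-terabyte archive is itself a substantial storage commitment, which is worth factoring into your hardware plan.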

What about attachment indexing? Most of the software on the market will index the attachments, which is good because systems that don't index attachments are basically worthless in discovery proceedings. Osterman Research determined that 60 percent of what an employee needs to do his or her job each day is in his or her e-mail. Therefore, your legal team will realize that 60 percent of the discoverable information is in e-mail. KVS, Inc. has determined that approximately 85 percent of a message store size is made up of approximately 20 percent of the overall messages, those with attachments. Be sure to ask your software vendor for a list of attachment types they support for indexing, too.

Expiring Materials

Once you have a full-fledged archive, you'll need to consider when to have items deleted or not deleted. The decision about what gets deleted is fraught with legal implications (as I discuss in the next section). Your software vendor should allow you to specify multiple, different retention policies to apply to your discovery and compliance archive and your users' mailboxes. Also, some vendors have a selection to allow or disallow storage expiry in your discovery and compliance archive.

You'll need a lot of flexibility in retention categories, such as being able to set a default category for a particular site or allowing users to set a category for a specific folder. In some cases, KVS customers have created Outlook folders that archive their contents into a specific archive, as well as assign a specific category.

In addition to defining your categories, you'll also need to control when the expiration occurs so you don't impact performance during day-to-day operations. Most software vendors should provide a mechanism for scheduling the expiration of content. Beyond scheduling and retention categories, be sure you are meeting with your cross-division team to help define your retention categories in line with the legal and regulatory framework your company is accountable to.
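The idea of a site-wide default category with per-folder overrides can be sketched in a few lines. Everything here is hypothetical: the category names, the retention periods, and the rule that a folder setting wins over the site default are assumptions for illustration, not a description of how any product resolves categories.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionCategory:
    name: str
    keep_days: Optional[int]  # None means "never expire"


# Hypothetical categories; real ones come from your legal and compliance team.
DEFAULT_BY_SITE = {"london": RetentionCategory("Business", 3 * 365)}
FOLDER_OVERRIDES = {"Contracts": RetentionCategory("Legal", 7 * 365)}
FALLBACK = RetentionCategory("Keep forever", None)


def resolve_category(site: str, folder: str) -> RetentionCategory:
    """A folder-level category wins over the site default; unknown sites never expire."""
    if folder in FOLDER_OVERRIDES:
        return FOLDER_OVERRIDES[folder]
    return DEFAULT_BY_SITE.get(site, FALLBACK)


def expiry_date(archived_on: date, category: RetentionCategory) -> Optional[date]:
    """Return the first date the item may be expired, or None if it never expires."""
    if category.keep_days is None:
        return None
    return archived_on + timedelta(days=category.keep_days)


print(expiry_date(date(2004, 1, 1), resolve_category("london", "Inbox")))
print(expiry_date(date(2004, 1, 1), resolve_category("london", "Contracts")))
print(expiry_date(date(2004, 1, 1), resolve_category("paris", "Inbox")))
```

Falling back to "never expire" for anything unrecognized is a deliberately cautious choice: deleting too little is usually easier to explain than deleting too much.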

Compliance: Controls and Review

A large portion of the IT world might think that compliance is limited only to financial firms. However, compliance laws and regulations impact a number of organizations including those in health care, telecommunications, government, automotive, lumber, and pharmaceuticals, among others. With so many different groups being impacted, it is important to evaluate a DCAR system that provides a compliance view into the stored information.

Compliance differs largely from normal searches or discovery in that it requires a positive audit trail, involves screening a portion of e-mail, and can typically be done postdelivery. Some companies are evaluating real-time prescreening capabilities, but should be wary of the business communications impact. Compliance screening might be burdensome to larger enterprises with thousands of employees. In a small company with 1000 users, there are nearly 1 million communications paths between users.
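The path count quoted above is simple combinatorics. The "nearly 1 million" figure appears to correspond to counting each sender-to-recipient direction separately; the sketch below shows both that count and the smaller count of unordered pairs. The function names are made up for this example.

```python
def directed_paths(users: int) -> int:
    """Count sender -> recipient pairs, treating A->B and B->A as distinct."""
    return users * (users - 1)


def undirected_pairs(users: int) -> int:
    """Count unordered pairs of users who could correspond."""
    return users * (users - 1) // 2


print(directed_paths(1000))    # 999000
print(undirected_pairs(1000))  # 499500
```

Because the count grows with the square of the user population, a company ten times larger has roughly a hundred times as many paths to screen.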

A good compliance environment maintains a chain-of-custody log of what is in a result set and what portion of the result set has been audited by the compliance department. Detailed reports are therefore required only during a regulatory audit or legal attestation to a process, and most compliance tools maintain an executive-level view of what has been reviewed, what hasn't been reviewed, and which items are questionable.

In addition, because each e-mail is an item of record, the original e-mail must be reviewed and managed, unchanged from the original format.
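An executive-level view of review progress is essentially a count of items by status. The sketch below shows one way to produce such a summary; the status names and the `ResultItem` structure are assumptions for illustration, and a real compliance tool would also record who reviewed each item and when.

```python
from collections import Counter
from dataclasses import dataclass

STATUSES = ("reviewed", "unreviewed", "questionable")  # assumed status values


@dataclass
class ResultItem:
    message_id: str  # hypothetical identifier
    status: str      # one of STATUSES


def summarize(result_set: list[ResultItem]) -> dict[str, int]:
    """Count items per review status, reporting zero for unused statuses."""
    counts = Counter(item.status for item in result_set)
    return {status: counts.get(status, 0) for status in STATUSES}


result_set = [
    ResultItem("msg-001", "reviewed"),
    ResultItem("msg-002", "unreviewed"),
    ResultItem("msg-003", "questionable"),
    ResultItem("msg-004", "reviewed"),
]
summary = summarize(result_set)
print(summary)
```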

We Are Getting Sued, Now What?

If your organization is the subject of litigation, do not delete anything. Willful destruction of evidence is not tolerated (see McCabe v. BAT). You should consider putting your discovery and journal archives on hold from storage expiry and setting business-critical retention categories to a hold status for deletion.
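The essential property of a hold is that it overrides expiry: an item whose retention period has lapsed must still survive the expiry run while the hold is in place. The following minimal sketch illustrates that rule. The `ArchivedItem` structure and function names are hypothetical, and a real system would scope holds to specific matters and log every change.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchivedItem:
    item_id: str       # hypothetical identifier
    expires_on: date
    on_hold: bool = False


def expirable(items: list[ArchivedItem], today: date) -> list[ArchivedItem]:
    """Return items whose retention has lapsed and that are not under a hold."""
    return [i for i in items if i.expires_on <= today and not i.on_hold]


def place_hold(items: list[ArchivedItem]) -> None:
    """Mark every item as held so that expiry runs skip it."""
    for i in items:
        i.on_hold = True


archive = [
    ArchivedItem("a", date(2003, 1, 1)),
    ArchivedItem("b", date(2010, 1, 1)),
]
today = date(2004, 6, 1)

before = [i.item_id for i in expirable(archive, today)]
place_hold(archive)
after = [i.item_id for i in expirable(archive, today)]
print(before, after)
```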

You'll need a tool to manage your discovery environment. Most search interfaces are designed for users of individual mailboxes, not legal discovery. The purpose of having something designed for legal discovery is the ability to produce evidence for a judge, regulatory commission, or opposing counsel. Once the evidence is produced, you can save the discovery and assign permissions to your legal staff so they can determine which evidence might be incriminating. The legal discovery tools for e-mail are getting better as more and more cases are demanding electronic mail as evidence.

Note  

Remember that PST migration? Well, judges, lawyers, and regulatory agencies know those files are out there and might require a full discovery of those, too.

In some scenarios, you might have to do a discovery on the journal and a user's mailbox. All in all, discovery is a process and should not be managed using a search interface designed for end users' mailbox archiving. At the end of the day, always apply common sense, as a jury of your peers will.




Secure Messaging with Microsoft Exchange Server 2003
ISBN: 0735619905
Year: 2004
Pages: 189
