Directory Service Design

In this section we discuss Big State's directory design and how it was developed.

Needs

The main applications driving Big State's directory deployment were the online phone book and the bigstate.edu email service. The phone book application required that the directory be populated with up-to-date white pages information about university faculty, staff, and students. Big State wanted to provide this information to internal and external users. The end users would be able to modify some of the information about themselves; other information would come only from official university data sources. Because this application would be user-driven, response time would be important, but overall aggregate performance requirements would be minimal. For example, Big State expected the phone book application to receive on the order of a few thousand accesses per day at most.

The bigstate.edu email service was designed to bring some order to the chaotic post-mainframe email environment emerging at Big State. Ever since users began leaving the mainframe system, literally dozens of local email systems had been popping up all over the Big State campus. Big State had no authority or desire to dictate the email systems used on campus; it just wanted to provide the addressing consistency users enjoyed in the mainframe days while still maintaining the email diversity required by today's users. This was the purpose of the bigstate.edu email service, whose architecture is illustrated in Figure 24.1.

Figure 24.1 The architecture of the bigstate.edu email service.

The bigstate.edu email service was designed to give everyone at Big State a short, consistent, easy-to-remember email address in the form firstname.lastname@bigstate.edu or userid@bigstate.edu. This address follows a Big State email user during his or her entire association with the university, regardless of the local system on which the user might actually receive mail.
Naturally, a level of indirection is needed to insulate an email address from the user's actual location; the key component of this service is the directory. To deliver mail to email addresses based on names, the system required a name collision policy. To deliver to an address based on user ID, it required a mechanism for maintaining campus-wide unique user IDs.

Another key feature of the service is the ability for end users to create groups or mailing lists that receive mail at the bigstate.edu domain, a feature users of the mainframe system enjoyed. To implement email groups with a directory service, the directory deployment team developed specialized client software for creating and updating groups, and it imposed additional schema and performance requirements.

The overall performance requirements of the directory were driven by the most demanding directory application, which turned out to be the bigstate.edu email service. The Big State directory designers estimated that two-thirds of its 75,000 users would be active email users, receiving an average of three messages per day. If it takes three directory operations to deliver each piece of mail, you can see the load on the directory is substantial. Throwing mailing lists into the mix increased the directory load even more. At Big State, users create more than 40,000 mailing lists, some with thousands of members (although the average group contains approximately 150 members). Delivering mail to these lists imposes a much greater load on the directory, and their creation and maintenance imposes an additional update load. This affects general directory performance as well as replication performance.

Data

The primary drivers behind the directory service revolved around the directory-enabled applications slated for deployment. These applications had well-defined schema requirements, which made for simpler identification of the necessary data elements.
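The email load estimate quoted in the Needs discussion can be worked through explicitly. The following is a minimal sketch that uses only the figures given in the text (75,000 users, two-thirds active, three messages per user per day, three directory operations per message); the function and constant names are illustrative, and the result is a daily average that ignores mailing-list expansion and peak-hour bursts.

```python
# Back-of-the-envelope estimate of directory load from person-to-person mail.
# All figures come from the text; names are illustrative only.
TOTAL_USERS = 75_000
MESSAGES_PER_USER_PER_DAY = 3
DIRECTORY_OPS_PER_MESSAGE = 3
SECONDS_PER_DAY = 24 * 60 * 60


def daily_directory_ops(total_users=TOTAL_USERS,
                        messages_per_user=MESSAGES_PER_USER_PER_DAY,
                        ops_per_message=DIRECTORY_OPS_PER_MESSAGE):
    """Return the estimated number of directory operations per day."""
    active_users = total_users * 2 // 3  # two-thirds are active email users
    return active_users * messages_per_user * ops_per_message


ops_per_day = daily_directory_ops()
average_ops_per_second = ops_per_day / SECONDS_PER_DAY

print(f"{ops_per_day:,} operations per day")
print(f"{average_ops_per_second:.1f} operations per second on average")
```

Under these assumptions the baseline is 450,000 operations per day, or about five per second averaged over a full day. Real traffic is unlikely to be spread evenly, and the mailing lists described above add to this figure.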
The broadest data requirements came from the online phone book, which aims to provide access to a wide range of the usual white pages information, such as name, title, address, phone number, organization, and so on. More-focused requirements came from the bigstate.edu email service, which needs local email addresses for users.

Data that might be useful for seeding the phone book application's white pages information was available from a number of sources within the university. In fact, Big State staff recently conducted a survey in which they identified 17 official university databases holding name and address information. Nevertheless, the directory designers identified two main sources of information: the university's Personnel database for faculty and staff and the Registrar's database for students.

These databases were chosen for political as well as technical value. The Personnel database had one particularly attractive quality: It was already used to publish the printed university faculty and staff directory. This made it easy to convince the keepers of this data that it should be released for the purposes of creating an online directory. The student data was also released after a similar argument was made regarding student phone information already available in the campus locator phone service. One helpful practice was that the personnel department and the registrar both provided ways for users to request that their information be left out of any publications. This made everyone feel that enough choice had been given to users who did not want their information published.
After an agreement was reached on the data to populate the directory with, procedures were developed for actually obtaining the data, reformatting it correctly, and augmenting it for inclusion in the directory. Procedures were also developed for maintaining the data through subsequent feeds from the source database. These procedures are described in more detail later in this chapter.

Another important kind of information required by the bigstate.edu email service was local email address information for users. This information proved to be much harder to obtain than the white pages information. The reason for this was simple: There is no centrally maintained database containing the required information. Instead, it is scattered around the campus in databases or applications maintained by local system administrators, and sometimes by users themselves.

To overcome this problem and populate the directory with useful information, the Big State designers took two courses of action. First, they worked directly with administrators of the larger systems on campus to develop tools to extract email addresses from their systems. Second, they developed a program with which campus administrators could submit lists of email addresses to the directory. Campus administrators had an incentive to do this because their users would be able to use the bigstate.edu email service. (A future email service developed by Big State and described later in this chapter automatically updated user email addresses at user registration time.)

Another category of data that Big State could store in the directory was administrative data. These data elements contain information used to manage other data elements. For example, Big State includes an expires attribute indicating when an entry scheduled for deletion will be removed. Another example is the noBatchUpdates attribute, which is used to indicate that a user does not wish his or her entry to be updated from official data sources.
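To make the role of these administrative attributes concrete, here is a minimal sketch of how a feed-processing job might honor them. It is not Big State's actual procedure: the in-memory dictionary standing in for the directory, the "TRUE" flag value, the use of Python date objects for expires, and the function names are all assumptions made for illustration.

```python
from datetime import date


def apply_feed(directory, feed_records):
    """Merge records from an official source into the directory.

    Both arguments map a shared key to a dict of attributes (hypothetical
    layout). Entries flagged noBatchUpdates are left untouched.
    Returns the keys that were added or updated.
    """
    changed = []
    for key, new_attrs in feed_records.items():
        entry = directory.get(key)
        if entry is None:
            directory[key] = dict(new_attrs)   # new person: create the entry
            changed.append(key)
        elif entry.get("noBatchUpdates") == "TRUE":
            continue                           # user opted out of batch updates
        else:
            entry.update(new_attrs)            # refresh from the official source
            changed.append(key)
    return changed


def purge_expired(directory, today):
    """Delete entries whose expires date has been reached; return their keys."""
    expired = [key for key, entry in directory.items()
               if "expires" in entry and entry["expires"] <= today]
    for key in expired:
        del directory[key]
    return expired


directory = {
    "1001": {"cn": "Barbara J Jensen 1", "title": "Professor"},
    "1002": {"cn": "Sam Carter 1", "title": "Analyst", "noBatchUpdates": "TRUE"},
    "1003": {"cn": "Old Entry 1", "expires": date(2000, 1, 1)},
}
feed = {
    "1001": {"title": "Dean"},
    "1002": {"title": "Manager"},
    "1004": {"cn": "New Person 1", "title": "Lecturer"},
}
changed = apply_feed(directory, feed)
removed = purge_expired(directory, date(2000, 6, 1))
```

In this sketch the opted-out entry keeps its old title while the other entries are refreshed, and the entry whose expiration date has passed is removed in a separate pass.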
As advised in Chapter 6, "Data Design," the Big State directory designers created a table showing the information to be contained in the directory, its source, and who owns the information. This information determined the Big State directory data source diagram, which is shown in Figure 24.2.

Figure 24.2 Big State directory data source diagram.

Schema

The schema used in the central directory consists of two basic sets of schema definitions, one representing people and the other groups. The schema for representing people is taken from the standard person schema definition, extended with a few extra fields required by the Big State deployment. For example, Big State added a universityID attribute to hold the university-wide unique identification number. This attribute is used as a common key with external data sources, allowing entries in the directory to be matched up with the corresponding data from an external source.

Other new attributes were added to help keep track of various data handling and other procedures. For example, attributes are used for tracking data sources, noting the expiration time of entries, controlling whether entries are updated from corporate data sources, and other purposes. Attributes were also created to facilitate Big State's directory authentication scheme based on Kerberos, as described later in this chapter, as well as its proxy access control scheme.

The schema for representing groups was created from scratch in conjunction with the design of the bigstate.edu mail routing software. This software was written and designed by Big State staff because no commercial software found at the time satisfied the requirements. The existing standard group schema definitions also proved to be inadequate. For example, the standard group definition requires every member of a group to have a directory entry, making it difficult to create mailing lists that include non-university members.
The Big State group definition, on the other hand, allows for both directory and email members. The group schema definition designed by Big State is shown in Listing 24.1 (it is somewhat abbreviated and annotated for clarity).

Listing 24.1 Big State group schema definition

    objectclass rfc822MailGroup
        requires
            objectClass,
            owner,                  # DN of the owner of the group
            cn                      # used to name the group entry
        allows
            associatedDomain,       # domain name associated with the group
            joinable,               # flag indicating if others can join
            mail,                   # email members
            member,                 # directory members
            memberOfGroup,          # used for nested groups
            moderator,              # moderator of the group
            requestsTo,             # DN to receive list maintenance mail
            rfc822RequestsTo,       # email to receive -request mail
            rfc822ErrorsTo,         # email address for delivery reports
            errorsTo,               # DN to receive delivery errors
            suppressNoEmailError,   # flag indicating if no members are ok
            ...                     # other attributes
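The distinction between directory members and email members can be illustrated with a small sketch of how mail routing software might turn a group entry into a recipient list. This is not the Big State mail router: the dictionary layout, the example DNs, the assumption that each person entry carries its local address in a mail attribute, and the function name are hypothetical, and nested groups (memberOfGroup) are deliberately left out.

```python
def expand_group(group, people):
    """Resolve a group entry into delivery addresses.

    group  -- dict with optional 'member' (list of DNs) and 'mail'
              (list of email addresses) values, as in Listing 24.1
    people -- dict mapping a person's DN to that person's attributes
              (hypothetical stand-in for a directory lookup)

    Returns (sorted unique addresses, DNs that could not be resolved).
    """
    recipients = set(group.get("mail", []))      # email members are used as-is
    unresolved = []
    for dn in group.get("member", []):           # directory members need a lookup
        address = people.get(dn, {}).get("mail")
        if address:
            recipients.add(address)
        else:
            unresolved.append(dn)
    return sorted(recipients), unresolved


# Hypothetical data for illustration only.
people = {
    "cn=Barbara J Jensen 1,ou=People": {"mail": "bjensen@cs.example.edu"},
    "cn=Sam Carter 1,ou=People": {},             # no local address on file
}
group = {
    "cn": "LDAP Study Group",
    "member": ["cn=Barbara J Jensen 1,ou=People", "cn=Sam Carter 1,ou=People"],
    "mail": ["guest@example.org"],               # non-university member
}
addresses, unresolved = expand_group(group, people)
```

The email member is delivered to directly, which is what lets a list include people with no directory entry, while directory members cost one extra lookup each.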
Namespace

Two future requirements led Big State to the namespace design it chose. First, Big State wanted the directory to be extensible so it could store other kinds of objects (not just users and groups) as future applications arose. This requirement led Big State to choose the high-level partition-by-object-type namespace recommended in Chapter 8, "Namespace Design."

Second, Big State imagined that at some later time it might want to partition and delegate portions of the directory to different units on campus. The medical campus and the College of Engineering were two likely candidates with the desire and necessary expertise to maintain portions of the directory. To facilitate this future possibility, Big State eschewed the advice to create a flat namespace; instead, it opted for a namespace in the people portion of the tree based on organizational hierarchy. To its credit, Big State let this hierarchy descend only one level. Because of difficulty in matching up the two data sources (one for faculty and staff, and one for students), Big State also decided to separate this data using the namespace (see Figure 24.3).

Figure 24.3 The Big State directory namespace.

To name individual people entries, Big State chose people's actual names. The names, taken from the official university staff and student databases, were constructed whenever possible to include a first name, middle initial, and last name (e.g., Barbara J Jensen). This was done in an effort to reduce the likelihood of name collisions. Recall that Big State wanted to reserve a unique email address based on name for each user. When collisions do occur, uniqueness is guaranteed through data maintenance procedures that append a number to each name. For example, the first Barbara J. Jensen who comes to the university would be given the number 1. If another Barbara J. Jensen arrives, she would be given the number 2.
The two entries would be named using the relative distinguished names cn=Barbara J Jensen 1 and cn=Barbara J Jensen 2. These data maintenance procedures turned out to be rather complicated, as described in "20-20 Hindsight: Data Population" earlier in the chapter.

Although not part of the namespace, Big State also maintained a userid attribute that was unique across the directory. The attribute was populated and maintained from an existing database of campus-wide login names.
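The numbering rule for name collisions can be sketched in a few lines. This is only an illustration of the rule as stated in the text, not the actual Big State maintenance procedure (which the chapter describes as considerably more complicated); the function name is hypothetical, and the sketch assumes the next number is always one more than the highest number already assigned to that name.

```python
import re


def next_rdn(full_name, existing_rdns):
    """Return the next free RDN of the form 'cn=<full name> <number>'.

    existing_rdns -- iterable of RDN strings already present in the container.
    """
    pattern = re.compile(r"^cn=" + re.escape(full_name) + r" (\d+)$")
    used = []
    for rdn in existing_rdns:
        match = pattern.match(rdn)
        if match:
            used.append(int(match.group(1)))
    return f"cn={full_name} {max(used, default=0) + 1}"


first = next_rdn("Barbara J Jensen", [])
second = next_rdn("Barbara J Jensen", [first])
print(first)
print(second)
```

Basing the next number on the highest existing one, rather than on a count of current entries, avoids handing out a number that a departed user once held, which matters if the name-based email address is meant to stay unique over time.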
Topology

The topology design of the Big State directory service was driven by the requirements of the applications. These applications need to search the people and group portions of the namespace, so those portions of the directory need to be kept together to make these searches efficient. Big State's network is relatively fast and well-connected, indicating no need to partition the directory for performance reasons. Although, as mentioned earlier, Big State had thoughts of delegating portions of the directory to other units on campus, there was no immediate need to do so. Therefore, Big State decided to keep the directory together in a single server, making the topology very simple.

Replication

Two requirements drove the Big State directory replication design. The first requirement was for the service to always be up and available; the online phone book and email services depending on the directory are mission-critical and must be as available as possible. The second requirement was a certain high level of performance. The directory had to have sufficient capacity to support the directory-enabled applications using it, including the online phone book and email applications driving the directory's deployment, as well as the additional applications that would be deployed later. A replication architecture that would support this kind of incremental capacity increase was an explicit goal.

In the Big State directory replication architecture, a single-master server handles updates and feeds directory replicas serving various directory-enabled applications (see Figure 24.4). Initial deployment plans called for two small replicas to serve the online phone book application and three large replicas to serve the three directory-enabled email machines providing the bigstate.edu mail service.
Partitioning directory usage based on the type of application makes it easier to track directory usage, bring down parts of the service for maintenance without affecting the rest of the service, and increase capacity when needed.

Figure 24.4 The Big State directory replication architecture.
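The idea of partitioning replicas by application, with all updates going to the single master, can be sketched as a small routing table. The host names, the class, and the round-robin selection are assumptions made for illustration; the text does not say how Big State's clients were actually pointed at their replicas.

```python
from itertools import cycle

# Hypothetical host names; the replica counts follow the initial plan in the text.
MASTER = "ldap-master.example.edu"
REPLICA_POOLS = {
    "phonebook": ["pb-replica1.example.edu", "pb-replica2.example.edu"],
    "email": ["mail-replica1.example.edu",
              "mail-replica2.example.edu",
              "mail-replica3.example.edu"],
}


class DirectoryRouter:
    """Pick a server for a request: writes go to the master, reads to a replica."""

    def __init__(self, master, pools):
        self._master = master
        # One independent round-robin iterator per application pool.
        self._pools = {app: cycle(hosts) for app, hosts in pools.items()}

    def server_for(self, application, is_write=False):
        if is_write:
            return self._master                # single-master: all updates go here
        return next(self._pools[application])  # reads rotate through the app's pool


router = DirectoryRouter(MASTER, REPLICA_POOLS)
picks = [router.server_for("phonebook") for _ in range(3)]
write_target = router.server_for("email", is_write=True)
```

Because each application has its own pool, adding capacity for one application means appending a host to one list, and taking a phone book replica down for maintenance leaves the email pool untouched.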
Privacy and Security

Privacy and security were paramount concerns for the Big State directory designers. In a university, the general computing environment is relatively open, and Big State has no firewall to protect services such as the directory from the Internet at large. This means the directory service is open to access as well as attack. Unfortunately, the university population includes a large number of students, some of whom are notorious for having too much time, cleverness, and mischievous intent on their hands. These factors combine to produce an environment rife with an impressive array of threats to directory security and privacy.

Because the Big State directory provides a white pages service, it contains personal information about directory users, information that must be protected. The directory also serves various directory-enabled applications that are considered critical to the mission of the university. Making sure the applications have secure access to accurate directory data is a requirement.

Most of the attributes held in the directory need to have their integrity protected. This means that directory clients must be assured that the information they read from the directory is authentic. A few attributes also need their privacy protected, such as the universityID attribute, which often contains a user's United States Social Security number. This attribute should be accessible only to directory administrators and select directory content administrators such as the help desk. In addition to privacy requirements, all attributes need to be protected from unauthorized tampering.

The Big State directory designers constructed an access control scheme that separates the directory attributes into categories with different security requirements, meeting all these requirements. ACLs were constructed to protect each category appropriately.

Another requirement was to support delegated administration.
Many faculty and staff members do not have the time or expertise necessary to update their own information in the directory, and they wanted to delegate this task to a departmental administrator or secretary. The Big State directory designers constructed a proxy access control scheme to make this possible. This scheme worked by defining an ACL allowing any distinguished name listed in the special proxy attribute of an entry to have appropriate access to the entry. This way, users can control access to their own entries simply by adding an attribute value. There is no need to modify any directory ACLs.

One important security issue was that Big State wanted to be able to leverage the existing campus Kerberos authentication service for the directory. By "kerberizing" the directory, the designers avoided designing a new authentication system and distributing and maintaining new passwords. Also, using Kerberos allowed the many thousands of Kerberos users on campus to begin their directory life with a password they already knew. This proved to be a great boon to directory use on campus. The only downside was that it required special development on both directory servers and clients. Big State found that even today no directory products support Kerberos out of the box, significantly adding to the cost and difficulty of maintaining and upgrading the existing service.

Privacy is an equally important concern in the directory. In a university environment, users are accustomed to having more control over their personal information than they might have at a big corporation. Big State is no exception, so the directory designers set out to design a system that provides maximum flexibility for directory users. This included the ability for users to opt out of the directory entirely or to hide or publish various attributes such as home address information.
This capability was accomplished through the use of content-based ACLs, which are similar to the targetfilter capability of Netscape Directory Server described in Chapter 11, "Privacy and Security Design."
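The proxy scheme described earlier in this section is itself an example of an access decision that depends on entry content. The check it implies can be sketched as follows; this is an illustration of the rule, not the ACL syntax of any directory server. The entry layout, the example DNs, the function names, and the simplified DN normalization (real DN matching follows more detailed rules) are all assumptions.

```python
def normalize_dn(dn):
    """Crude DN normalization: trim spaces around commas and ignore case."""
    return ",".join(part.strip() for part in dn.split(",")).lower()


def may_modify(requester_dn, entry):
    """Return True if the requester is the entry's owner or a listed proxy.

    entry -- dict with a 'dn' string and an optional 'proxy' list of DNs
             (hypothetical in-memory stand-in for a directory entry).
    """
    requester = normalize_dn(requester_dn)
    if requester == normalize_dn(entry["dn"]):
        return True                                    # users may edit themselves
    proxies = {normalize_dn(dn) for dn in entry.get("proxy", [])}
    return requester in proxies                        # delegated administrators


# Hypothetical entry for illustration only.
entry = {
    "dn": "cn=Barbara J Jensen 1,ou=People,dc=example,dc=edu",
    "proxy": ["cn=Dept Secretary 1, ou=People, dc=example, dc=edu"],
}
owner_ok = may_modify("CN=Barbara J Jensen 1,ou=People,dc=example,dc=edu", entry)
proxy_ok = may_modify("cn=Dept Secretary 1,ou=People,dc=example,dc=edu", entry)
stranger_ok = may_modify("cn=Someone Else 1,ou=People,dc=example,dc=edu", entry)
```

The point of the design is visible in the data: granting or revoking a delegate is an ordinary attribute change on the user's own entry, so no administrator has to touch the ACLs.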
2002, O'Reilly & Associates, Inc.