Data Security Primer | Microsoft SQL Server 2005: The Complete Reference: Full Coverage of all New and Improved Features

I find it necessary to actually define the difference between data integrity and data security. No one defines the difference better than database guru C.J.Date does: “Security means protecting the data against unauthorized users; integrity means protecting it against authorized users.” With the data integrity definition out of the way, we can deal with the subject of data security, unencumbered by misconceptions.

Objects and Ownership

The meaning of data security comes into focus when you explore the concept and philosophy of ownership. If you have a background in network or general computer security, the concept of data ownership will be a lot clearer because objects and ownership are at the root of network and information technology security-especially on Windows operating systems and the NT file system (NTFS).

It also helps that your data “Gestalt” is formed around the concept of objects. In other words, everything that makes up your data environment comprises objects. As long as a user (human or otherwise) possesses a data object, the attributes of the objects fall under certain laws of ownership, and ownership is what makes us human (and a good reason why Communism has essentially failed).

Everything in a database management system is an object-from the service providers; to the Network-Libraries; to collections of stored procedures, functions, and more. The database is also an object. It has attributes, such as tables, that are objects. A table is also an object that contains attributes. The attributes of a table are the rows, or tuples, which are themselves objects with attributes.

If you keep going down the “object hierarchy,” you will keep discovering objects, until you get down to the world of subatomic matter, which is also understood in terms of objects and ownership. We can think of the hierarchy as an object chain, an important concept in SQL Server security we will later discuss.

Data objects, as a collection, become information at a point when the combined objects and their collections, seen as the sum of their parts, yield facts. And information, suggesting a substance of value, is also an object of ownership. Ownership, no doubt, was the reason “databases” were created hundreds of years ago, when people needed a system determining ownership. They may have used caves, or clay pots, or holes in a tree as repositories, but databases they were, no more and no less.

Trust

Adopting the view that objects have owners, software engineers can enforce rules of ownership on objects much as banks protect money. You are not the owner of the object until you can prove you are, via identity. Sure, if you deposit money in a bank account, you are the de facto owner of it because you created the account and placed the money in it. You created the credit balance with the bank. But you do not get access to the money once it is placed in the custody of the bank unless you can prove to the bank clerk that you are the owner. Like the bank teller, the database needs to know that you are who you say you are.

This brings us to the concept of trust. If a system, such as a database, is able to validate your identity (you are who you say you are), you are allowed to interoperate or interact with the system under a certain level of trust. Each system rates its level of trust differently. Some systems, even people, trust a person fully (which you may argue is foolish). Other systems delegate levels of trust. The higher the trust relationship, the more rights of access you have.

Rights and Permissions

The more you are trusted, the more access you will get to objects and information (such as by being trusted with the delete permission). For example if you can be fully trusted, you are given access to information and functionality, both on a network and in a DBMS. The access is a permission, which is not a finite quantity; the rule is the need-to-know. You can be assigned certain levels of permission on a need-to-know basis. In a database, you might be permitted to read some data but not delete it. On a file system, you might have permission to read some files but not to delete them. The permissions model is illustrated in Figure 5–1.

image from book
Figure 5–1: Permissions protect the object

Permissions are not the same thing as rights. We all have rights-given to us by God and country and our systems administrators. Often those rights (even the God-given ones) are withheld from us. If you have ever had cause, as I have, to argue for or claim your human rights, you will know that the difference between rights and permissions is as profound as night is from day.

The Windows Server 2003 security system distinguishes rights from permissions. The “right to log on locally” may be assigned to you. But having permission to log on locally is another matter altogether because permission to do so overrides the right. It is important as a DBA to distinguish between the two; it will make your life a lot easier when dealing with people who like to show muscle.

What kind of security soup does this all boil down to? To sum up, everything in a database, including the database, is an object. Objects have owners, and owners are usually allowed to do whatever they want to the objects. But just because you think you have the right to do whatever you want, with what you own, that doesn’t mean you can. Society in general, and database administrators in particular, have the final word on what’s permitted. The rules of society do not permit you to beat up your children, just because you “own” them. And business and database rules might forbid you from deleting databases and data, whether you own them or not.

Change of Ownership, Schemas and Security

Change or transfer of object ownership is perfectly feasible in computer and software engineering. Transferring ownership of databases objects is as simple as transferring the ownership of a file, as you will soon see. However, keep in mind that ownership can usually only be given or transferred. It can only be taken in most cases by a system or database administrator, or a security principal (a user or computer account) with the highest level of trust and permission of a security system, such as a file system or DBMS.

Databases and their objects can now be managed under a namespace mechanism that is granted ownership of the database and all its objects in SQL Server. Users and roles are given access to schema namespaces so that they can manage a database and its objects without exclusive ownership of the objects, as was the case with SQL Server 2000.

If a user leaves town, you can simply transfer ownership of the schema to another user without affecting the database and its objects. Schemas are created in SQL Server Management Studio (SSMS) under the Security Page in Database Properties. Or they can be created and managed in T-SQL with the CREATE SCHEMA statement. Here is an example:

 CREATE SCHEMA name AUTHORIZATION owner

You can also create and manage schemas interactively under the Security node of a database.

Authentication

You cannot just walk in off the street and lay claim to property that is in the trust of an authority, be it a bank, a network folder, or a database. You first have to prove you are the owner-not the same thing as proving identity.

Ownership can only be contested, however, after identity is verified-and this is called authentication. The simplest mechanism for proving identity to a network, proving that you are who you say you are, is by providing a user identification (user ID) and the password that goes with it. This combination of elements is often called the “ID-password pair.” Once your identity can be proved, you are assigned a level of trust with which to go about your business, in the bank, on the network, in the database. But you are still prevented from accessing objects you have no permission to access.

Note

Windows Server 2003 provides several new mechanisms beyond the simple ID-password pair for proving identity. The operating system now has support for smart card readers, as well as for biometric devices that scan retinas, fingerprints, and voiceprints, and can verify signatures, hand geometry, and so on. These devices are underpinned by certificates that store electric signatures and personal identification numbers. Using certificate-driven security (encryption services, keys, and so on) requires a certificate auihority (the server architecture that services a public key infrastructure, or PKI).

Checking that the security principal has ownership of a certain object confirms or rejects ownership. Usually, if you create an object, you have full rights to it. But this rule cannot be stretched too far. Just because you created an object on the network, such as an important document or database information, that does not give you the right to delete it. In fact, some employee manuals go so far as to state that anything you create automatically belongs to the corporation and that you have to comply with policy with respect to deleting or altering what you created.

I was involved in a case recently when an employee managed to encrypt a database and refused to hand over the password when her employment was terminated. It took us half a morning to crack the password, and the company had the employee arrested for attempted destruction of property

To better understand the technical issues of ownership and access control, it helps to discuss the business (and philosophical) aspects first.

Access Control

Understanding access control is paramount if you strive to be an effective system administrator, especially a DBA who is mandated to protect information assets of the company For many enterprises today, information is all they own; it’s certainly more valuable than what can be carried off on the back of a truck.

For the most part, the system administrator implements access control as directed by senior management, such as change control boards (CCBs) and security councils. For example, you’ll get a request to allow an accounts payable clerk named Sally to access a certain database and to give her certain abilities in that database, such as the ability to update information. However, the levels of access control and access control management are the domain of the system administrator. In other words, you are expected to know how to implement the controls, and screwing up could cost you your job.

At all change control meetings I have attended over the years, I have seldom been told to “provide access to Sally but make sure she cannot write checks to herself.” You are expected to make sure that Sally cannot write checks to herself, but you also need to ensure that she can manage accounts payable without hindrance. Remember, you are being held in the highest of trust and being paid well in return (I have heard more about CPAs embezzling from the company than about system administrators or DBAs doing so).

There’s a lot of power in the hands of the DBA-a good reason why effective (and secure) security plans should be implemented and followed to the letter. But don’t go anywhere; you will be getting a heads-up on that in this chapter whether you like it or not.

Access is not a technical requirement; rather, it derives from business policy and rules. So you need to put aside your technical cravings and consider some basic database access control management. (In a large company, though, you’ll probably not need to make or manage access policy, just implement it.)

Levels of Security

You can divide access control policy into several classifications. I like to use what I learned in military intelligence, which classifies data according to its levels of sensitivity, using the labels confidential, restricted, secret, and top secret. Data that is sensitive, but not damaging, can be classified as confidential. Everyone in the enterprise is given access to this level. You are required to consider everything you learn about the business and its affairs as confidential. The drug testing procedures of a company could be considered confidential information.

The next level is restricted, which means that a subset of the members of the organization have access to information that could cause problems if the information became widely known. Restricted information could include short-term information, or information a company feels should be disseminated to a larger collection of its employees, such as the percentage increases for annual bonuses or product launch plans.

The definition of a secret has long caused aggressive debate. In a company, a secret is information shared by a very small number of employees (can you really share a secret with yourself?). For example, the discovery of a new formula might be considered secret, known perhaps only by the most senior executives and immediate product staff. Secret information is provided on a need-to-know basis. You are not given secret information unless you need it to perform your duties. And you need to be proven trustworthy first.

And finally we deal with the stuff of which spy novels are made, top secret. Information classified as top secret could be disastrous for a company if it were revealed. Such information could be corporate credit cards, bank accounts, digital signatures, cash assets, patent plans, blueprints, algorithms, information that could be used in the public trading of a corporation’s shares, and so on.

I consider the passwords of employees as top secret information. If an employee becomes loose with his or her password, he or she should be disciplined under rules that deal with the unauthorized divulgence of top secret information. This is especially important when SQL Server is providing access on the basis of Windows/Active Directory authentication alone. Figure 5–2 provides a different “view” on the restriction levels-a pyramid whose pinnacle represents the smallest segment of people that have access to the information and whose base represent the largest segment of people that have access to the information.

image from book
Figure 5–2: The pinnacle represents the smallest segment of people that have access to the information

How do you translate the level discussed here to databases? I consider any access to the data in the databases to be confidential, which means that everyone and anything that has access to the DBMS must meet requirements for access to confidential information. They need to be authenticated to the DBMS in some way that enforces the policy of the enterprise. This is achieved by having a user ID and a password, or some recognized means of obtaining trusted authentication and thus the trust of the DBMS.

While all data in the DBMS is considered confidential, certain areas of it could be considered more sensitive, upgraded to restricted, confidential, secret, and top secret. Looking at the table in Figure 5–3, we can easily identify certain columns as restricted, such as job descriptions (JobDesc); certain columns as secret, such as annual bonuses (AnnBon); and certain columns as top secret, such as drug tests (DrugTest).

image from book
Figure 5–3: Columns classified according to the sensitivity of the data

The safest means of protecting the top secret information would be to make DrugTest a stand-alone table in the database and restrict access to it accordingly, but SQL Server 2005 provides sufficient mechanisms (views, stored procedure or select permissions, and the like) to ensure that the drug test results are adequately protected. You might balk at the idea of labeling your information like a CIA or MI5 agent, but you need to think along these lines if you don’t want to end your career by “acts of omission.” We’ll get back to this later when we discuss the data security plan.

This brings us to another level in our discussion of access control: distinguishing between the protection of information by users or employees, and the protection of data by users or employees. I agree it borders on the integrity versus security issue; however, you will need to protect data used by members of trusted groups from accidental or even intentional damage. And often the lines of functionality will be blurred, because you will use the same constraints and restrictions to enforce both security policy and data integrity.

Pop Quiz: A breakdown in what at Los Alamos led to the risk of losing nuclear secrets, when two hard disks containing the data were found missing from the vault? Choose one of the following:

Data integrity?
Data security?
Both of the above?

Point taken?

Windows Security 101

The NT File System (NTFS) protects objects by verifying that a security principal has access to an object. The file system also verifies what level of access has been granted to a security principal, such as read, write, delete, and execute. Deny is also an access level, although it prevents access to the object. So how does the file system know who you are without having to reauthenticate you every time you need access to an object? And how does it know what level of access your authentication carries? On Windows domains, this is achieved using access tokens, security identifiers (SIDs), and access control lists (ACLs).

Access Tokens and the Security Identifier (SID)

An access token is a “badge of trust,” which acts as a proxy, assigned to you after you successfully authenticate to a Windows Server or Active Directory. The token is the means by which the security system grants you right of passage on the network or in and across the domain, although the passage and access to resources is carried out by the system on your behalf, by the token. A token’s right of passage is not open-ended or unlimited. It can be revoked by a security system, or the administrator, and is revoked when you log out.

You can compare the token to an access badge for passage in a protected area. The holder of the badge or token is given the right of free passage around the offices of the enterprise. On a Windows domain, it gives you the right of passage around the network or to connect to the database management system, which is SQL Server, no doubt.

The access token is the “token” by which your continued trust is measured and monitored while you are connected to the network. It is not a guarantee of access to systems and services, but it is used to verify access or that a trust authority, such as the domain administrator, trusts you. As you know, it is not a guarantee that you have access to SQL Server. It is the means you have to assert your right to access.

Every access token holds the SID for a security principal, which is assigned to the access token after you log in successfully. The SID is a unique value of variable length that is used to identify a security principal to the security subsystem, and ultimately to a resource such as SQL Server. The security principal or user account is also known as a trustee to the Windows kernel. When referencing a user or a group for security purposes, Windows refers to the SID rather than the user or group name.

Every object has an associated ACL that stores the SID and the level of access a security principal might have on an object. Whenever you or the network administrator provides or denies access to an object, the SID’s corresponding access level in the object’s ACL is also updated. The access token-SID-ACL model is illustrated in Figure 5–4.

image from book
Figure 5–4: The access token-SID-ACL model

The permission list controls exactly what level of access you have to the file system object. SQL Server 2005 operates in a similar fashion; however, there are some differences we will discuss later in this chapter.

Note

Do not confuse the SID with the object identifier or OID. The OID is a means of enumerating objects and identifying objects with a globally unique identifier (GUID), thus guaranteeing their uniqueness.

SIDs guarantee that an account-a security principal-and all its associated rights and permissions are unique. If you delete an account and then recreate it under the same name, you will find that all rights and permissions of the deleted account are removed. The old SID gets deleted with the original account. In this respect, if you delete an account, the login created in SQL Server will have to similarly be deleted because it will hold data related to the old SID. The old login will not work if you recreate the user account in Active Directory because the SIDs will be different.

When you create an account, the system also creates the SID and stores it in the security structures of Active Directory. The first part of the SID identifies the domain in which the SID was created. The second part is called the relative ID or RID and refers to the actual object created. The RID is thus relative to the domain.

When a user logs in to the computer or domain, the SID is retrieved from the Security Accounts Manager (SAM) database or Active Directory and placed in the user’s access token. From the moment of login, the SID is used in the access token, a form of user impersonation, to identify the user in all security-related actions and interactions.

Windows Server 2003, and services such as SQL Server and Exchange, use the SID for the following purposes:

To identify the object’s owner
To identify the object owner’s group
To identify the account in access-related activity

Special, well-known SIDs get created by the system during installation to identify builtin users and groups, such as the Administrator account and the Administrators security group. When a user logs in to the system as guest, the access token for that user will include the well-known SID for the guest group, which will restrict the user from doing damage or accessing objects that a user is not entitled to.

Trusted Access

Once a security principal, or an imposter, has gained access to a system, it becomes very difficult to get them out short of shutting down the system, denying access, or terminating their network connection. Authentication is thus a very critical area for operating system manufacturers, and they rightly invest a lot of money building software to adequately validate authentic users as quickly as possible, and to block access to villains.

No matter how clever are the authentication schemes used, if a person is able to obtain a valid user ID and password, and that person obtains an unfettered connection to the network login point, he or she will gain access to your system. (Biometrics prevents this, but more about that later.) It is then up to the security mechanisms on the network, access control, to keep the “perp” in check. Tools like auditing software and alerts can help, but often they raise the alarms too late.

The answer lies with Kerberos. Microsoft made a considerable investment, for Windows Server 2003, in the Kerberos protocol, which is an open standard that works on a system of tickets, based on the concept of shared secrets, and implemented using secret key cryptography. Sounds like spy novel stuff for us DBAs, but it is essential in the new world of digital nervous systems and distributed computing and data services. Here is a brief introduction to shared secrets.

Shared Secrets

The Kerberos protocol operates on the concept of a shared secret. The term “shared secret” refers to a secret known by a few entities (two or more). Take a gang of bank robbers about to rob a bank. You know the drill. One of the robbers is disguised as a guard, and all the other robbers are hiding in plain sight, as the bank’s customers. They are all sharing a secret; they know the guard is not for real and they are about to hold up the tellers. No one else knows this. In other words the secret is known only by the people about to rob the bank; the tellers have no idea their day is about to be ruined. But if all the gang members are disguised so well that they look like regular customers, how do they tell each other apart from the real customers? In this regard, each member of the gang can identify another member by confirming that the other person knows the secret. I can’t tell you what the secret is, because I am not a member of the gang…it could be a triple wink means “we have three seconds to get out” or something. Likewise, the shared secret in Kerberos is between Kerberos and the security principal (the human user or a device). The protocol guarantees that no other entity is able to learn the secret.

Here’s another analogy from the Digital Age: Two people engage in a steamy cyberspace love affair and engage each other in a private chat room every night. They need to be sure that their communications can be trusted, that they are truly from the other party, and not a spy or an investigator masquerading as the other party, lurking in the chat room. So in order to be sure that the partners are genuine, they both agree, offline, that something in the messages between them will confirm that the message can be trusted.

However if someone is analyzing the correspondence and looking for repeating word arrangements, it will not take them long to discover the secret key. For example, you cannot comment on the weather, or grandma’s cat, indefinitely before someone else catches on. On a network, using inspection software, it would take seconds to intercept a message and fool an authentication service into thinking that it is engaged in trusted communications.

So how do the two cyber-lovers devise a plan to be certain that their engagement is secure? The answer is to use secret key cryptography and encrypt the communications. They would both hold keys that can decrypt their messages. And the spy would not be able to read their steamy words. The key itself must naturally be kept secret.

For the plan to work, the secret key must be symmetric. This means that it must be capable of both encryption and decryption. In other words, as long as the two correspondents share a private secret key, they can encrypt their love letters at will and be sure that the other partner, and only the other partner, is able to decrypt it.

The practice of secret key cryptography is not new but goes back to before the Cold War days when insurgents perfected secret key techniques and cipher science. In fact, it is said that had not the allies seized the Enigma key from the Nazis and cracked the secret communications about their war plans, the swastika would be flying over a Germany that spans the world.

The implementation of Kerberos in Windows Server 2003 is fully compliant with the Internet Engineering Task Force’s (IETF) Kerberos v5. This standard, which was originally developed by MIT, is supported by many operating systems, which means that the authentication services in the Windows Server 2003 domain can be trusted in other Kerberos networks (known as realms), such as networks running Mac OS, Novell NetWare, UNIX, AIX, IRIX, and so forth.

I have written a lot about the Kerberos protocol in Windows Server 2003; and I consider it largely beyond the scope of this book to explore it in actual deployment detail. However, there is one exception that has to do with the Single Sign-on Initiative (delegation) and the Kerberos ticket granting service; I will return to that point later in this chapter.

At this juncture, you need to grasp that the widespread security philosophy used in most IT environments is that once a security principal has gained access to the network or a system, it becomes a trusted principal. This means that it is granted a certain level of trusted access.

But trusted access, like the levels of security earlier discussed, can be controlled. If you manage a system that is very difficult to break into, or that employs highly sophisticated authentication technology, such as scanning the retina in your eye or the rings of your teeth, would it make sense to control access in the same fashion at every door within the enterprise? It all depends on the secrets you are holding and the respect you have for the information and data that belong to the enterprise and its shareholders.

So now we have another security pyramid, illustrated in Figure 5–5, that we can use to illustrate the issue. At the pinnacle is the highest level of trusted access, requiring the system to demand the utmost in trust from the user. And at the base is the lowest level of trusted access. At the pinnacle, you will use all the security tools in the book to make sure that trusted access stays trusted, not compromised.

image from book
Figure 5–5: The base of the pyramid

Some systems, however, need facilities to allow even nontrusted access. SQL Server 2005 is one such system, and this brings us to an investigation of the trusted-nontrusted access mechanisms it deploys.