4.1. Introduction to Encryption

Let's suppose that you carry your laptop home from work every day, bring it back to the office the following morning, and then tether it to your desk with a locking cable protected by a combination lock. You know how important it is to remember the lock combination, don't you? If you ever forget it, your laptop will end up married to your desk until you pry it free by cutting the cable.

Maybe you remember numbers easily, but I don't. It's hard enough for me to remember my own telephone number, let alone the plethora of secret numbers in my life: my Social Security number, bank account PIN, voice mail password, and anniversary (oops!). To make things easier, I have devised an ingenious method for remembering that lock combination: I have written the code on a label and put that label on the lock itself! And now you must be wondering whether you could ever trust me with anything that needs to be kept secure!

Like the rest of humanity, I have a brain that is part hard drive (disk) and part random-access memory (RAM), and numbers seem to go into RAM more often than not. After a period of usage, the numbers are conveniently aged out to make room for more (not unlike the System Global Area of an Oracle instance) and are forgotten. In computers, this process is expected and is built into the design.

Database systems are designed to store information and make it accessible to users when asked. Historically, the assumption has been that users who demand access will already have been authenticated to establish that they are who they claim to be. The mere storage of sensitive information, therefore, has not been considered a potential security breach. That may have been true at one time, but not today, with intruders seemingly everywhere: they may be curiosity seekers; they may be planning to sell account data to your competitors; or they may be seeking to disrupt your system for revenge.
The attack might come from outside, via the Internet, or from inside your organization. (Indeed, research shows that most hacking does come from within.) As countless security breaches have shown, sensitive data clearly needs to be protected from anyone not authorized to see that data. What options does Oracle provide for that protection?

Pan back to my lock combination: it's 3451. Not being a complete idiot, I don't write that number on my lock. Instead, I have a secret number that I always remember, 6754, and using this number I modify the lock combination by adding the corresponding digits:

    3 + 6 = 9
    4 + 7 = 11
    5 + 5 = 10
    1 + 4 = 5

The resulting numbers are 9, 11, 10, and 5. In my scheme, I use only single-digit numbers, so I wrap the double-digit numbers around the number 10; hence, 10 becomes 0, 11 becomes 1, and so on. Using my secret key 6754, I have transformed the number 3451 into 9105. It's the latter number that I write on the combination lock, not the actual code. If I forget the combination, I can read that number and use my magic number 6754 to reverse the logic I applied earlier (subtracting each key digit and adding 10 whenever the result is negative), which gives me back 3451, the number that opens the lock. The number 9105 is there for the whole world to see, but a thief still won't be able to open the lock unless he also knows the key, 6754.

In this way, I have encrypted the number represented by my lock combination. The number 6754 is the key to the encryption process. The type of encryption I've performed here is known as symmetric encryption because the same key is used to encrypt and decrypt. (In contrast, with asymmetric encryption, described later in this chapter, there are two distinct keys: a public key and a private key.) The logic I described to encrypt the code is a very simplistic implementation of an encryption algorithm.

4.1.1. Encryption Components

Let's summarize what we have learned so far. An encryption system has several basic components, as shown in Figure 4-1.
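The lock-combination scheme described above can be sketched in a few lines of Python. This is only an illustration of the digit-by-digit arithmetic from the text; the function names `encrypt` and `decrypt` are made up for this sketch, and nothing here reflects how Oracle's packages work.

```python
def encrypt(combination: str, key: str) -> str:
    """Add each key digit to the matching combination digit, keeping only the last digit."""
    if len(combination) != len(key):
        raise ValueError("combination and key must have the same number of digits")
    return "".join(str((int(c) + int(k)) % 10) for c, k in zip(combination, key))


def decrypt(ciphertext: str, key: str) -> str:
    """Reverse the addition: subtract each key digit, wrapping negative results around 10."""
    if len(ciphertext) != len(key):
        raise ValueError("ciphertext and key must have the same number of digits")
    return "".join(str((int(c) - int(k)) % 10) for c, k in zip(ciphertext, key))


if __name__ == "__main__":
    written_on_lock = encrypt("3451", "6754")
    print(written_on_lock)                   # 9105
    print(decrypt(written_on_lock, "6754"))  # 3451
```

Because the same key, 6754, drives both functions, this is a symmetric scheme in the sense used in the text.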
Figure 4-1. Symmetric encryption components

Let's assume that a thief intent on stealing my laptop is trying to open the lock. What does she need in order to succeed? First, she has to know the algorithm; let's assume here that she knows it, perhaps because I boasted about my cleverness at work, or she read this book, or this algorithm is public knowledge. Second, she needs to learn the key. That is something I can protect. Even if the thief knows about the algorithm, I can still hide the key effectively. But as there are only four digits in the key, the thief needs at most 10^4, or 10,000, attempts to guess it. And because every key is equally likely to be the right one, she can expect, on average, to find it after about 5,000 attempts.

Can she do it? In this case, the thief will have to manually turn the wheels of the combination lock about 5,000 times. That's daunting, but theoretically possible. Suddenly, I don't feel so secure anymore. What are the ways that I can protect my lock combination?
The first option is impossible if I am using a publicly known algorithm. I could develop my own, but the time and effort may not be worth it. It might later be found out anyway, and changing an algorithm is a very difficult task. That rules out the third option, too, leaving the second option as the only viable one.

4.1.2. The Effects of Key Length

My lock combination is the digital equivalent of sensitive data. If an intruder wants to crack the encrypted key, 10,000 iterations to guess the code is trivial: he'll be able to crack it in under a second. What if I use an alphanumeric key instead of an all-numeric one? That gives 36 possible values for each character of the key, so the intruder will have to guess up to 36^4, or 1,679,616, combinations. That is more difficult than 10,000, but still not beyond reach. The key must be strengthened, or "hardened," by making it longer than four characters. Table 4-1 shows how the maximum number of guesses required grows as the key length increases. Therefore, the secret to hardening the key is to increase its length.
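The arithmetic behind these figures is simple exponentiation, and a short Python sketch makes the growth easy to see. The helper names `keyspace` and `average_guesses` are hypothetical, and the key lengths printed in the loop are chosen for illustration only; they are not a reproduction of Table 4-1.

```python
def keyspace(symbols_per_position: int, key_length: int) -> int:
    """Maximum number of guesses: every position can take any of the allowed symbols."""
    return symbols_per_position ** key_length


def average_guesses(symbols_per_position: int, key_length: int) -> float:
    """Expected guesses when every key is equally likely and no guess is repeated."""
    return (keyspace(symbols_per_position, key_length) + 1) / 2


# Compare an all-numeric key (10 symbols) with an alphanumeric key (36 symbols).
for length in (4, 6, 8, 10):
    print(f"{length:>2} characters: numeric {keyspace(10, length):,}"
          f"  alphanumeric {keyspace(36, length):,}")
```

Each extra character multiplies the number of possible keys by the size of the alphabet, which is why lengthening the key pays off so quickly.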
Remember that computers think in terms of bits and bytes (i.e., binary numbers), not alphanumeric characters. The only possible values of a key position are 0 and 1, so a 10-bit key allows only 2^10, or 1,024, combinations, an extremely easy number to handle. Practically speaking, a key must be much longer. The length of a key is described in bits, so a key of 64 binary digits is said to be a 64-bit key. Table 4-2 shows the relationship between key length and the number of guesses required for a binary key.
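To get a feel for what these binary key lengths mean in practice, the following Python sketch estimates how long an exhaustive search would take. The guess rate of one billion keys per second is purely an assumption made for this illustration, and the function names are hypothetical.

```python
GUESSES_PER_SECOND = 1_000_000_000   # assumed attacker speed, for illustration only
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def binary_keyspace(bits: int) -> int:
    """Number of distinct keys of the given length in bits."""
    return 2 ** bits


def years_to_exhaust(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try every key at the assumed guess rate."""
    return binary_keyspace(bits) / (rate * SECONDS_PER_YEAR)


for bits in (10, 56, 64, 128):
    print(f"{bits:>3}-bit key: {binary_keyspace(bits):.3e} keys, "
          f"about {years_to_exhaust(bits):.3e} years to try them all")
```

Every additional bit doubles the number of keys, so the search time doubles as well.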
The longer the key, the more difficult it is to crack the encryption. But longer keys also extend the elapsed time needed to do encryption and decryption, as the CPU has to do more work. In designing an encryption infrastructure, you may need to strike a compromise between security (a longer key) and performance.

4.1.3. Symmetric Encryption Versus Asymmetric Encryption

In the earlier example, the same key is used to encrypt and decrypt. As I mentioned, this type of encryption is known as symmetric encryption. There is an inherent problem with this type of encryption: because the same key must be used to decrypt the data, the key must be made known to the recipient. The key, which is generally referred to as the secret key, either has to be known by the recipient before she receives the encrypted data (i.e., there needs to be a "knowledge-sharing agreement") or has to be sent as a part of the data transmission. For data at rest (on disk), the key will have to be stored as a part of the database in order for an application to decrypt it. There are obvious risks in this situation. A key that is being transmitted may be intercepted by an intruder, and a key that is stored in the database may be stolen.

To address this problem, another type of encryption is often used, one in which the key used to encrypt is different from the one used to decrypt. Because the keys differ, this is known as asymmetric encryption. Because two keys are generated, a public key and a private key, it is also known as public-key encryption. The public key, which is required for the encryption, is made known to the sender and, in fact, can be freely shared. The other key, the private key, is used only to decrypt the data encrypted by the public key and must be kept secret.

Let's see how public-key encryption might work in real life. As shown in Figure 4-2, John (on the left) is expecting a message from Jane (on the right). Here are the steps in the encryption process:
Note carefully here that there is no exchange of decryption keys between the parties. The public key is sent to the sender, but because it is not what is needed to decrypt the value, its theft does not pose a threat. However, you should be aware of the effect of spoofing or phishing here, which can render this process of data encryption insecure. Here is a scenario:
Scary? Of course. So, what's the solution? The solution is to somehow verify the authenticity of the public key and confirm that it really comes from the intended recipient. This can be done using a fingerprint match. The topic is beyond the scope of this book, but essentially, when Jane encrypts with the public key, she checks the fingerprint of the key to make sure the key does indeed belong to John. (This discussion also highlights how the communication lines between the source and the destination must be highly secure.)

The key used to encrypt is not the key used to decrypt, so how does the decryption process know the key used during the encryption process? Recall that both keys are generated at the same time by the receiver, which ensures that there is a mathematical relationship between them. One is simply the inverse of the other: whatever one does, the other simply undoes it. The decryption process can therefore decipher the value without knowing the encryption key.

Because public and private keys are mathematically related, it is theoretically possible to guess the private key from the public key, although it is a rather laborious process that requires factoring an extremely large number. So, to reduce the risk of brute-force guessing, very high key lengths are used, typically 1,024-bit keys, instead of the 56-, 64-, 128-, or 256-bit keys used in symmetric encryption. Note that 1,024 bits is a typical length, not a requirement; keys of shorter lengths are also used.

Oracle provides asymmetric encryption at two points:
Both of these functions require use of Oracle's Advanced Security Option (ASO), an extra-cost option that is not provided by default. That tool simply enables asymmetric key encryption on those functions; it does not provide a simple ready-to-use interface that you can use to build a data-at-rest encryption solution. The only developer-oriented encryption tools freely available in Oracle provide for symmetric encryption. For this reason, I focus on symmetric encryption, not asymmetric encryption, in this chapter.
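Although the rest of the chapter stays with symmetric encryption, the "one key undoes the other" relationship described in this section is easy to demonstrate. The Python sketch below uses textbook RSA with tiny primes; the source does not name a specific algorithm, so RSA is an assumption made here, the function names are hypothetical, and numbers this small offer no security whatsoever. It requires Python 3.8 or later for the modular-inverse form of `pow`.

```python
from math import gcd


def make_keypair(p: int, q: int, e: int):
    """Build a toy RSA key pair from two small primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("public exponent must be coprime with (p-1)*(q-1)")
    d = pow(e, -1, phi)          # private exponent: modular inverse of e
    return (e, n), (d, n)        # (public key, private key)


def apply_key(value: int, key) -> int:
    """Encrypt or decrypt: raise the value to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


# John generates both keys and gives only the public key to Jane.
public_key, private_key = make_keypair(61, 53, 17)

message = 65                                    # must be smaller than n = 3233
ciphertext = apply_key(message, public_key)     # Jane encrypts with the public key
recovered = apply_key(ciphertext, private_key)  # John decrypts with the private key
print(ciphertext, recovered)
```

Recovering the private exponent from the public key here only requires factoring 3233 into 61 and 53, which is trivial; with keys hundreds of digits long, that factoring step is what makes the attack laborious.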
4.1.4. Encryption Algorithms

There are many widely used and commercially available encryption algorithms, but we'll focus here on the symmetric key algorithms supported by Oracle for use in PL/SQL applications. The DES and Triple DES algorithms are supported by both of Oracle's built-in encryption packages, DBMS_CRYPTO and DBMS_OBFUSCATION_TOOLKIT; however, only DBMS_CRYPTO, introduced in Oracle Database 10g Release 1, supports AES.
Later in this chapter, I'll show how you can use these algorithms by specifying options or selecting constants in Oracle's built-in packages.

4.1.5. Padding and Chaining

When a piece of data is encrypted, it is not encrypted as a whole by the algorithm. It's usually broken into chunks of eight bytes each (the block size of DES and Triple DES; AES works on 16-byte blocks), and then each chunk is operated on independently. Of course, the length of the data may not be an exact multiple of eight; in such a case, the algorithm adds some characters to the last chunk to make it exactly eight bytes long. This process is known as padding. The padding also has to be done right so an attacker won't be able to figure out what was padded and then guess the key from there. To securely pad the values, you can use a predeveloped padding method available in Oracle, defined in Public-Key Cryptography Standards #5 (PKCS#5). There are several other padding options that allow for padding with zeros and for no padding at all. Later in this chapter, I'll show how you can use padding by specifying options or selecting constants in Oracle's built-in packages.

When data is divided into chunks, there needs to be a way to connect those chunks back together, a process known as chaining. The overall security of an encryption system depends upon how chunks are connected and encrypted: independently or in conjunction with the adjacent chunks. Oracle supports the following chaining methods:
Later in this chapter, I'll show how you can use these methods by specifying options or selecting constants in Oracle's built-in packages.
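To make padding and chaining more concrete before turning to Oracle's packages, here is a rough Python sketch. It implements PKCS#5-style padding for eight-byte chunks and contrasts two widely used chaining modes: ECB, in which each chunk is encrypted independently, and CBC, in which each chunk is combined with the previous encrypted chunk. The function `toy_block_encrypt` is a deliberately trivial stand-in invented for this sketch, not DES, AES, or anything Oracle uses, and all names and sample values are hypothetical.

```python
BLOCK_SIZE = 8  # bytes per chunk, as in the text


def pkcs5_pad(data: bytes) -> bytes:
    """Append N bytes, each of value N, so the length becomes a multiple of 8."""
    pad_len = BLOCK_SIZE - len(data) % BLOCK_SIZE   # always 1..8, never 0
    return data + bytes([pad_len]) * pad_len


def pkcs5_unpad(padded: bytes) -> bytes:
    """Strip the padding after checking that it is well formed."""
    if not padded or len(padded) % BLOCK_SIZE:
        raise ValueError("length is not a positive multiple of the block size")
    pad_len = padded[-1]
    if not 1 <= pad_len <= BLOCK_SIZE or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(block: bytes, key: bytes) -> bytes:
    """NOT a real cipher: adds the key bytes and rotates, purely to show chaining."""
    mixed = bytes((b + k) % 256 for b, k in zip(block, key))
    return mixed[1:] + mixed[:1]


def encrypt_ecb(plaintext: bytes, key: bytes) -> bytes:
    """Each chunk is encrypted on its own."""
    return b"".join(toy_block_encrypt(b, key) for b in split_blocks(pkcs5_pad(plaintext)))


def encrypt_cbc(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    """Each chunk is XORed with the previous encrypted chunk before encryption."""
    previous, out = iv, []
    for block in split_blocks(pkcs5_pad(plaintext)):
        previous = toy_block_encrypt(xor_bytes(block, previous), key)
        out.append(previous)
    return b"".join(out)


key, iv = b"k3y!k3y!", bytes(range(1, 9))
repeated = b"SAMEDATA" * 2                       # two identical eight-byte chunks
ecb_blocks = split_blocks(encrypt_ecb(repeated, key))
cbc_blocks = split_blocks(encrypt_cbc(repeated, key, iv))
print(ecb_blocks[0] == ecb_blocks[1])            # True: the repetition shows through
print(cbc_blocks[0] == cbc_blocks[1])            # False: chaining hides it
```

Note that data whose length is already a multiple of eight still receives a full extra chunk of padding, which is what lets the unpadding step work unambiguously.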