30.1 The Basics of Cryptography, Including Symmetric and Asymmetric Key Cryptography | HP-UX CSE(c) Official Study Guide and Desk Reference

A number of the tools and utilities that we look at in this module use encryption to keep private the data they are transmitting. Cryptography has been around since the Roman times; in fact, the first known cryptographic system was known as the Caesar cipher after Julius Caesar who used it to transmit orders to his generals without those orders being understood by his enemies when they captured the messages (and the accompanying messenger). Cryptography comes from the Greek crpyto- meaning hidden or secret and -graphy meaning writing; hence, cryptography is the art of secret writing . In today's world of electronic commerce, high finance, industrial espionage, and terrorism, the art of secret writing has become even more prevalent (see Figure 30-1). At the heart of all cryptographic schemes are two important components : the encryption salgorithm and the keys that lock and unlock the message. If we think of an analogy to describe the use of an encryption algorithm and an encryption key , a good analogy would be an old-fashioned safe. The combination lock is in effect the encryption algorithm ; you can study the mechanics of how the mechanism works, but without the combination itself (the encryption key ), the treasure held within remains elusive !

Figure 30-1. The art of secret writing.

As early as the fifth century BC, the Spartans used cryptographic devices to transmit military orders. Julius Caesar saw their potential early on and used them extensively. The famous Caesar-cipher that bears his name is simple yet effective. The encryption algorithm is a simple substitution algorithm. The encryption key is the clever part; without it, you don't know how letters are substituted. In the Caesar-cipher, it is a simple process of shifting the letters of the alphabet a certain number of places either up or down the alphabet. The example in Figure 30-1 shows a Caesar-cipher where the letters are shifted three places to the right, i.e., A D, B E, C F, and so on. As we can see (and this is still the case), even if we know the encryption algorithm , it is effectively useless without knowing the secret key . The ability to keep the secret key just that ( secret) was at the heart of code-breaking in Caesar's time as it was during the Second World War; recovering the code-book for the Enigma machine gave the allies a great insight into the transmissions used by the German military. Even though they had recovered an Enigma machine itself, it was useless without knowing the starting position of the internal rotors, i.e., the secret key(s) . These were documented in the much-prized codebook . An excellent book tracking the history and development of cryptography is The Code Book by Simon Singh. It is basically non-technical, but gives great insight into the past, present, and (possibly) future world of secret writing . At the end of this chapter, I mention Singh's book with some other, more technical books covering security and cryptography.

More and more applications are being written with the capability to encrypt and decrypt the information being dealt with. Several standards and products have emerged that have become common use in the world of cryptography. As far as trying to classify them in terms of the cryptographic techniques they employ , they fall into two main categories: private key (or symmetric key; we both use the same secret key) cryptography and public key (or asymmetric key; we use different keys) cryptography.

In an ideal cryptographic world, we would use private, symmetric key cryptography where we both use the same key to lock and unlock each communication. This poses a specific problem: How do we exchange secret keys in such a manner that only the sender and the receiver know what the keys are? A simple solution would be for you and me to meet down at the pub every Friday, have a few beers, agree on the secret key for that week, and then go back to the office (obviously, having a few more beers beforehand). The limitations of this are that the channel of communication is not entirely secure ( especially after even more beers) because we may be overheard or spied on in the pub. There is also the problem that if we don't regularly renew our secret keys, a spy could be working in the background trying every possible permutation of our secret key. The spy probably has an idea of the challenge he is facing in trying to guess our secret key because in today's climate of international communications and commerce, government export legislation dictates that the encryption algorithms we use are well known and publicly available. That's okay, because we know that just having the algorithm is not enough to crack an encrypted message; you need the key as well. The other limitation that certain government agencies impose is the size of the key we can use. The larger the key, the more possible permutations it can take and, hence, it's more difficult to break. In our case, the spy in the pub will need longer to guess and try each permutation of the possible keys we could be using. If the key is large enough, this may take hours, days, weeks, or even years . A classic example of this is a challenge posed by industry leaders RSA Data Security. In August 1977, an article in Scientific American by Martin Gardner explained public key cryptography and how RSA works. Gardner then posed a challenge whereby he published a 129-digit number (-10 ¹²⁹ ). This number was the product of two very large prime numbers (factoring large prime numbers is an important part of today's encryption tools). It took 17 years for 600 volunteers working in parallel to find the two factors. Today's encryption technologies use larger keys than Gardner used. It is estimated that even using a powerful PC or workstation, it would take more years than the universe has existed to break the code of current encryption technologies (encryption keys of the order of 10 ³⁰⁰ are not uncommon). If you think you are up to the challenge, the RSA-160 (160-bit number) was solved in 2003 using a combination of 32 R12000 and 72 Alpha EV67 workstations. The next challenge is to factor a 576-bit (RSA-576) number. If you are feeling up to it, then browse http://www.rsasecurity.com/rsalabs/challenges/factoring/index.html for more details.

Back to meeting down at the pub for a few beers.

We could continue with our regular meetings, have a few beers, and decide on our secret key. One obvious problem is that if I want to communicate with someone else, I will need to arrange another meeting to decide on a secret key for communications with that person. You can see where this scenario is leading. My company has many customers and I want to establish secure communications with all my customers. I will need to establish a secret key for each customer. This is going to take lots of meetings and lots of beer drinking! Then we have to consider that we need to renew our secret keys on a regular basis. While I will accrue vast frequent-flyer points and gain a voluminous knowledge of the world's watering holes, it becomes impractical to maintain secret keys in such a manner (my liver might have something to say about it as well). The problem of key distribution plagued the cryptography world for years. This is where public key or asymmetric key cryptography comes in. The idea here is to have two keys that are different (hence, asymmetric), but both can be used to unlock a message. The beauty of this is in the mathematics involved. Different public key encryption algorithms use different mathematical gymnastics. Many of them are based on modular arithmetic and/or factoring large primes. I won't bore you with the details, but here is a simple example (used in many of the books listed at the end of this chapter) of how a public key system works:

Alice and Bob want to communicate securely.
Alice and Bob decide to use a one-way encryption algorithm that uses large prime numbers.
Alice chooses two large prime numbers that we will call p _A and q _A .
Alice multiples p _A and q _A to give N _A .
Alice keeps p _A and q _A private; together they form Alice's private key (Alice _private ).
Alice publishes N _A ; this becomes Alice's public key (Alice _public ).
Bob does the same, giving p _B , q _B (Bob _private ), and N _B (Bob _public ).
When Bob wants to send a message to Alice, he uses Alice's public key N _A to encrypt his message, knowing that to decrypt the message, Alice will either have to know her private k ey or be able to factor two very large prime numbers, which as we saw earlier with the RSA challenge is not easy.
To read the message, Alice inputs the encrypted message into the encryption algorithm along with her private key. The result is the decrypted message.
If Eve intercepts the message, she would have to perform a key search of every possible prime number that is a factor of N _A . Using sufficiently large prime numbers makes this system effectively impregnable.
An important aspect of this design is how to store and have access to public keys. What we need is a trusted intermediary . This could be an organization on the Internet (Verisign is one such company) that will securely store and distribute our public keys. In order to store a public key with them, you will be required to jump through various legal hoops (using notaries, and so on) to prove that you are who you say you are. This is a good thing, because we don't want Eve to update or delete Alice's key. This service is known either as a Key Distribution Center (KDC) or a Certification Authority (CA); a certificate is a signed document with Alice's name and Alice's public key. The differences between a KDC and a CA are not important at this time. If you are interested, have a look at the Kaufman, Perlman, and Speciner book, page 188, listed at the end of this chapter.

This system affords us confidentiality, which is an extremely important goal of this system. With a slight tweak, we can also get it to provide more: authenticity, authentication, and non- repudiation .

The need for these additional benefits arises from the fact that Eve is a bit of a nasty character. Eve intercepts a message, realizes that it's encrypted, and decides to mess things up by randomly changing some characters before forwarding it to Alice. Because the encrypted message no longer reflects the original message sent by Bob, when Alice decrypts the message, it decrypts to complete garbage. Remarkably, if Eve was either really clever or just plain lucky, she may have changed just the right character turning a 100 invoice into a & pound ;1,000,000 invoice. What we need is some mechanism to provide authenticity, i.e., confidence that the message has not been tampered with. This is where a message digest comes in. A message digest uses a hashing algorithm that produces a unique number related to the message. It is the same concept as a checksum or a message integrity check (MIC). There's more exceedingly clever mathematical gymnastics here because a well-designed hashing algorithm ensures that only one message will produce a particular message digest . This means that we can safely publish which message digest hashing algorithm we are going to use, confident that if a message is tampered with, the hashing algorithm will produce a different message digest indicating to Alice that the message has been tampered with. When Bob comes to send a message to Alice, he can first calculate the message digest and then append it to the message before encrypting it and sending everything to Alice. When Alice receives the message and decrypts it using her private key, she can run the message through the hashing algorithm to calculate the message digest . If the result is the same as the message digest appended by Bob, we know the message has not been tampered with: we have achieved authenticity . The last two attributes of our system that we want to achieve are authentication and non-repudiation. Authentication is fairly easy to understand: How do we know the message we received was actually sent by Bob? If you think about it, Eve (the bitter and twisted one in this relationship) could create what looks like a valid invoice from Bob (she receives invoices from Bob as well, so she could use her invoice as a template), calculate a valid message digest, obtain Alice's public key, encrypt the message, and send it to Alice. How does Alice know it actually came from Bob? She doesn't! What we need is for Bob to sign the invoice. We are thinking of a digital signature that will identify the message as coming from Bob. A digital signature works under the same principle as a written signature; it identifies the person who sent the document. Unlike written signatures, digital signatures can't be forged. How can we be so sure? Taking the hashing algorithm we talked about earlier, it produces a unique message digest that uniquely identifies a message. We also saw the benefit of using large prime numbers and the difficulty in factoring them. What if we used a hashing algorithm that used large prime numbers as part of the hashing algorithm? We already have some large prime numbers at our disposal: Bob's public and private key. Making use of similar mathematical gymnastics, the hashing algorithm can produce a unique message digest based on Bob's private key . Bob has effectively signed the invoice. The only difference between a message digest and a digital signature is that a message digest can be produced by anyone (Eve's hatching a plot with that in mind) while only one person can produce a digital signature because it is based on an individual's private key. When Alice decrypts the message, she obtains Bob's public key and uses the hashing algorithm to produce a message digest. If this compares to Bob's signature, she knows that the message has not been tampered with ( authenticity ) and that it definitely came from Bob ( authentication ). As a consequence of Bob signing the invoice, we have also achieved non-repudiation. Non-repudiation is the act of not being able to deny that you sent a message. By signing the invoice, Bob cannot deny that he sent it because only his private key could have produced that particular digital signature .

Some systems encrypt the message first and then sign it. Whichever, the idea is the same. Figure 30-2 gives you an idea of what's happening.

Figure 30-2. Public key cryptography.

You may think that all this clever stuff has emerged only in the recent past with the boom in the Internet and electronic commerce. The history of public key cryptography goes back many years. Many people credit public key cryptography to three colleagues working out of Stanford University: Whitfield Diffie, Martin Hellman, and Ralph Merkle. You may have heard of a crypto-system called Diffie-Hellman, which is a system of establishing secret keys over an insecure medium, i.e., our pub scenario. They were certainly some of the earliest conquerors of the elusive prize of a one-way encryption algorithm that can use two separate but related keys to lock and unlock messages. Three separate individuals working out of MIT ”Ronald Rivest, Adi Shamir, and Leonard Adleman ”took the ideas of Diffie, Hellman, and Merkle and produced an encryption system known as RSA (named after their initials ), which has been credited as one of the best (and most widely used) implementations of public key cryptography. While these accolades adorn our cousins from across the pond, it has come to light that three Britons had cracked public key cryptography some time sooner. In 1969, James Ellis was looking into the problem of key distribution while working for the Government Communications Headquarters (GCHQ). By 1975, James Ellis, Clifford Cocks, and Malcolm Williamson had pretty much cracked public key cryptography. Being Government employees working in a top secret spy center meant that they had to sit in silence because all their work is classified as top secret . In typical British spirit, they took on a stiff-upper-lip attitude and watched while someone else took all the glory for what they did for Queen and Country . Both teams were working completely independently, but because of the secret nature of their work, the British team couldn't discuss their findings with their American colleagues. The American researchers were not limited by any Official Secrets Act and were free to publish and discuss their work openly. Only recently has the British Government declassified some of the team's work, allowing them to talk publicly about their work surrounding public key cryptography.

Current encryption technologies use clever mathematics tricks utilizing large encryption keys. However, there is a downside to using strong, complex algorithms and large encryption keys, and that's speed. Today's applications require that the use of encryption should not adversely impact their usability. We all want something for nothing, but it seldom works out that way. In order to minimize the impact of these additional requirements, we have seen the introduction of advances in processor design that include features specific to encryption. Advances in the speed and intelligence of computer hardware means that we can use cryptographic technologies to secure our day-to-day communications with friends , colleagues, and business partners . We should also remember that less scrupulous individuals and organizations could use the same advances in technology in the pursuit of breaking encryption schemes. Followers of the dark side have an arsenal of technological and mathematical tricks up their sleeves and are constantly looking for new ways to break existing cryptographic schemes. Most people view these rogues as the bad guys . Yes, there are deviants out there who are hell-bent on making our lives as uncomfortable and as miserable as possible. On the upside, their determined efforts ensure that we ( the good guys ) are working ever harder to stay one step ahead. This necessity to keep advancing the boundaries of cryptographic systems is motivated by something known as the Fundamental Tenet of Cryptography : If lots of smart people have failed to solve a problem, then it probably won't be solved (soon) . We could rephrase it to something like this: If none of the good OR the bad guys has broken the code already, then it's probably secure (for the time being). If anyone out there has solved the age-old problem of factoring huge prime numbers, there's a rosy and lucrative future ahead of you. The strength of our cryptographic system can be summarized as follows :

How secret are our secret keys? Are we using a secure communication channel to transmit these secrets, or can someone eavesdrop on our conversations?
Are there any backdoors into our system? Can you insert a magic number that always decrypts a message? If you know which pub we meet at, could you hijack one or both of us?
Is the encryption algorithm strong enough on its own, or can it be reversed ? Factoring large (1024-bit) prime numbers is fairly difficult, so the time it takes to reverse the algorithm could be better used at guessing the keys themselves .
Does the spy have any previous knowledge of our transmissions that could be used to assist in the breaking of the code? If the spy can encrypt such a message, the resulting cyphertext may give him insights into the possible keys used to produce such cyphertext (a known plain-text attack).
What is the likelihood of someone guessing our secret key quickly? A key search (using a powerful workstation) of all possible permutations of a 1024-bit key would take more years than the current lifetime of the universe to guess!

Back to our discussions involving Alice and Bob. We established a mechanism using public key cryptography that allowed Alice and Bob to communicate with the confidence that they were doing so in privacy and with the assurance that their messages were being received intact and the confidence of knowing who the sender and receiver are. When we consider the content of Alice and Bob's message, encrypting a large message using public key cryptography is a time-consuming business. Not only is the encryption of large volumes of data time-consuming, but so is the calculation of a digital signature, which is a cornerstone in building confidence in such a system. One solution is to not use asymmetric keys to encrypt and sign large communications. This is not acceptable. The solution goes back to our original requirement leading to you and me meeting down at the pub for a few beers and deciding on a secret key for the forthcoming week. We should remember that in an ideal cryptographic world, we would use private, symmetric keys. The key itself is not a large volume of data. Where we could use public key cryptography is in establishing a private key. Essentially , the content of the message sent between Alice and Bob is not the real message; the content of the message is simply a private key. This is commonly referred to as a session key that can be renewed whenever Alice and Bob feel it is necessary. An example of this is IPSec (more on that later), which uses a variant (it uses public keys as a means of authentication) of the Diffie-Hellman crypto-system to establish a session key, which is subsequently used to encrypt/decrypt IP traffic between two nodes. Some encryption products may talk about providing Perfect Forward Secrecy (PFS), which essentially means that we establish a new secret key for every exchange between Alice and Bob. This is computationally expensive, but it does mean that if a single secret key is discovered , only that one transaction might be compromised.

Finally, if we are going to implement crypto-systems like the ones discussed above, it is important that we have a means of producing good random numbers. When we say good random numbers, we mean that the numbers produced are statistically random and don't follow any discernable pattern. This may seem a little obtuse, but in early implementations of Netscape, some hackers found that the random number generator used in the browser was rather predictable. There is a whole science behind random numbers. Computers are rather predictable devices and need some help (via really clever algorithms) to produce a sequence of numbers that is truly random. Random numbers are commonly used in crypto-systems and form an important part in producing private and public keys that are difficult to guess. If our random number generator is more predictable, hackers can gain an insight as to how our keys were produced, which may give them a fast track into breaking our keys. HP has produced a product called the Strong Random Number Generator. It is free to download from http://software.hp.com Security and Manageability, but it is only available for HP-UX 11i.

We have looked at the basic ideas surrounding encryption/decryption as well as a little history lesson. Please refer to the books listed at the end of this chapter for more technical and historical detail. We now look at some tools that utilize public, private, or both types of encryption.