Understanding Encryption Techniques and Technologies | Microsoft Visual C++ .NET 2003 Kick Start

Encryption has long been considered an advanced topic, but the .NET Framework makes it remarkably simple to encrypt and decrypt data within your applications. This is especially important for mobile code, which can rely on passwords or other information that is stored within your code or within a configuration file. You might also want to prevent hand-editing of documents saved by your application; this is especially true of applications used for financial purposes. Imagine if the tax collectors gave you an application to calculate your own taxes, and you could lower your tax owing with Notepad!

Although it's simple to use cryptography (the science of ciphers, codes, encryption, and decryption) in a .NET application, it's important to understand some of the principles before you begin. Some techniques can be overkill for your application; others can be useless if you don't use them in the correct combination.

Ciphers, Plaintext, and Ciphertext

Before you start adding encryption to your application, there are a few words you need to know:

A code replaces words with other words or with numbers; modern cryptography doesn't use codes much anymore
A cipher replaces a number with another number; because any piece of text can be represented as numbers (bytes), most modern cryptography techniques use ciphers
Plaintext is a string written in its original, plain form: a password or secret message
Ciphertext is the transformed, encrypted version of plaintext

Encryption is the process of transforming plaintext into ciphertext. If a "bad guy" gets the ciphertext, it can't be used against you. Of course, if the "bad guy" is able to transform the ciphertext back into plaintext, whatever it was you wanted to prevent is probably going to happen after all. Preventing decryption by interceptors is a big part of why encryption is used in applications, but it's not the only reason for encryption and decryption.
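To make the vocabulary concrete, here is a minimal sketch in standard C++ rather than .NET code. The function name xorTransform and the XOR scheme are assumptions made for illustration only: XOR with a repeating key is a toy cipher that is trivially breakable, and it is not one of the algorithms the .NET Framework provides.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy cipher for illustration only: XOR each byte of the input with a
// repeating key. Applying it twice with the same key restores the input.
// This is NOT secure and is not a .NET Framework algorithm.
std::string xorTransform(const std::string& input, const std::string& key)
{
    assert(!key.empty());  // an empty key would leave nothing to XOR with
    std::string output(input);
    for (std::size_t i = 0; i < output.size(); ++i) {
        output[i] = static_cast<char>(output[i] ^ key[i % key.size()]);
    }
    return output;
}
```

Passing a plaintext such as "my secret password" and a key through xorTransform produces ciphertext of the same length. Passing that ciphertext through the function again with the same key returns the original plaintext, which is exactly what an interceptor who learns the key could also do.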

Confidentiality, Integrity, and Authentication

There are three desires that are generally satisfied with the use of some form of cryptography: confidentiality, integrity, and authentication. To explain these terms, consider a conversation between two entities that is taking place in public: It could be a person ordering a book over the Internet, or two people passing notes in a high school classroom.

Confidentiality means that no eavesdropper can learn the sender's secret information.
Integrity means that the sender's information is not changed between leaving the sender and reaching the destination.
Authentication means that the destination is absolutely sure the message is from the sender.

These same three concepts apply to information that is not passed as a message in the usual sense. For example, if a user stores information on the hard drive, confidentiality addresses that user's worry that someone else could read the stored information. Integrity addresses the program developer's worry that a user could hand-edit the stored information without the checks and balances the application provides. Authentication addresses the issue of whether or not the user is indeed authorized to use the application, or to read or write files.

Symmetric and Asymmetric Cryptography

Cryptography is generally split into two branches: symmetric cryptography and asymmetric cryptography. In symmetric cryptography there is one key, and algorithms exist to encrypt or decrypt information using that key. The correspondents must agree on a shared secret key in some nonpublic manner, such as a private meeting, a disk in a courier envelope, or a telephone conversation. In asymmetric cryptography, each user has a two-part key: a public part known to everyone (including eavesdroppers) and a private part known to no one else (not even the destination of the message). The two parts undo each other: If the sender applies a private key to encrypt something, anyone can apply that sender's public key to decrypt it. If someone intends a message for a particular recipient, it can be encrypted with the recipient's public key; the recipient just applies the private key to decrypt it.
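The way the two parts of an asymmetric key undo each other can be seen with a toy RSA key pair. This is a sketch in standard C++, not .NET code, and the names (modPow, applyPublicKey, applyPrivateKey, and the constants) are hypothetical. The key is the familiar textbook example built from the primes 61 and 53, which is far too small to protect anything.

```cpp
#include <cassert>
#include <cstdint>

// Modular exponentiation: (base ^ exponent) mod modulus, by repeated squaring.
std::uint64_t modPow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus)
{
    assert(modulus > 1 && modulus < (1ULL << 32));  // keeps products within 64 bits
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) {
            result = (result * base) % modulus;
        }
        base = (base * base) % modulus;
        exponent >>= 1;
    }
    return result;
}

// Toy key pair (illustration only; real keys are hundreds of digits long).
const std::uint64_t kModulus = 3233;          // 61 * 53, used by both parts
const std::uint64_t kPublicExponent = 17;     // the public part
const std::uint64_t kPrivateExponent = 2753;  // the private part

std::uint64_t applyPublicKey(std::uint64_t value)
{
    return modPow(value, kPublicExponent, kModulus);
}

std::uint64_t applyPrivateKey(std::uint64_t value)
{
    return modPow(value, kPrivateExponent, kModulus);
}
```

With these numbers, applyPublicKey(65) gives 2790 and applyPrivateKey(2790) gives back 65. The reverse order works as well: applying the private key first and the public key second also returns the starting value, which is the property that authentication relies on.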

Consider the requirement for confidentiality. If two people have agreed on a symmetric key and encrypt everything they send, they can be sure that the key prevents outsiders from eavesdropping. But how can you agree on a secret key with an individual (or entity, such as an online store) with whom you have no nonpublic interaction? Asymmetric cryptography avoids that problem and can also assure confidentiality. Encrypting information with your own private key doesn't protect confidentiality, because anyone can decrypt it with your public one. But if the sender encrypts using the recipient's public key, confidentiality is assured: Only the recipient knows the private key.

Integrity can also be protected with both methods, and both rely on a related technology known as hashing. A hash takes a long and complex string or series of numbers and reduces it to a single number. A good hash does so in a way that comes close to guaranteeing that two different strings or series of numbers will never hash down to the same result. A cryptographically useful hash is also very difficult to invert: If you obtain the hash result, you can't use it to determine the original series of numbers.
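The idea of reducing any input to a single number can be sketched in a few lines of standard C++. The function below uses FNV-1a, a simple hash that is not cryptographically secure; it is substituted here purely for illustration, and the names toyHash and checkOneCharacterChange are hypothetical. Production code would use one of the cryptographic hash algorithms instead.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FNV-1a, a simple NON-cryptographic 64-bit hash, used only to show how
// input of any length is reduced to one number.
std::uint64_t toyHash(const std::string& data)
{
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 1099511628211ULL;                  // FNV prime
    }
    return hash;
}

// Self-check: changing a single character changes the resulting number.
void checkOneCharacterChange()
{
    assert(toyHash("pay 100") != toyHash("pay 900"));
}
```

The same input always produces the same number, and a one-character edit produces a different one. What this toy version lacks is the second property described above: a determined attacker can find other inputs with the same FNV-1a value, which is why it must not be used for security.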

How does a hash protect integrity? The sender sends not only the message, but a hash of the message as well. Any change to the message would mean that the hash value sent along would no longer match the hash from the new message. But a hash alone can't ensure integrity if the "bad guy" who intercepts and changes the message can calculate the hash of the altered message and send that along in place of the original hash. If the sender and recipient are using symmetric cryptography, the sender encrypts the message and its hash. The recipient decrypts them, determines what the hash should be, and compares it to the hash that was sent. Because the interceptor can't read the combined message, a replacement message can't be slipped in as before; although the interceptor might be able to change some bits in the message, the recipient will know right away that the message received is not what was sent. The same logic applies with asymmetric cryptography, except that the sender encrypts everything with the recipient's public key, knowing that only the recipient can decrypt it with the matching private key.

What about authentication? In symmetric cryptography, if you receive a message that makes sense when it is decrypted with the shared secret key, you are confident that it came from the only other person who knows that shared secret key. But that's not the case with the asymmetric approaches described so far. The sender encrypts the message with the recipient's public key, but everyone knows the recipient's public key; that means anyone can write a message to the recipient and claim it's from some other sender. Proving it's from a particular sender involves the sender's private key at some point.

You can include some simple text, such as the sender's name, encrypted with the sender's private key. The recipient would apply the sender's public key, and if the result is the sender's name, the message must have come from the sender. The problem with this is that the recipient can save the ciphertext (the sender's name encrypted with the sender's private key) and paste it into some future message, successfully posing as the sender. The text to be encrypted with the sender's private key has to be different for every message, so that it can't ever be re-used. The perfect candidate for this is the hash of the message. It's been calculated anyway, and it will be different for every message.
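Encrypting the hash with the sender's private key can be sketched as follows, again in standard C++ with toy components. Everything here is an assumption made for illustration: the FNV-1a hash stands in for a cryptographic hash, the key pair is the tiny textbook example with modulus 3233, and the names signMessage and verifySignature are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Modular exponentiation by repeated squaring.
std::uint64_t modPow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus)
{
    assert(modulus > 1 && modulus < (1ULL << 32));  // keeps products within 64 bits
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) {
            result = (result * base) % modulus;
        }
        base = (base * base) % modulus;
        exponent >>= 1;
    }
    return result;
}

// Non-cryptographic stand-in for a real hash (FNV-1a).
std::uint64_t toyHash(const std::string& data)
{
    std::uint64_t hash = 14695981039346656037ULL;
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 1099511628211ULL;
    }
    return hash;
}

// The sender's toy key pair (illustration only).
const std::uint64_t kModulus = 3233;
const std::uint64_t kSenderPublicExponent = 17;
const std::uint64_t kSenderPrivateExponent = 2753;

// Sender: reduce the hash to fit under the toy modulus, then apply the private key.
std::uint64_t signMessage(const std::string& message)
{
    return modPow(toyHash(message) % kModulus, kSenderPrivateExponent, kModulus);
}

// Recipient: apply the sender's public key and compare with a freshly computed hash.
bool verifySignature(const std::string& message, std::uint64_t signature)
{
    return modPow(signature, kSenderPublicExponent, kModulus) == toyHash(message) % kModulus;
}
```

A signature produced by signMessage verifies against the message it was computed for, and an altered signature does not. Pasting an old signature onto a different message normally fails too, because the hashes differ. With a modulus this small, two messages can occasionally share a reduced hash, which is one reason real keys and cryptographic hashes are so much larger.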

To summarize, in order for a symmetric cryptography user to create a message with confidentiality, integrity, and authentication, the sender first constructs a message, and then computes a hash. The sender then glues together the message and the hash, and encrypts the whole thing with the secret shared key. The recipient decrypts with the secret shared key and recomputes the hash for the message as received. If the hashes match, the recipient knows that this message is from the sender, that it has not been changed in transit, and that no one has learned anything from it in transit.
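The symmetric steps just summarized can be sketched in standard C++. This is a toy under stated assumptions, not the .NET approach: the XOR cipher and the FNV-1a hash are insecure stand-ins chosen to keep the example short, and the names sealMessage and unsealMessage are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Toy symmetric cipher: XOR with a repeating key (NOT secure).
std::string xorTransform(const std::string& input, const std::string& key)
{
    assert(!key.empty());
    std::string output(input);
    for (std::size_t i = 0; i < output.size(); ++i) {
        output[i] = static_cast<char>(output[i] ^ key[i % key.size()]);
    }
    return output;
}

// Non-cryptographic stand-in for a real hash (FNV-1a).
std::uint64_t toyHash(const std::string& data)
{
    std::uint64_t hash = 14695981039346656037ULL;
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 1099511628211ULL;
    }
    return hash;
}

// Sender: glue the message and its hash together, then encrypt the whole thing.
std::string sealMessage(const std::string& message, const std::string& sharedKey)
{
    std::string combined = message + '\n' + std::to_string(toyHash(message));
    return xorTransform(combined, sharedKey);
}

// Recipient: decrypt, split off the hash, recompute it, and compare.
// Returns true and fills 'message' only when the two hashes match.
bool unsealMessage(const std::string& sealed, const std::string& sharedKey,
                   std::string& message)
{
    std::string combined = xorTransform(sealed, sharedKey);
    std::size_t split = combined.rfind('\n');
    if (split == std::string::npos) {
        return false;
    }
    std::string candidate = combined.substr(0, split);
    if (combined.substr(split + 1) != std::to_string(toyHash(candidate))) {
        return false;
    }
    message = candidate;
    return true;
}
```

If an interceptor flips even one bit of the sealed data, the decrypted message no longer matches the hash that travelled with it, and unsealMessage returns false. The structure is the point here; the protection is only as strong as the cipher and hash plugged into it, and the ones used in this sketch are not strong at all.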

In order for an asymmetric cryptography user to create a message that will have confidentiality, integrity, and authentication, the sender must construct a message and compute a hash. The sender glues together the message, the hash, and the hash encrypted with the sender's private key, and then encrypts the whole thing with the recipient's public key. The recipient decrypts everything with the recipient's own private key, and extracts the message. The recipient recomputes the hash for the message as received. As with symmetric cryptography, if the hashes match, the message has not been changed. The recipient goes on to decrypt the third part with the sender's public key: if the result matches the hash, then the message is definitely from the sender. The recipient has no way to reuse the encrypted hash to pose as the sender later, because no other message will have that same hash value.

Cryptography in the .NET Framework

In many ways, security is an ongoing battle between those who want to keep information secret, and those who want to learn other people's secrets. Some encryption algorithms have been cracked in shockingly short amounts of time. Perhaps no encryption algorithm will stand forever. That presents an interesting challenge to the designers of a package of classes to support encryption and decryption.

The classes that are used to implement cryptography within your applications are in the System.Security.Cryptography namespace. This contains two important classes: SymmetricAlgorithm and AsymmetricAlgorithm. These abstract classes are the base classes for classes named for the various encryption algorithms that ship with the .NET Framework. In version 1.1, these are

DES, RC2, Rijndael, and TripleDES: symmetric algorithms
DSA and RSA: asymmetric algorithms

The twist is that the class called System.Security.Cryptography.TripleDES does not implement the TripleDES algorithm. It's a base class for a class called TripleDESCryptoServiceProvider, which actually provides the implementation. This allows several implementations of a particular algorithm to co-exist within the library, while exposing their common structure and properties to your code.

To keep your code as robust as possible, avoid using the names of the implementation classes. Each of the base classes has a Create() method that returns a new instance of the current default implementation class. For example, this line creates an instance of the RijndaelManaged class without naming it:

 Rijndael* r = Rijndael::Create();

A using namespace statement for System::Security::Cryptography would be needed for this line to compile.

CLASS NAMES

The names of the implementation classes that ship with the .NET Framework are built from the name of the algorithm and a clue about the way the class is implemented. Classes with names that end with CryptoServiceProvider use the CryptoAPI library that ships with Windows. Classes with names that end with Managed are written in managed code, usually C#.

When you use the Create() method, you are making it simpler to drop in new and better implementations of an algorithm in the future. Install the new class into the library and make it the default implementation (or more realistically, upgrade to some hypothetical new version of the library that contains new classes and defines some of them to be the new default implementations for selected algorithms). Your code will then use the new implementations without any code changes, and without even a recompile.
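The design behind Create() can be sketched in standard C++ to show why calling code never needs to name an implementation class. This is an analogy only, not the .NET Framework source: ToyCipher, ToyCipherManaged, and describeDefaultCipher are hypothetical names invented for the sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical abstract base class, standing in for an algorithm class
// such as Rijndael.
class ToyCipher
{
public:
    virtual ~ToyCipher() {}
    virtual std::string Name() const = 0;

    // Returns the current default implementation; callers never name it.
    static std::unique_ptr<ToyCipher> Create();
};

// Hypothetical implementation class, standing in for something like
// RijndaelManaged.
class ToyCipherManaged : public ToyCipher
{
public:
    std::string Name() const override { return "ToyCipherManaged"; }
};

// Changing the default later means changing only this one library function.
std::unique_ptr<ToyCipher> ToyCipher::Create()
{
    return std::unique_ptr<ToyCipher>(new ToyCipherManaged());
}

// Calling code depends only on the base class.
std::string describeDefaultCipher()
{
    std::unique_ptr<ToyCipher> cipher = ToyCipher::Create();
    assert(cipher != nullptr);  // Create() always returns an object in this sketch
    return cipher->Name();
}
```

If the library later changed Create() to return a different derived class, describeDefaultCipher would pick it up without being edited. In this native C++ sketch the program would still have to be relinked against the new library; the "no recompile" benefit described above comes from the .NET Framework loading its class library at run time.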