Encryption Basics | Cryptographic Streams

In this section, we begin discussing cryptography. The packages, classes, and methods discussed in this and the following sections are part of the Java Cryptography Extension (JCE). As a standard extension to Java, the JCE cryptography classes live in the javax.crypto package and its subpackages rather than in the java packages. Several third parties in other countries have published their own implementations of this API. In particular, the open source implementation from the Legion of the Bouncy Castle (http://www.bouncycastle.org/) is worth a look.

I frankly don't trust Sun not to have inserted backdoors into its software for the use of various governments. I recommend using the third-party libraries no matter where you are if you really care about your privacy.

There are many different kinds of codes and ciphers, both for digital and nondigital data. To be precise, a code encrypts data at word or higher levels. Ciphers encrypt data at the level of letters or, in the case of digital ciphers, bytes. Most ciphers replace each byte in the original, unencrypted data, called plaintext, with a different byte, thus producing encrypted data, called ciphertext. There are many different possible algorithms for determining how plaintext is transformed into ciphertext (encryption) and how the ciphertext is transformed back into plaintext (decryption).

12.4.1. Keys

All the algorithms discussed here, and included in the JCE, are key-based. A key is a sequence of bytes used to parameterize the cipher. The same algorithm encrypts the same plaintext differently when a different key is used. Decryption also requires a key. Good algorithms make it effectively impossible to decrypt ciphertext without knowing the right key.
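The role of the key can be seen in a few lines of code. The following is a minimal sketch, not a recommended design: the class name KeyDemo and its helper methods are made up for illustration, and AES in ECB mode is chosen only because it is deterministic, which makes the comparison easy to see.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

public class KeyDemo {

    // Generate a random 128-bit AES key.
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    // Encrypt with AES. ECB mode is an assumption made to keep the demo
    // deterministic; it is not a good choice for real data.
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return cipher.doFinal(plaintext);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] plaintext = "attack at dawn".getBytes(StandardCharsets.UTF_8);
        byte[] first = encrypt(newKey(), plaintext);
        byte[] second = encrypt(newKey(), plaintext);
        // Same algorithm, same plaintext, different keys.
        System.out.println("Same ciphertext? " + Arrays.equals(first, second));
    }
}
```

Encrypting the same plaintext twice with one key in this setup produces identical bytes, while two independently generated keys should produce unrelated ciphertexts.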

One common attack on cryptosystems is an exhaustive search through all possible keys. As a result, one popular measure of algorithmic security is key length. Short keys (56 bits or fewer) are definitely breakable by brute force search with specialized equipment. A key length of 112 bits is generally considered the minimum for reasonable security. However, remember that a reasonable key length is only a necessary condition for security; it is far from a sufficient one. Long keys do not protect a weak algorithm or implementation.
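The arithmetic behind these numbers is easy to check. The sketch below counts the keys of a given length and divides by a search rate. The class name KeySpace and the rate of 10^12 keys per second are assumptions picked for illustration, not measurements of any real hardware.

```java
import java.math.BigInteger;

public class KeySpace {

    // Number of distinct keys of the given length: 2^bits.
    static BigInteger keyCount(int bits) {
        return BigInteger.ONE.shiftLeft(bits);
    }

    // Whole seconds needed to try every key at the given rate.
    static BigInteger secondsToExhaust(int bits, long keysPerSecond) {
        return keyCount(bits).divide(BigInteger.valueOf(keysPerSecond));
    }

    public static void main(String[] args) {
        long assumedRate = 1_000_000_000_000L; // hypothetical: 10^12 keys per second
        for (int bits : new int[] {56, 112, 128}) {
            System.out.println(bits + "-bit key: "
                    + secondsToExhaust(bits, assumedRate) + " seconds");
        }
    }
}
```

At this assumed rate, the entire 56-bit key space falls in 72,057 seconds, roughly 20 hours. Each additional bit doubles the work, so a 112-bit key takes 2^56 times as long.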

12.4.2. Secret Key Versus Public Key Algorithms

There are two primary kinds of ciphers: symmetric (secret key) ciphers and asymmetric (public key) ciphers. Symmetric ciphers such as AES use the same key to encrypt and decrypt the data. Symmetric ciphers rely on the secrecy of the key for security. Anybody who knows the key can both encrypt and decrypt data.
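As a rough illustration of the symmetric case, the sketch below encrypts and decrypts with the same raw key bytes. The class name SymmetricDemo and the crypt helper are hypothetical, and the choice of AES in GCM mode with a 12-byte nonce is an assumption of this example rather than something the discussion above prescribes.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

public class SymmetricDemo {

    // Run AES-GCM in the given mode (Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE).
    // Both directions are driven by the same raw key bytes.
    static byte[] crypt(int mode, byte[] rawKey, byte[] nonce, byte[] input)
            throws GeneralSecurityException {
        SecretKey key = new SecretKeySpec(rawKey, "AES");
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, key, new GCMParameterSpec(128, nonce));
        return cipher.doFinal(input);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        byte[] sharedKey = generator.generateKey().getEncoded();

        // A fresh random nonce for this one message; it is not secret.
        byte[] nonce = new byte[12];
        new SecureRandom().nextBytes(nonce);

        byte[] message = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] ciphertext = crypt(Cipher.ENCRYPT_MODE, sharedKey, nonce, message);
        byte[] recovered = crypt(Cipher.DECRYPT_MODE, sharedKey, nonce, ciphertext);
        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```

Anyone holding sharedKey can run either direction. With GCM, decrypting with a different key fails with an exception instead of silently returning garbage.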

Asymmetric ciphers, also known as public key ciphers, use different keys for encryption and decryption. This makes the problem of key exchange much easier. To allow people to send you encrypted messages, you simply send them your encryption (public) key. Even if the key is intercepted, this only allows the interceptor to send you encrypted messages. It does not allow them to decrypt messages intended for you. Furthermore, you can digitally sign messages by encrypting either a message or a hash code of the message with your private key, which may then be decrypted with your public key. Any message that can be successfully decrypted with your public key may be presumed to have come from you because only you could have encrypted it with your private key in the first place. (Of course, if someone steals your private key, all bets are off.) The most famous public key cipher is the RSA cipher, named after its inventors, Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman. RSA has the particularly nice property that either key can be used for encryption or decryption.
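Both uses of a key pair, encrypting to a public key and signing with a private key, can be sketched with the standard java.security and javax.crypto classes. The class name PublicKeyDemo and its helper methods are made up for this example. The 2048-bit key size, the OAEP padding, and the SHA256withRSA signature algorithm are likewise assumptions, not requirements stated above.

```java
import javax.crypto.Cipher;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class PublicKeyDemo {

    private static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // Anyone with the public key can encrypt a short message for the key's owner.
    static byte[] encryptFor(PublicKey recipient, byte[] message) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(RSA_OAEP);
        cipher.init(Cipher.ENCRYPT_MODE, recipient);
        return cipher.doFinal(message);
    }

    // Only the holder of the private key can decrypt.
    static byte[] decryptWith(PrivateKey owner, byte[] ciphertext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(RSA_OAEP);
        cipher.init(Cipher.DECRYPT_MODE, owner);
        return cipher.doFinal(ciphertext);
    }

    // Sign a hash of the message with the private key.
    static byte[] sign(PrivateKey signerKey, byte[] message) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(signerKey);
        signer.update(message);
        return signer.sign();
    }

    // Check the signature with the matching public key.
    static boolean verify(PublicKey signerKey, byte[] message, byte[] signature)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(signerKey);
        verifier.update(message);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPair pair = newKeyPair();
        byte[] message = "meet at noon".getBytes(StandardCharsets.UTF_8);

        byte[] ciphertext = encryptFor(pair.getPublic(), message);
        byte[] recovered = decryptWith(pair.getPrivate(), ciphertext);
        System.out.println(new String(recovered, StandardCharsets.UTF_8));

        byte[] signature = sign(pair.getPrivate(), message);
        System.out.println("Signature valid? " + verify(pair.getPublic(), message, signature));
    }
}
```

In practice, RSA is usually applied only to short values such as a hash or a symmetric key, which is why the example signs a hash of the message and encrypts only a few bytes.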

12.4.3. Block Versus Stream Ciphers

Encryption algorithms may also be divided into block and stream ciphers. A block cipher always encrypts a fixed number of bytes with each pass. For example, DES encrypts eight-byte blocks. If the data you're encrypting is not an integral multiple of the block size, the data must be padded with extra bytes to round up to the block size. Stream ciphers, by contrast, act on each bit or byte individually in the order it appears in the stream; padding is not required.
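The effect of padding shows up in the length of the ciphertext. The sketch below uses AES, whose blocks are 16 bytes rather than the 8 bytes of DES. The class name PaddingDemo is hypothetical, the all-zero key exists only to make the example self-contained, and AES in counter (CTR) mode stands in for a stream cipher here because it processes data byte by byte without padding.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

public class PaddingDemo {

    // All-zero demo key: an assumption for illustration, never use one in practice.
    private static final SecretKey KEY = new SecretKeySpec(new byte[16], "AES");

    // Block size of the cipher in bytes.
    static int blockSize() throws GeneralSecurityException {
        return Cipher.getInstance("AES/ECB/PKCS5Padding").getBlockSize();
    }

    // Ciphertext length when a padded block cipher encrypts n bytes.
    static int paddedLength(int n) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, KEY);
        return cipher.doFinal(new byte[n]).length;
    }

    // Ciphertext length when a stream-style mode encrypts n bytes.
    static int streamLength(int n) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, KEY, new IvParameterSpec(new byte[16]));
        return cipher.doFinal(new byte[n]).length;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        System.out.println("Block size: " + blockSize());
        for (int n : new int[] {5, 16, 17}) {
            System.out.println(n + " bytes -> padded " + paddedLength(n)
                    + ", stream " + streamLength(n));
        }
    }
}
```

With this padding scheme, 5 bytes grow to 16, and an input of exactly 16 bytes grows to 32, because a full block of padding is added so that the padding can always be recognized and removed on decryption.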

Block ciphers can operate in a variety of modes that determine how the encryption of one block of data influences the encryption of subsequent blocks. In every mode except the simplest, electronic codebook (ECB), which encrypts each block independently, this chaining ensures that identical blocks of plaintext do not produce identical blocks of ciphertext, a weakness code breakers could exploit. To ensure that messages that start with the same plaintext (for example, many email messages or form letters) don't also start with the same ciphertext (also a weakness code breakers can exploit), these modes require an initialization vector, generally of the same size as a block, in order to begin the encoding. Initialization vectors are not secret and are generally passed in the clear with the encrypted data.
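The difference a mode makes can be observed directly. The following sketch encrypts two identical plaintext blocks, first with no chaining (ECB mode) and then with cipher block chaining (CBC) and a random initialization vector. The class name ModeDemo and its helpers are invented for this example, and AES with 16-byte blocks is an assumed choice of cipher.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

public class ModeDemo {

    // No chaining: every block is encrypted independently.
    static byte[] encryptEcb(SecretKey key, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return cipher.doFinal(plaintext);
    }

    // Chaining: each block is mixed with the previous ciphertext block,
    // and the first block is mixed with the initialization vector.
    static byte[] encryptCbc(SecretKey key, byte[] iv, byte[] plaintext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(plaintext);
    }

    // A fresh random IV, one block (16 bytes) long. It is not secret.
    static byte[] randomIv() {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    // Compare the first and second 16-byte blocks of a ciphertext.
    static boolean firstTwoBlocksEqual(byte[] ciphertext) {
        return Arrays.equals(Arrays.copyOfRange(ciphertext, 0, 16),
                             Arrays.copyOfRange(ciphertext, 16, 32));
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey();

        byte[] plaintext = new byte[32]; // two identical all-zero blocks

        System.out.println("ECB blocks repeat? "
                + firstTwoBlocksEqual(encryptEcb(key, plaintext)));
        System.out.println("CBC blocks repeat? "
                + firstTwoBlocksEqual(encryptCbc(key, randomIv(), plaintext)));
    }
}
```

Encrypting the same plaintext twice in CBC mode with two different initialization vectors also yields ciphertexts that differ from the first block onward, which is the property described above for messages that begin the same way.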

12.4.4. Key Management

Storing keys securely is a difficult problem. If the key is stored in hardware like a smartcard, it can be stolen. If the key is stored in a file on a disk, the disk can be stolen. Many basic PC protection schemes are based on OS- or driver-level operations that refuse to mount the disk without the proper password, but simply using a new OS (or driver or custom hardware) allows the key or unencrypted data to be read off the disk.

Ideally, keys should not be stored anywhere except in a human being's memory. Human beings, however, have a hard time remembering arbitrary 56-bit keys such as 0x78A53666090BCC, much less more secure 64-, 112-, or 128-bit keys. Therefore, keys humans have to remember are generally stored as a string of text called a password. Even then, the password is vulnerable to a rubber hose attack. Truly secure systems like those used to protect bank vaults require separate passwords remembered by two or more individuals.

A text password is converted into the raw bits of the key according to some well-known, generally public, hash algorithm. The simplest such algorithm is to use the bytes of the password as the key, but this weakens the security because the bits are somewhat predictable. For instance, the bits 01110001 (q) are very likely to be followed by the bits 01110101 (u). The bits 01111111 (the nonprinting delete character) are unlikely to appear at all. Because of the less than random nature of text, passwords must be longer than the corresponding keys.

To make matters worse, humans like passwords that are common words or phrases, like "secret," "password," or "sex." Therefore, one of the most common attacks on password-based systems is to attempt decryption with every word in a dictionary. To make these sorts of attacks harder, passwords are commonly "salted": combined with a random number that's stored alongside the ciphertext. Salting can increase the space that a dictionary-based attack must search by several orders of magnitude.
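Turning a password and a salt into key bits might look something like the sketch below. It assumes the PBKDF2WithHmacSHA256 algorithm available from SecretKeyFactory in current Java releases; the class name PasswordKeyDemo, the 16-byte salt, the iteration count of 100,000, and the 128-bit AES key length are all illustrative choices rather than values taken from the discussion above.

```java
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

public class PasswordKeyDemo {

    private static final int ITERATIONS = 100_000; // assumed work factor
    private static final int KEY_BITS = 128;

    // A random salt. It is not secret and is stored next to the ciphertext.
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Hash the password and salt into the raw bits of an AES key.
    static SecretKey deriveKey(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            byte[] keyBytes = factory.generateSecret(spec).getEncoded();
            return new SecretKeySpec(keyBytes, "AES");
        } finally {
            spec.clearPassword(); // wipe the spec's internal copy of the password
        }
    }

    public static void main(String[] args) throws GeneralSecurityException {
        char[] password = "secret".toCharArray();
        byte[] key1 = deriveKey(password, newSalt()).getEncoded();
        byte[] key2 = deriveKey(password, newSalt()).getEncoded();
        // The same weak password yields unrelated keys under different salts.
        System.out.println("Same key? " + Arrays.equals(key1, key2));
    }
}
```

Because the salt changes the derived key, an attacker cannot hash a dictionary once and reuse the results against every ciphertext; each salt requires a fresh pass through the dictionary.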

Humans also write passwords down, especially when they need to store many different passwords for different networks, computers, and web sites. These written passwords can then be stolen. The java.security.KeyStore class is a simple, password-protected digital lockbox for keys of all sorts. Keys can be stored in the key store, and only the password for the key store needs to be remembered.
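A minimal sketch of that idea follows. The class name KeyStoreDemo, the alias "backup-key", and the password are made up for illustration, and the PKCS12 store type is an assumption of this example; other store types may suit a given installation better. The key store is written to a byte array here, where a real program would more likely write it to a file.

```java
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.Arrays;

public class KeyStoreDemo {

    // Put one secret key into a new key store and return the serialized store.
    static byte[] save(String alias, SecretKey key, char[] password)
            throws GeneralSecurityException, IOException {
        KeyStore store = KeyStore.getInstance("PKCS12");
        store.load(null, password); // a null stream creates an empty key store
        store.setEntry(alias, new KeyStore.SecretKeyEntry(key),
                new KeyStore.PasswordProtection(password));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        store.store(out, password);
        return out.toByteArray();
    }

    // Reopen the serialized store and fetch the key by its alias.
    static SecretKey load(byte[] data, String alias, char[] password)
            throws GeneralSecurityException, IOException {
        KeyStore store = KeyStore.getInstance("PKCS12");
        store.load(new ByteArrayInputStream(data), password);
        return (SecretKey) store.getKey(alias, password);
    }

    public static void main(String[] args) throws GeneralSecurityException, IOException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey original = generator.generateKey();

        char[] password = "only thing to remember".toCharArray(); // hypothetical
        byte[] stored = save("backup-key", original, password);
        SecretKey restored = load(stored, "backup-key", password);

        System.out.println("Key recovered? "
                + Arrays.equals(original.getEncoded(), restored.getEncoded()));
    }
}
```

The key store is only as strong as its password, so the earlier cautions about dictionary attacks apply to that one password with full force.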

This discussion has been necessarily brief. A lot of interesting details have been skimmed over or omitted entirely. For the more complete story, see the Crypt Cabal's Cryptography FAQ at http://www.faqs.org/faqs/cryptography-faq/ or Java Security by Scott Oaks (O'Reilly).
