Adapting Other Security Concepts to Storage


Truth be told, security has not changed much since the Middle Ages when kings erected walls and dug moats around their castles and used book codes and other techniques to encrypt messages to their peers and field generals. In the modern LAN, technologies such as firewalls and Virtual Private Networks (VPNs) provide the walls and moats, and key encryption provides the codes. Organizations deploy these technologies to secure data assets, first, by protecting the perimeter of the network and, second, by preventing the misuse or disclosure of data if perimeter defenses fail.

Another data protection technique from medieval times was the signet ring. A written and encoded message was sealed with a drop of candle wax into which the face of the royal signet was impressed. This stamp authenticated the contents of the message, providing nonrepudiation.

Nonrepudiability is also important in contemporary data storage, particularly to comply with requirements established by HIPAA and other laws. Recently, a large medical services company with tens of terabytes of digital records tried an experiment. It copied a subset of its electronic records, which included MRI scans and X-rays, from expensive disk arrays to tape, then restored the files from tape back to disk.

Unfortunately, a lot of bits were lost in the transfer. The company discovered that a 10 percent variance existed between the restored data and the original data.

The difference was disconcerting. While a small amount of data loss might not represent a serious problem for certain types of applications, it is different with medical records, which might be used later as a reference for ongoing treatment or as evidence in a malpractice court case. Even a slight change in data (one that could manifest itself, say, as a shadow on a lung) could have significant repercussions for doctors and patients.

The issue of data integrity and nonrepudiation is a growing problem for data storage managers. Several efforts are afoot within standards-making bodies, such as the Internet Engineering Task Force (IETF) and the World Wide Web Consortium (W3C), to come up with a fix. At the same time, work is proceeding behind the scenes within many storage vendor shops to develop proprietary solutions. Just keeping up with the terminology can be a challenge.

For example, many of the efforts involve using Message Digest Version 5 (MD5), an algorithm submitted to the IETF in 1992 to provide a means for authenticating data by creating a 128-bit fingerprint, or message digest, from a much longer string of bits. The algorithm, or hash, was intended to authenticate the contents of a dataset, since each distinct dataset was expected to produce a fingerprint different from all others.
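To make the fingerprint idea concrete, here is a minimal sketch using Python's standard hashlib module. The sample record and the helper name md5_fingerprint are assumptions made for illustration; the point is only that a long string of bits is reduced to a 128-bit digest, and that a restored copy can be checked against the digest of the original.

```python
import hashlib


def md5_fingerprint(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hexadecimal characters (128 bits)."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for a record copied to tape and restored.
original = b"patient-record: chest scan, frame 0001" * 1000
restored_ok = bytes(original)            # bit-for-bit identical copy
restored_bad = original[:-1] + b"2"      # final byte altered in transit

fingerprint = md5_fingerprint(original)

print(len(fingerprint) * 4)                           # digest size in bits
print(md5_fingerprint(restored_ok) == fingerprint)    # identical copy matches
print(md5_fingerprint(restored_bad) == fingerprint)   # altered copy does not
```

Storing the fingerprint alongside the data before migration would let a restore job flag any copy whose digest no longer matches.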

While similar in concept to the checksum commonly used in storage, MD5 is described by some as "human proof." [15] While it is possible to change data and still arrive at the same checksum, MD5 was designed so that changing the data and still producing the same hash would be computationally infeasible.
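The contrast with a checksum can be sketched as follows. The additive checksum here is a deliberately simple assumption chosen for illustration (checksums used in real storage products are typically stronger), and the sample strings are hypothetical.

```python
import hashlib


def additive_checksum(data: bytes) -> int:
    """Toy 8-bit checksum: the sum of all bytes modulo 256 (illustration only)."""
    return sum(data) % 256


record_a = b"dose=12mg"
record_b = b"dose=21mg"   # same bytes, two digits transposed

# Reordering bytes leaves a byte sum unchanged, so the toy checksum
# cannot tell the two records apart.
print(additive_checksum(record_a) == additive_checksum(record_b))

# The MD5 digests of the two records differ.
print(hashlib.md5(record_a).hexdigest() == hashlib.md5(record_b).hexdigest())
```

A change that a simple checksum misses entirely still produces a different digest, which is the property the "human proof" description points at.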

MD5 hashing, in addition to its applications for data integrity testing, could also be used to support a scheme of "content networking" or hierarchical storage management. Just as addressing is used to map contents to a memory location or disk sector in day-to-day computing, content addressing and HSM take a similar approach to data stored across a large storage platform or network. An effective content addressing scheme allows you to retrieve an entire object by looking up a portion of its contents, while an HSM system typically locates data that has been migrated to near-line or off-line storage using a reference stub.
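One way a digest could serve as a content address is sketched below. This is an assumption about how such a scheme might look, not a description of any vendor's product: the ContentStore class and its method names are hypothetical, and the store is a plain in-memory dictionary keyed by the MD5 digest of each object.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Toy in-memory store in which an object's address is its MD5 digest."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data and return the address derived from its contents."""
        address = hashlib.md5(data).hexdigest()
        self._objects[address] = data
        return address

    def get(self, address: str) -> bytes:
        """Fetch an object and confirm it still matches its address."""
        data = self._objects[address]
        if hashlib.md5(data).hexdigest() != address:
            raise ValueError("stored object no longer matches its address")
        return data


store = ContentStore()
first = store.put(b"MRI scan 42")
second = store.put(b"MRI scan 42")   # identical contents, identical address

print(first == second)
print(store.get(first))
```

Because the address is computed from the contents, identical objects land at the same address, and every read doubles as an integrity check.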

Since MD5 hashing takes a large amount of data and condenses it to a small message header, experts say that it might become the basis for content networking in a next-generation storage system. The inclusion of an MD5 hash in a data naming scheme such as the one described earlier in this book could provide both a nonrepudiation function for the named data and a mechanism for tracking the data through its useful life.
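As a rough sketch of a name that carries its own digest, the functions below append an MD5 hash to a label and later check data against it. The name layout, the ".md5-" separator, and the function names are all hypothetical and are not the naming scheme this book describes elsewhere.

```python
import hashlib

SEPARATOR = ".md5-"   # hypothetical marker between label and digest


def make_name(label: str, data: bytes) -> str:
    """Build a name of the form '<label>.md5-<32 hex digits>'."""
    return f"{label}{SEPARATOR}{hashlib.md5(data).hexdigest()}"


def verify_name(name: str, data: bytes) -> bool:
    """Return True if data still matches the digest embedded in name."""
    _, _, digest = name.rpartition(SEPARATOR)
    return hashlib.md5(data).hexdigest() == digest


name = make_name("xray/2003/0001", b"image bytes")

print(name.startswith("xray/2003/0001" + SEPARATOR))
print(verify_name(name, b"image bytes"))     # unchanged data
print(verify_name(name, b"image bytez"))     # altered data
```

Wherever the named object travels during its life, any holder of the name can recompute the digest and detect an alteration.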



The Holy Grail of Network Storage Management
ISBN: 0130284165
Year: 2003
Pages: 96
