Recipe 12.28. Compressing and Decompressing Your Files

Problem

You need a way to compress the data you write to a file using one of the stream-based classes. In addition, you need a way to decompress the data from this compressed file when you read it back in.

Solution

Use the System.IO.Compression.DeflateStream or the System.IO.Compression.GZipStream class to read and write compressed data to a file. The CompressFile, DeCompressFile, and DeCompress methods shown in Example 12-16 demonstrate how to use these classes to compress and expand data on the fly.

Example 12-16. The CompressFile, DeCompressFile, and DeCompress methods
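The listing for these methods is not reproduced here, so what follows is only a minimal sketch of how they might look, based on the descriptions in this recipe. The class name CompressionSketch, the parameter names, the 4,096-byte buffer, and the choice to leave the underlying stream open are assumptions made for illustration, not the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Repeated here only so the sketch compiles on its own.
public enum CompressionType { Deflate, GZip }

public static class CompressionSketch
{
    // Writes data to strm in compressed form. The caller owns strm.
    public static void CompressFile(Stream strm, byte[] data,
                                    CompressionType compressionType)
    {
        // leaveOpen: true (an assumption) keeps strm usable after compression.
        using (Stream compressor = compressionType == CompressionType.Deflate
            ? (Stream)new DeflateStream(strm, CompressionMode.Compress, true)
            : new GZipStream(strm, CompressionMode.Compress, true))
        {
            compressor.Write(data, 0, data.Length);
        } // Disposing the compression stream flushes the final block.
    }

    // Reads compressed data from strm and returns the expanded bytes.
    public static byte[] DeCompressFile(Stream strm,
                                        CompressionType compressionType)
    {
        using (Stream decompressor = compressionType == CompressionType.Deflate
            ? (Stream)new DeflateStream(strm, CompressionMode.Decompress, true)
            : new GZipStream(strm, CompressionMode.Decompress, true))
        {
            return DeCompress(decompressor).ToArray();
        }
    }

    // Drains the decompression stream into a List<byte>.
    private static List<byte> DeCompress(Stream decompressor)
    {
        List<byte> data = new List<byte>();
        byte[] buffer = new byte[4096]; // buffer size is an arbitrary choice
        int bytesRead;
        while ((bytesRead = decompressor.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < bytesRead; i++)
            {
                data.Add(buffer[i]);
            }
        }
        return data;
    }
}
```

Because the compressed size is not known in advance, the sketch reads in chunks until Read returns 0 rather than sizing a buffer up front. A caller would typically pass a FileStream opened for writing to CompressFile and one opened for reading to DeCompressFile.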
The CompressionType enumeration is defined as follows:

    public enum CompressionType { Deflate, GZip }

Discussion

The CompressFile method accepts a Stream object, data in the form of a byte array, and a CompressionType enumeration value indicating which type of compression algorithm to use (Deflate or GZip). This method produces a file containing the compressed data.

The DeCompressFile method accepts a Stream object and a CompressionType enumeration value indicating which type of decompression algorithm to use (Deflate or GZip). This method calls the DeCompress method, which reads from the compressed file and places the uncompressed bytes into a generic List<byte> collection. The collection is then converted to a byte[] and returned to the calling method.

The TestCompressNewFile method shown in Example 12-17 exercises the CompressFile and DeCompressFile methods defined in the Solution section of this recipe. It also uses another method, NormalFile (shown first), that creates an uncompressed file to show how the file sizes differ.

Example 12-17. Using the CompressFile and DeCompressFile methods
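The test listing is likewise not reproduced here. As a rough, self-contained sketch of the same idea, the code below writes one uncompressed file and two compressed files and returns their sizes for comparison. The class name CompressionSizeSketch, the MakeData and CompressedFile helpers, the repeating byte pattern used as test data, and the directory parameter are all assumptions; only the NormalFile and TestCompressNewFile names and the three file names come from this recipe.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressionSizeSketch
{
    // Assumed test data: a repeating byte pattern (the original data is not shown).
    public static byte[] MakeData(int length)
    {
        byte[] data = new byte[length];
        for (int i = 0; i < length; i++)
        {
            data[i] = (byte)(i % 256);
        }
        return data;
    }

    // Writes the bytes unchanged, to give a baseline file size.
    public static void NormalFile(string path, byte[] data)
    {
        using (FileStream fs = File.Create(path))
        {
            fs.Write(data, 0, data.Length);
        }
    }

    // Hypothetical helper: writes the bytes through a compression stream.
    public static void CompressedFile(string path, byte[] data, bool useGZip)
    {
        using (FileStream fs = File.Create(path))
        using (Stream compressor = useGZip
            ? (Stream)new GZipStream(fs, CompressionMode.Compress)
            : new DeflateStream(fs, CompressionMode.Compress))
        {
            compressor.Write(data, 0, data.Length);
        }
    }

    // Returns the sizes of the normal, Deflate, and GZip files, in that order.
    public static long[] TestCompressNewFile(string directory, int length)
    {
        byte[] data = MakeData(length);
        string normal = Path.Combine(directory, "NewNormalFile.txt");
        string deflate = Path.Combine(directory, "NewCompressedFile.txt");
        string gzip = Path.Combine(directory, "NewGzCompressedFile.txt");

        NormalFile(normal, data);
        CompressedFile(deflate, data, false);
        CompressedFile(gzip, data, true);

        return new long[]
        {
            new FileInfo(normal).Length,
            new FileInfo(deflate).Length,
            new FileInfo(gzip).Length
        };
    }
}
```

The exact sizes depend on the data written and on the runtime's compression implementation, so this sketch will not necessarily reproduce the figures quoted in this recipe.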
When this test code is run, we get three files with different sizes. The first file, NewNormalFile.txt, is 10,000,000 bytes. The NewCompressedFile.txt file is 85,095 bytes, and the NewGzCompressedFile.txt file is 85,113 bytes. As you can see, there is not much difference between the sizes of the files compressed with the DeflateStream class and the GZipStream class. The reason is that both classes use the same lossless Deflate algorithm, described in RFC 1951 (DEFLATE Compressed Data Format Specification version 1.3). The 18-byte difference is consistent with the fixed-size header and footer that the GZip format wraps around the Deflate data.

You may be wondering why you would pick one class over the other if they use the same algorithm. There is one good reason: the GZipStream class adds a CRC check to the file to determine whether it has been corrupted. If the file has been corrupted, an InvalidDataException is thrown with the message "The CRC in GZip footer does not match the CRC calculated from the decompressed data." By catching this exception, you can determine whether your data is corrupted.

See Also

See the "DeflateStream Class" and "GZipStream" topics in the MSDN documentation.