5.5. Compress and Decompress Data
Even with the ever-increasing capacity of hard drives and the falling price of computer memory, it still pays to save space. In .NET 2.0, a new System.IO.Compression namespace makes it easy for a VB 2005 programmer to compress data as she writes it to a stream, and decompress data as she reads it from a stream.
Note: Need to save space before you store data in a file or database? .NET 2.0 makes compression and decompression easy.
5.5.1. How do I do that?
The new System.IO.Compression namespace introduces two new stream classes: GZipStream and DeflateStream, which, as you'd guess, are used to compress and decompress streams of data.
The algorithms used by these classes are lossless, which means that when you compress and decompress your data, you won't lose any information.
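To see what lossless means in practice, here is a minimal sketch that compresses some text in memory, decompresses it, and checks that the result matches the original. The module name, variable names, and sample text are made up for illustration, and the sketch assumes a console application.

```vb
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module RoundTripSketch
    Public Sub Main()
        ' Hypothetical sample data: repetitive text compresses well.
        Dim OriginalText As String = New String("a"c, 1000) & " and a little more text"
        Dim Original As Byte() = Encoding.UTF8.GetBytes(OriginalText)

        ' Compress into a MemoryStream. The GZipStream is closed before
        ' the compressed bytes are collected, so the output is complete.
        Dim Packed As New MemoryStream()
        Dim Zip As New GZipStream(Packed, CompressionMode.Compress)
        Zip.Write(Original, 0, Original.Length)
        Zip.Close()
        Dim PackedBytes As Byte() = Packed.ToArray()

        ' Decompress and read the text back.
        Dim Unzip As New GZipStream(New MemoryStream(PackedBytes), _
            CompressionMode.Decompress)
        Dim Reader As New StreamReader(Unzip, Encoding.UTF8)
        Dim RestoredText As String = Reader.ReadToEnd()
        Reader.Close()

        Console.WriteLine("Original:   " & Original.Length & " bytes")
        Console.WriteLine("Compressed: " & PackedBytes.Length & " bytes")
        Console.WriteLine("Identical after round trip: " & _
            (RestoredText = OriginalText))
    End Sub
End Module
```

The exact compressed size depends on the data, so the sketch prints it rather than assuming a value.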
To use compression, you need to understand that a compression stream wraps another stream. For example, if you want to write some compressed data to a file, you first create a FileStream for the file. Then, you wrap the FileStream with the GZipStream or DeflateStream. Here's how it works:
    Dim fsWrite As New FileStream(FileName, FileMode.Create)
    Dim CompressStream As New GZipStream(fsWrite, CompressionMode.Compress)
Now, if you want to write data to the file, you use the GZipStream. The GZipStream compresses that data, and then writes the compressed data to the wrapped FileStream, which then writes it to the underlying file. If you skip this process and write directly to the FileStream, you'll end up writing uncompressed data instead.
Like all streams, the GZipStream only allows you to write raw bytes. If you want to write strings or other data types, you need to create a StreamWriter. The StreamWriter accepts basic .NET data types (like strings and integers) and converts them to bytes. Here's an example:
    Dim Writer As New StreamWriter(CompressStream)

    ' Put a compressed line of text into the file.
    Writer.Write("This is some text")
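Finally, once you're finished, flush the StreamWriter and close the GZipStream so that all the data ends up in the file. Flushing the GZipStream alone isn't enough, because it writes its last block of compressed data only when it's closed:
    Writer.Flush( )
    CompressStream.Close( )
    fsWrite.Close( )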
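One way to make sure nothing is left unwritten is to let the VB 2005 Using statement close each stream for you. The following is a minimal sketch of the same write sequence, not the only way to structure it. The module name and the file name (sample.gz in the temp folder) are hypothetical.

```vb
Imports System.IO
Imports System.IO.Compression

Module CompressedWriteSketch
    Public Sub Main()
        ' Hypothetical target file in the temp folder.
        Dim FileName As String = Path.Combine(Path.GetTempPath(), "sample.gz")

        ' Each End Using closes its stream, innermost first, so the
        ' StreamWriter is flushed and the GZipStream writes its final
        ' compressed data before the file itself is closed.
        Using fsWrite As New FileStream(FileName, FileMode.Create)
            Using CompressStream As New GZipStream(fsWrite, _
                CompressionMode.Compress)
                Using Writer As New StreamWriter(CompressStream)
                    Writer.WriteLine("This is some text")
                End Using
            End Using
        End Using

        Console.WriteLine("Wrote " & New FileInfo(FileName).Length & _
            " bytes to " & FileName)
    End Sub
End Module
```

Because the streams are closed even if an exception is thrown inside the block, this pattern also avoids leaving the file locked after an error.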
The process of decompression works in a similar way. In this case, you create a FileStream for the file you want to read, and then create a GZipStream that decompresses the data. You then read the data using the GZipStream, as shown here:
    Dim fsRead As New FileStream(FileName, FileMode.Open)
    Dim DecompressStream As New GZipStream(fsRead, CompressionMode.Decompress)
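When you read raw bytes rather than text, keep in mind that a decompression stream doesn't tell you in advance how much data it will produce, so it's usually simplest to read in chunks until Read returns zero. The sketch below first creates a small compressed file so that it is self-contained, and then decompresses it to a second file. The module name, file names, and 4 KB chunk size are arbitrary choices for illustration.

```vb
Imports System.IO
Imports System.IO.Compression

Module DecompressToFileSketch
    Public Sub Main()
        ' Hypothetical file names in the temp folder.
        Dim PackedFile As String = Path.Combine(Path.GetTempPath(), "sketch.gz")
        Dim PlainFile As String = Path.Combine(Path.GetTempPath(), "sketch.txt")

        ' Create a small compressed file so there is something to read.
        Dim Sample As Byte() = System.Text.Encoding.UTF8.GetBytes( _
            New String("x"c, 10000))
        Using fsWrite As New FileStream(PackedFile, FileMode.Create)
            Using CompressStream As New GZipStream(fsWrite, _
                CompressionMode.Compress)
                CompressStream.Write(Sample, 0, Sample.Length)
            End Using
        End Using

        ' Decompress in chunks. Read returns the number of bytes it
        ' actually delivered, and zero once the data is exhausted.
        Dim Chunk(4095) As Byte
        Dim Total As Integer = 0
        Using fsRead As New FileStream(PackedFile, FileMode.Open)
            Using DecompressStream As New GZipStream(fsRead, _
                CompressionMode.Decompress)
                Using fsOut As New FileStream(PlainFile, FileMode.Create)
                    Dim Count As Integer = _
                        DecompressStream.Read(Chunk, 0, Chunk.Length)
                    Do While Count > 0
                        fsOut.Write(Chunk, 0, Count)
                        Total += Count
                        Count = DecompressStream.Read(Chunk, 0, Chunk.Length)
                    Loop
                End Using
            End Using
        End Using

        Console.WriteLine("Decompressed " & Total & " bytes to " & PlainFile)
    End Sub
End Module
```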
Example 5-5 shows an end-to-end example that writes some compressed data to a file, displays the amount of space saved, and then decompresses the data.
Example 5-5. Compress and decompress a sample file
    Imports System.IO
    Imports System.IO.Compression

    Module FileCompression

        Public Sub Main( )
            ' Read original file.
            Dim SourceFile As String
            SourceFile = My.Computer.FileSystem.CurrentDirectory & "\test.txt"
            Dim fsRead As New FileStream(SourceFile, FileMode.Open)
            ' FileStream.Length is a Long, so convert it for the array bound.
            Dim FileBytes(CInt(fsRead.Length) - 1) As Byte
            fsRead.Read(FileBytes, 0, FileBytes.Length)
            fsRead.Close( )

            ' Write to a new compressed file.
            Dim TargetFile As String
            TargetFile = My.Computer.FileSystem.CurrentDirectory & "\test.bin"
            Dim fsWrite As New FileStream(TargetFile, FileMode.Create)
            Dim CompressStream As New GZipStream(fsWrite, CompressionMode.Compress)
            CompressStream.Write(FileBytes, 0, FileBytes.Length)
            ' Closing the GZipStream writes out the last of the compressed data.
            CompressStream.Flush( )
            CompressStream.Close( )
            fsWrite.Close( )

            Console.WriteLine("File compressed from " & _
              New FileInfo(SourceFile).Length & " bytes to " & _
              New FileInfo(TargetFile).Length & " bytes.")

            Console.WriteLine("Press Enter to decompress.")
            Console.ReadLine( )

            ' Read the compressed file back and display the original text.
            fsRead = New FileStream(TargetFile, FileMode.Open)
            Dim DecompressStream As New GZipStream(fsRead, CompressionMode.Decompress)
            Dim Reader As New StreamReader(CType(DecompressStream, Stream))
            Console.WriteLine(Reader.ReadToEnd( ))
            Reader.Close( )
            fsRead.Close( )
        End Sub

    End Module
5.5.2. What about...
...unzipping .zip files? Unfortunately, the .NET 2.0 compression streams can't read or write ZIP files, the file archives commonly used to shrink batches of files (often before storing them for the long term or attaching them to an email message). If you need this specific ability, you'll probably be interested in the freely downloadable #ziplib (available at http://www.icsharpcode.net/OpenSource/SharpZipLib).
5.5.3. Where can I learn more?
For more information about the GZipStream and DeflateStream classes, look them up in the MSDN Help. You can also look up the "compression" index entry for a Windows application example that uses these classes.