You want to compress and decompress file data.
Sample code folder: Chapter 16\Compression
Use Gzip stream compression and decompression, new in Version 2.0 of the .NET Framework.
Figure 16-5. Compressing and decompressing a string
Because the GZipStream class works on streams, it's easy to point it to file streams as data is read from or written to files. This lets the compression and decompression algorithms intercept the bytes as they move through the file streams.
The FileCompress() and FileDecompress() functions are found in the same Compress.vb module that contains the string compression and decompression functions presented in Recipe 16.6. These functions are similar in that they intercept streams to process bytes as they move through them. One important difference is the use of a 4,096-byte buffer to process the file-stream data in chunks, rather than loading the entire file contents into memory. This allows even the largest files to be efficiently processed a piece at a time.
Here are the two file compression and decompression functions:
Public Sub FileCompress(ByVal sourceFile As String, _
      ByVal destinationFile As String)
   ' ----- Compress the entire contents of a file, and
   '       store the result in a new file. First,
   '       create the input file stream.
   Dim sourceStream As New FileStream( _
      sourceFile, FileMode.Open, FileAccess.Read)

   ' ----- Create the output file stream.
   Dim destinationStream As New FileStream( _
      destinationFile, FileMode.Create, FileAccess.Write)

   ' ----- Bytes will be processed by a compression
   '       stream.
   Dim compressedStream As New GZipStream( _
      destinationStream, CompressionMode.Compress, True)

   ' ----- Process bytes from one file into the other.
   '       The array's upper bound is BlockSize - 1,
   '       giving exactly BlockSize elements.
   Const BlockSize As Integer = 4096
   Dim buffer(BlockSize - 1) As Byte
   Dim bytesRead As Integer
   Do
      bytesRead = sourceStream.Read(buffer, 0, BlockSize)
      If (bytesRead = 0) Then Exit Do
      compressedStream.Write(buffer, 0, bytesRead)
   Loop

   ' ----- Close all the streams. The compression stream
   '       must be closed before its underlying file stream.
   sourceStream.Close()
   compressedStream.Close()
   destinationStream.Close()
End Sub

Public Sub FileDecompress(ByVal sourceFile As String, _
      ByVal destinationFile As String)
   ' ----- Decompress a previously compressed file, and
   '       store the result in a new file. First, get
   '       the files as streams.
   Dim sourceStream As New FileStream( _
      sourceFile, FileMode.Open, FileAccess.Read)
   Dim destinationStream As New FileStream( _
      destinationFile, FileMode.Create, FileAccess.Write)

   ' ----- Bytes will be processed through a
   '       decompression stream.
   Dim decompressedStream As New GZipStream( _
      sourceStream, CompressionMode.Decompress, True)

   ' ----- Process bytes from one file into the other.
   Const BlockSize As Integer = 4096
   Dim buffer(BlockSize - 1) As Byte
   Dim bytesRead As Integer
   Do
      bytesRead = decompressedStream.Read(buffer, _
         0, BlockSize)
      If (bytesRead = 0) Then Exit Do
      destinationStream.Write(buffer, 0, bytesRead)
   Loop

   ' ----- Close all the streams.
   sourceStream.Close()
   decompressedStream.Close()
   destinationStream.Close()
End Sub
The entire Compress.vb module is listed in Recipe 16.10.
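The FileCompress() and FileDecompress() functions close their streams explicitly, so an exception thrown partway through the copy loop would skip the Close() calls. As a minimal sketch of one alternative, the compression routine could wrap each stream in a Using block, a statement available in Visual Basic 2005. The names FileCompressUsing and CompressSketch are hypothetical and are not part of the book's sample code:

```vb
Imports System.IO
Imports System.IO.Compression

Module CompressSketch
   ' Hypothetical variant of FileCompress(); the name is
   ' illustrative and does not appear in Compress.vb.
   Public Sub FileCompressUsing(ByVal sourceFile As String, _
         ByVal destinationFile As String)
      Const BlockSize As Integer = 4096
      Dim buffer(BlockSize - 1) As Byte
      Dim bytesRead As Integer

      ' ----- Each Using block disposes its stream on exit,
      '       even if an exception interrupts the copy loop.
      Using sourceStream As New FileStream( _
            sourceFile, FileMode.Open, FileAccess.Read)
         Using destinationStream As New FileStream( _
               destinationFile, FileMode.Create, FileAccess.Write)
            Using compressedStream As New GZipStream( _
                  destinationStream, CompressionMode.Compress)
               ' ----- Copy the file one block at a time.
               Do
                  bytesRead = sourceStream.Read(buffer, 0, BlockSize)
                  If (bytesRead = 0) Then Exit Do
                  compressedStream.Write(buffer, 0, bytesRead)
               Loop
            End Using
         End Using
      End Using
   End Sub
End Module
```

The innermost Using block ends first, so the compression stream flushes its remaining bytes before the output file stream is disposed. A decompression counterpart would presumably follow the same pattern, with the GZipStream wrapped around the source stream instead.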
The following code demonstrates file compression and decompression by first filling a file with many repetitions of the same lines of text. Doubling the text nine times before it is written causes the number of bytes stored in File1 to grow to almost 88 KB.
FileCompress() is called to compress File1 into File2. Because of the highly redundant nature of the data in this example, the original 88 KB of data compresses down to less than 1 KB, as stored in File2. Finally, FileDecompress() is called to decompress File2 into File3. This file ends up being exactly the same size and containing exactly the same data as File1, verifying the compression and decompression action:
Dim result As New System.Text.StringBuilder
Dim file1Text As String = _
   "This is sample content for a text file to" & vbNewLine & _
   "be compressed and decompressed. File1 and" & vbNewLine & _
   "File3 should show this plain text. File2" & vbNewLine & _
   "is compressed and will be indecipherable." & vbNewLine
For counter As Integer = 1 To 9
   file1Text &= file1Text
Next counter
Dim file2Text As String
Dim file3Text As String
Dim file1 As String = Application.StartupPath & "\File1.txt"
Dim file2 As String = Application.StartupPath & "\File2.gzz"
Dim file3 As String = Application.StartupPath & "\File3.txt"

' ----- Compress and decompress the content files.
My.Computer.FileSystem.WriteAllText(file1, file1Text, False)
FileCompress(file1, file2)
FileDecompress(file2, file3)

' ----- Display the results.
file2Text = My.Computer.FileSystem.ReadAllText(file2)
file3Text = My.Computer.FileSystem.ReadAllText(file3)
result.Append("File1 length (original): ")
result.AppendLine(file1Text.Length)
result.Append("File2 length (compressed): ")
result.AppendLine(file2Text.Length)
result.Append("File3 length (decompressed): ")
result.AppendLine(file3Text.Length)
MsgBox(result.ToString())
Figure 16-6 displays the size in bytes of each of the three files after the functions are called.
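The demonstration reads File2 with ReadAllText(), which decodes the compressed bytes as text, so the length it reports for that file is a character count that may differ from the file's size on disk. As a rough sketch, the on-disk sizes and the round trip could be checked directly with the System.IO classes. The FilesMatch and FileSizeInBytes helpers and the VerifySketch module are hypothetical names introduced here for illustration:

```vb
Imports System.IO

Module VerifySketch
   ' Hypothetical helper: the size of a file on disk, in bytes.
   Public Function FileSizeInBytes(ByVal filePath As String) As Long
      Dim info As New FileInfo(filePath)
      Return info.Length
   End Function

   ' Hypothetical helper: True when both files hold the same bytes.
   ' It loads both files into memory, which is acceptable for the
   ' small sample files but not for very large ones.
   Public Function FilesMatch(ByVal firstFile As String, _
         ByVal secondFile As String) As Boolean
      Dim firstBytes() As Byte = File.ReadAllBytes(firstFile)
      Dim secondBytes() As Byte = File.ReadAllBytes(secondFile)

      If (firstBytes.Length <> secondBytes.Length) Then Return False
      For index As Integer = 0 To firstBytes.Length - 1
         If (firstBytes(index) <> secondBytes(index)) Then Return False
      Next index
      Return True
   End Function
End Module
```

With these in place, FilesMatch(file1, file3) should return True after the round trip, and FileSizeInBytes(file2) reports the compressed size without passing the binary data through a text decoder.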
Recipe 16.10 includes the full source code for the Compress module.
Figure 16-6. Compressing and decompressing a file