The zlib Module

(Optional) The zlib module provides support for "zlib" compression. (This compression method is also known as "deflate.")

Example 2-43 shows the compress and decompress functions, both of which take a string argument and return a string.

Example 2-43. Using the zlib Module to Compress a String

File: zlib-example-1.py

import zlib

MESSAGE = "life of brian"

compressed_message = zlib.compress(MESSAGE)
decompressed_message = zlib.decompress(compressed_message)

print "original:", repr(MESSAGE)
print "compressed message:", repr(compressed_message)
print "decompressed message:", repr(decompressed_message)

original: 'life of brian'
compressed message: 'x\234\313\311LKU\310OSH*\312L\314\003\000!\010\004\302'
decompressed message: 'life of brian'
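For readers on Python 3, the same round trip might look like the sketch below. It assumes Python 3 semantics, where zlib.compress and zlib.decompress take and return bytes rather than text strings, so the message is encoded first. The variable names simply mirror the example above.

```python
import zlib

MESSAGE = "life of brian"

# Python 3 (assumed here): zlib works on bytes, so encode the text first
raw = MESSAGE.encode("ascii")

compressed_message = zlib.compress(raw)
decompressed_message = zlib.decompress(compressed_message)

print("original:", repr(raw))
print("compressed size:", len(compressed_message))
print("decompressed message:", repr(decompressed_message))
```

Decoding the result with decompressed_message.decode("ascii") gives back the original text string.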

The compression ratio varies widely with the contents of the file, as you can see in Example 2-44. Files that are already compressed may even grow slightly.

Example 2-44. Using the zlib Module to Compress a Group of Files

File: zlib-example-2.py

import zlib
import glob

for file in glob.glob("samples/*"):

    indata = open(file, "rb").read()
    outdata = zlib.compress(indata, zlib.Z_BEST_COMPRESSION)

    print file, len(indata), "=>", len(outdata),
    print "%d%%" % (len(outdata) * 100 / len(indata))

samples\sample.au 1676 => 1109 66%
samples\sample.gz 42 => 51 121%
samples\sample.htm 186 => 135 72%
samples\sample.ini 246 => 190 77%
samples\sample.jpg 4762 => 4632 97%
samples\sample.msg 450 => 275 61%
samples\sample.sgm 430 => 321 74%
samples\sample.tar 10240 => 125 1%
samples\sample.tgz 155 => 159 102%
samples\sample.txt 302 => 220 72%
samples\sample.wav 13260 => 10992 82%
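The spread in these figures can be reproduced without any sample files. The sketch below, written for Python 3, compares highly repetitive input with pseudo-random bytes. The ratio helper and both test inputs are made up for illustration and are not part of the original example.

```python
import random
import zlib

def ratio(data):
    """Return the compressed size as a percentage of the original size."""
    packed = zlib.compress(data, zlib.Z_BEST_COMPRESSION)
    return len(packed) * 100 // len(data)

# highly repetitive input: shrinks to a small fraction of its size
repetitive = b"spam and eggs " * 500

# pseudo-random bytes (fixed seed, so runs are repeatable): nothing to squeeze
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(repetitive)))

for label, data in [("repetitive", repetitive), ("noise", noise)]:
    print(label, len(data), "=>", "%d%%" % ratio(data))
```

Random-looking input behaves much like the already compressed .gz, .tgz, and .jpg samples above: there is little or no redundancy left to remove, so the zlib framing can make the result larger than the input.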

You can also compress or decompress data on the fly, which Example 2-45 demonstrates.

Example 2-45. Using the zlib Module to Compress Streams

File: zlib-example-3.py

import zlib

encoder = zlib.compressobj()

data = encoder.compress("life")
data = data + encoder.compress(" of ")
data = data + encoder.compress("brian")
data = data + encoder.flush()

print repr(data)
print repr(zlib.decompress(data))

'x\234\313\311LKU\310OSH*\312L\314\003\000!\010\004\302'
'life of brian'
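Example 2-45 feeds the compressor in pieces and then decompresses the result in one call. The receiving side can work incrementally too, using zlib.decompressobj. The following Python 3 sketch assumes the compressed data arrives in small chunks; the chunk size and the test message are arbitrary choices for illustration.

```python
import zlib

original = b"life of brian " * 200
compressed = zlib.compress(original)

decoder = zlib.decompressobj()
pieces = []

# feed the compressed stream in small chunks, as if read from a socket
CHUNK_SIZE = 16  # arbitrary; kept small to force many calls
for start in range(0, len(compressed), CHUNK_SIZE):
    pieces.append(decoder.decompress(compressed[start:start + CHUNK_SIZE]))

# flush() returns whatever is still buffered inside the decoder
pieces.append(decoder.flush())

restored = b"".join(pieces)
print(len(compressed), "=>", len(restored))
```

Each call to decompress returns only the output that can be produced from the input seen so far, so memory use stays bounded by the chunk size rather than by the size of the whole stream.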

Example 2-46 makes it more convenient to read a compressed file by wrapping a decoder object in a file-like class.

Example 2-46. Emulating a File Object for Compressed Streams

File: zlib-example-4.py

import zlib
import string, StringIO

class ZipInputStream:

    def __init__(self, file):
        self.file = file
        self.__rewind()

    def __rewind(self):
        self.zip = zlib.decompressobj()
        self.pos = 0 # position in zipped stream
        self.offset = 0 # position in unzipped stream
        self.data = ""

    def __fill(self, bytes):
        if self.zip:
            # read until we have enough bytes in the buffer
            while not bytes or len(self.data) < bytes:
                self.file.seek(self.pos)
                data = self.file.read(16384)
                if not data:
                    self.data = self.data + self.zip.flush()
                    self.zip = None # no more data
                    break
                self.pos = self.pos + len(data)
                self.data = self.data + self.zip.decompress(data)

    def seek(self, offset, whence=0):
        if whence == 0:
            position = offset
        elif whence == 1:
            position = self.offset + offset
        else:
            raise IOError, "Illegal argument"
        if position < self.offset:
            raise IOError, "Cannot seek backwards"

        # skip forward, in 16k blocks
        while position > self.offset:
            if not self.read(min(position - self.offset, 16384)):
                break

    def tell(self):
        return self.offset

    def read(self, bytes = 0):
        self.__fill(bytes)
        if bytes:
            data = self.data[:bytes]
            self.data = self.data[bytes:]
        else:
            data = self.data
            self.data = ""
        self.offset = self.offset + len(data)
        return data

    def readline(self):
        # make sure we have an entire line
        while self.zip and "\n" not in self.data:
            self.__fill(len(self.data) + 512)
        i = string.find(self.data, "\n") + 1
        if i <= 0:
            return self.read()
        return self.read(i)

    def readlines(self):
        lines = []
        while 1:
            s = self.readline()
            if not s:
                break
            lines.append(s)
        return lines

#
# try it out

data = open("samples/sample.txt").read()
data = zlib.compress(data)

file = ZipInputStream(StringIO.StringIO(data))
for line in file.readlines():
    print line[:-1]

We will perhaps eventually be writing only small
modules which are identified by name as they are
used to build larger ones, so that devices like
indentation, rather than delimiters, might become
feasible for expressing local structure in the
source language.
 -- Donald E. Knuth, December 1974
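When only line-by-line reading is needed, a generator can stand in for the full file-like class. The sketch below is a Python 3 variation on the same idea, not a port of Example 2-46: the iter_lines name, its chunk_size parameter, and the test text are all assumptions made for illustration, and lines are yielded as bytes.

```python
import io
import zlib

def iter_lines(fileobj, chunk_size=16384):
    """Yield decompressed lines (as bytes) from a zlib-compressed file object."""
    decoder = zlib.decompressobj()
    pending = b""
    while True:
        chunk = fileobj.read(chunk_size)
        if chunk:
            pending += decoder.decompress(chunk)
        else:
            pending += decoder.flush()
        # hand out every complete line collected so far
        *lines, pending = pending.split(b"\n")
        for line in lines:
            yield line + b"\n"
        if not chunk:
            break
    if pending:
        yield pending  # final line without a trailing newline

text = b"first line\nsecond line\nlast line without newline"
stream = io.BytesIO(zlib.compress(text))

for line in iter_lines(stream, chunk_size=8):
    print(line.rstrip(b"\n").decode("ascii"))
```

Unlike the class above, this version offers no seek or tell. It trades those for a much smaller amount of state: one decoder object and one buffer holding the current partial line.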

Python Standard Library
ISBN: 0596000960
Year: 2000
Pages: 252
Authors: Fredrik Lundh
