Performing Random Access on Read-Once Input Streams

Problem

You have an IO object, probably a socket, that doesn't support random-access methods like seek, pos=, and rewind. You want to treat this object like a file on disk, where you can jump around and reread parts of the file.

Solution

The simplest solution is to read the entire contents of the socket (or as much as you're going to need) and put it into a StringIO object. You can then treat the StringIO object exactly like a file:

	require 'socket'
	require 'stringio'
	
	sock = TCPSocket.open("www.example.com", 80)
	sock.write("GET /\n")
	
	file = StringIO.new(sock.read)
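	file.read(10)                    # => (the first 10 bytes of the response)
	file.rewind
	file.read(10)                    # => (the same 10 bytes again)
	file.pos = 90
	file.read(15)                    # => " this web page "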

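The same random-access calls can be tried without a network connection. This is a minimal sketch in which a fixed string stands in for the data returned by sock.read; the sample string and the variable names are made up for illustration.

```ruby
require 'stringio'

# A fixed string stands in for the data returned by sock.read.
file = StringIO.new("0123456789abcdefghij")

first = file.read(10)          # "0123456789"
file.rewind                    # back to the start
again = file.read(10)          # the same ten bytes again

file.pos = 15                  # jump forward
tail = file.read(5)            # "fghij"

file.seek(-3, IO::SEEK_END)    # three bytes before the end
last = file.read               # "hij"

puts first == again            # prints "true"
puts tail
puts last
```
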
Discussion

A socket is supposed to work just like a file, but sometimes the illusion breaks down. Since the data is coming from another computer over which you have no control, you can't just go back and reread data you've already read. That data has already been sent over the pipe, and the server doesn't care if you lost it or need to process it again.
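The breakdown is easy to see with any read-once stream. The sketch below uses a pipe as a stand-in for the socket (an assumption made so the example runs without a network); on most Unix-like systems the attempt to seek raises Errno::ESPIPE, which is rescued here through its parent class SystemCallError.

```ruby
# A pipe stands in for the socket here: both are read-once streams.
reader, writer = IO.pipe
writer.write("hello world")
writer.close

first = reader.read(5)          # "hello"

begin
  reader.seek(0)                # try to go back to the start
  seek_failed = false
rescue SystemCallError => e     # typically Errno::ESPIPE ("Illegal seek")
  seek_failed = true
  error_class = e.class
end

reader.close
puts "seek failed: #{seek_failed}"
```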

If you have enough memory to read the entire contents of a socket, it's easy to put the results into a form that more closely simulates a file on disk. But you might not want to read the entire socket, or the socket may be one that keeps sending data until you close it. In that case you'll need to buffer the data as you read it. Instead of using memory for the entire contents of the socket (which may be infinite), you'll only use memory for the data you've actually read.
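When the source never ends, one option short of full buffering is to capture only a bounded prefix. The following is a sketch under stated assumptions: EndlessSource is a hypothetical stand-in for a socket that keeps sending data, and the 32-byte limit is arbitrary.

```ruby
require 'stringio'

# Hypothetical stand-in for a socket that never stops sending data.
class EndlessSource
  def initialize
    @count = 0
  end

  # Like IO#read with a length: there are always more bytes to give.
  def read(length)
    chunk = Array.new(length) { |i| ((@count + i) % 10).to_s }.join
    @count += length
    chunk
  end
end

source = EndlessSource.new

# Capture a bounded prefix. On a real endless socket, read with no
# length would never return.
file = StringIO.new(source.read(32))

file.pos = 20
tail = file.read(5)     # "01234"
file.rewind
head = file.read(5)     # "01234"
```

The drawback is that nothing past the prefix is reachable, which is the limitation that buffering on demand removes.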

This code defines a BufferedIO class that adds data to an internal StringIO as it's read from its source:

	class BufferedIO
	  def initialize(io)
	    @buff = StringIO.new
	    @source = io
	  end

	  # Read x bytes (or everything, if x is nil), pulling from the source
	  # only the bytes that aren't already in the buffer.
	  def read(x=nil)
	    to_read = x ? x + @buff.pos - @buff.size : nil
	    _append(@source.read(to_read)) if !to_read or to_read > 0
	    @buff.read(x)
	  end

	  def pos=(x)
	    read(x - @buff.pos) if x > @buff.size
	    @buff.pos = x
	  end

	  def seek(x, whence=IO::SEEK_SET)
	    case whence
	    when IO::SEEK_SET then self.pos = x
	    when IO::SEEK_CUR then self.pos = @buff.pos + x
	    # Note: SEEK_END reads all the socket data. As with IO#seek, x is
	    # added to the end position, so it is normally zero or negative.
	    when IO::SEEK_END then read; self.pos = @buff.size + x
	    end
	    pos
	  end

	  # Some methods can simply be delegated to the buffer.
	  ["pos", "rewind", "tell"].each do |m|
	    module_eval "def #{m}; @buff.#{m}; end"
	  end

	  private

	  # Add data to the end of the buffer without moving the read position.
	  def _append(s)
	    return if s.nil? or s.empty?
	    oldpos = @buff.pos
	    @buff.seek(0, IO::SEEK_END)
	    @buff << s
	    @buff.pos = oldpos
	  end
	end
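One detail matters whenever a StringIO serves as a growing buffer: StringIO#<< writes at the current position, not necessarily at the end of the string. The sketch below shows that behavior and one way to append at the end while preserving the read position; the variable names are made up for illustration.

```ruby
require 'stringio'

buff = StringIO.new
buff << "hello world"           # pos is now 11, at the end
buff.rewind                     # pos is 0

# Writing here overwrites from position 0 instead of appending.
buff << "HELLO"
overwritten = buff.string.dup   # "HELLO world"

# Appending safely: remember the position, write at the end, restore it.
buff.pos = 2
oldpos = buff.pos
buff.seek(0, IO::SEEK_END)
buff << "!"
buff.pos = oldpos

appended = buff.string          # "HELLO world!"
position = buff.pos             # 2
```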

Now you can seek, rewind, and generally move around in an input socket as if it were a disk file. You only have to read as much data as you need:

	sock = TCPSocket.open("www.example.com", 80)
	sock.write("GET /\n")
	file = BufferedIO.new(sock)

	file.read(10)                    # => (the first 10 bytes of the response)
	file.rewind                      # => 0
	file.read(10)                    # => (the same 10 bytes again)
	file.pos = 90                    # => 90
	file.read(15)                    # => " this web page "
	file.seek(-10, IO::SEEK_CUR)     # => 95
	file.read(10)                    # => " web page "
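The claim that only the needed data is read can be checked offline. The sketch below is not the recipe's class: TinyBuffer is a hypothetical, stripped-down buffer that supports only read(n) and rewind, and a pipe plays the role of the socket.

```ruby
require 'stringio'

# Hypothetical, stripped-down buffer for illustration only.
class TinyBuffer
  def initialize(io)
    @source = io
    @buff = StringIO.new
  end

  def read(n)
    missing = n - (@buff.size - @buff.pos)
    if missing > 0
      data = @source.read(missing)
      if data
        oldpos = @buff.pos
        @buff.seek(0, IO::SEEK_END)   # append at the end of the buffer
        @buff << data
        @buff.pos = oldpos            # keep the read position
      end
    end
    @buff.read(n)
  end

  def rewind
    @buff.rewind
  end
end

# A pipe plays the role of the socket.
reader, writer = IO.pipe
writer.write("0123456789" * 3)
writer.close

file = TinyBuffer.new(reader)
first = file.read(10)
file.rewind
again = file.read(10)       # served from the buffer, not from the pipe
unread = reader.read        # the 20 bytes never pulled through the buffer
reader.close
```

The second read returns the same ten bytes without touching the source, and the remaining twenty bytes are still waiting in the pipe.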

BufferedIO doesn't implement all the methods of IO, only the ones not implemented by socket-type IO objects. If you need the other methods, you should be able to implement the ones you need using the existing methods as guidelines. For instance, you could implement readline like this:

	class BufferedIO
	  def readline
	    oldpos = @buff.pos
	    line = @buff.readline unless @buff.eof?
	    if !line or line[-1] != ?\n
	      _append(@source.readline) # Finish the line
	      @buff.pos = oldpos        # Go back to where we were
	      line = @buff.readline     # Read the line again
	    end
	    line
	  end
	end

	file.readline # => "by typing \"example.com\",\n"
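The two checks in readline (eof? and the trailing newline) correspond to two StringIO behaviors that are easy to confirm in isolation. This is a minimal sketch with a made-up buffer holding one complete line and the start of another.

```ruby
require 'stringio'

# The buffer holds one complete line and the start of another.
buff = StringIO.new("first line\nsecond li")

complete = buff.readline                   # "first line\n"
partial  = buff.readline                   # "second li", an unfinished line

complete_ok = complete.end_with?("\n")     # true
partial_ok  = partial.end_with?("\n")      # false

# At the end of the buffer, readline raises instead of returning nil,
# which is why it helps to check eof? first.
at_end = buff.eof?                         # true
begin
  buff.readline
  raised = false
rescue EOFError
  raised = true
end
```
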

See Also

  • Recipe 6.17, "Processing a Binary File," for more information on IO#seek





Ruby Cookbook
Ruby Cookbook (Cookbooks (OReilly))
ISBN: 0596523696