When REXML parses a document, it respects the original whitespace of the document's text nodes. You want to make the document smaller by compressing extra whitespace.
Parse the document by creating a REXML::Document out of it. Within the Document constructor, tell the parser to compress all runs of whitespace characters:
require 'rexml/document'
text = %{<a>Some whitespace</a> Some more }
REXML::Document.new(text, { :compress_whitespace => :all }).to_s
# => "<a>Some whitespace</a> Some more "
Sometimes whitespace within a document is significant, but usually (as with HTML) it can be compressed without changing the meaning of the document. The resulting document takes up less space on the disk and requires less bandwidth to transmit.
Whitespace compression doesn't have to be all-or-nothing. REXML gives you two ways to configure it. Instead of passing :all as the value for :compress_whitespace, you can pass in a list of tag names. Whitespace will be compressed only in those tags:
REXML::Document.new(text, { :compress_whitespace => %w{a} }).to_s
# => "<a>Some whitespace</a> Some more "
You can also switch it around: pass in :respect_whitespace and a list of tag names whose whitespace you don't want to be compressed. This is useful if you know that whitespace is significant within certain parts of your document.
REXML::Document.new(text, { :respect_whitespace => %w{a} }).to_s
# => "<a>Some whitespace</a> Some more "
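To see the three configurations side by side, here is a small sketch. It assumes a made-up, well-formed sample document: the doc, a, and b tag names, the SAMPLE constant, and the compress helper are illustrative and not part of the recipe. The sample uses only runs of spaces so the effect is easy to spot.

```ruby
require 'rexml/document'

# Hypothetical sample document: runs of spaces inside <a>, inside <b>,
# and between the two elements.
SAMPLE = '<doc><a>Some    text</a>   <b>Some    more</b></doc>'

# Illustrative helper: parse with the given context options, then serialize.
def compress(xml, options)
  REXML::Document.new(xml, options).to_s
end

# Compress whitespace everywhere.
puts compress(SAMPLE, { :compress_whitespace => :all })
# <doc><a>Some text</a> <b>Some more</b></doc>

# Compress only inside <a>; everything else is left alone.
puts compress(SAMPLE, { :compress_whitespace => %w{a} })
# <doc><a>Some text</a>   <b>Some    more</b></doc>

# Respect whitespace only inside <a>; everything else is compressed.
puts compress(SAMPLE, { :respect_whitespace => %w{a} })
# <doc><a>Some    text</a> <b>Some more</b></doc>
```

The tag list names the elements whose own text is affected, so the spaces between a and b follow the rule that applies to doc, their parent.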
What about text nodes containing only whitespace? These are often inserted by XML pretty-printers, and they can usually be totally discarded without altering the meaning of a document. If you add :ignore_whitespace_nodes => :all to the parser configuration, REXML will simply decline to create text nodes that contain nothing but whitespace characters. Here's a comparison of :compress_whitespace alone, and in conjunction with :ignore_whitespace_nodes:
text = %{<a>Some text</a> Some more }
REXML::Document.new(text, { :compress_whitespace => :all }).to_s
# => " <a>Some text</a> Some more "
REXML::Document.new(text, { :compress_whitespace => :all, :ignore_whitespace_nodes => :all }).to_s
# => "<a>Some text</a>Some more "
By itself, :compress_whitespace shouldn't make a document less human-readable, but :ignore_whitespace_nodes almost certainly will.
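To get a rough feel for the trade-off, the following sketch reparses a small pretty-printed document and compares serialized sizes. The PRETTY constant, its tag names, and the reparse helper are assumptions made up for illustration; real savings depend entirely on how your documents are indented.

```ruby
require 'rexml/document'

# Hypothetical pretty-printed input: the newlines and indentation between
# tags become whitespace-only text nodes when the document is parsed.
PRETTY = "<doc>\n  <a>Some text</a>\n  <b>Some more</b>\n</doc>"

# Illustrative helper: parse PRETTY with the given options, then serialize.
def reparse(options)
  REXML::Document.new(PRETTY, options).to_s
end

compressed = reparse({ :compress_whitespace => :all })
stripped   = reparse({ :compress_whitespace => :all,
                       :ignore_whitespace_nodes => :all })

puts "original:   #{PRETTY.bytesize} bytes"
puts "compressed: #{compressed.bytesize} bytes"
puts "stripped:   #{stripped.bytesize} bytes"

# With whitespace-only nodes dropped, the tags run together on one line.
puts stripped
# <doc><a>Some text</a><b>Some more</b></doc>
```

The stripped version is the smallest, but it has lost the line breaks and indentation that made the original easy to read.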