Matching Strings with Regular Expressions

Problem

You want to know whether or not a string matches a certain pattern.

Solution

You can usually describe the pattern as a regular expression. The =~ operator tests a string against a regular expression:

	string = 'This is a 30-character string.'

	if string =~ /([0-9]+)-character/ && $1.to_i == string.length
	 "Yes, there are #{$1} characters in that string."
	end
	# => "Yes, there are 30 characters in that string."
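The condition above works because =~ returns the index where the match starts, or nil when nothing matches, and a successful match fills in $~ and $1. Here is a minimal sketch of that behavior; the sample strings are made up for illustration and do not come from the recipe:

```ruby
# Illustrative strings only.
position = ("Recipe 42 is short." =~ /([0-9]+)/)
captured = $1                 # text of the first capture group
before   = $~.pre_match       # $~ holds the MatchData of the last match

position                      # => 7
captured                      # => "42"
before                        # => "Recipe "

no_match   = ("No digits here." =~ /([0-9]+)/)
last_match = $~               # a failed match resets the match globals

no_match                      # => nil
last_match                    # => nil
```

Because nil is falsy and every index (including 0) is truthy, the result of =~ can go straight into an if.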

You can also use Regexp#match:

	match = Regexp.compile('([0-9]+)-character').match(string)
	if match && match[1].to_i == string.length
	 "Yes, there are #{match[1]} characters in that string."
	end
	# => "Yes, there are 30 characters in that string."
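If all you need is a yes-or-no answer, a predicate may be enough. The sketch below assumes Ruby 2.4 or later, where Regexp#match? and String#match? are available; they return true or false without building a MatchData object. When you need the captured text, Regexp#match is still the tool:

```ruby
# Assumes Ruby 2.4 or later for match?.
pattern = /([0-9]+)-character/
string  = 'This is a 30-character string.'

has_count = pattern.match?(string)          # => true
has_words = string.match?(/[0-9]+-word/)    # => false

# MatchData gives access to the whole match and to each group.
match       = pattern.match(string)
whole       = match[0]                      # => "30-character"
digits      = match[1]                      # => "30"
digit_start = match.begin(1)                # => 10
```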

You can check a string against a series of regular expressions with a case statement:

	string = "123"

	case string
	when /^[a-zA-Z]+$/
	 "Letters"
	when /^[0-9]+$/
	 "Numbers"
	else
	 "Mixed"
	end
	# => "Numbers"
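The case statement works because each when clause calls Regexp#=== on the string. One thing to watch: in Ruby, ^ and $ anchor at line boundaries, not at the ends of the whole string, so multi-line input can satisfy the patterns above. The following sketch uses \A and \z instead; the classify method is a hypothetical helper introduced only for this illustration:

```ruby
# case/when tests each pattern with Regexp#===.
digits_by_line   = (/^[0-9]+$/ === "123\nabc")      # => true: line 1 matches
digits_by_string = (/\A[0-9]+\z/ === "123\nabc")    # => false

# Hypothetical helper: same logic as the recipe, anchored to the whole string.
def classify(string)
  case string
  when /\A[a-zA-Z]+\z/ then "Letters"
  when /\A[0-9]+\z/    then "Numbers"
  else "Mixed"
  end
end

classify("123")        # => "Numbers"
classify("abc")        # => "Letters"
classify("123\nabc")   # => "Mixed"
```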

 

Discussion

Regular expressions are a cryptic but powerful minilanguage for string matching and substring extraction. They've been around for a long time in Unix utilities like sed, but Perl was the first general-purpose programming language to include them. Now almost all modern languages support Perl-style regular expressions.

Ruby provides several ways of initializing regular expressions. The following forms all create equivalent Regexp objects:

	/something/
	Regexp.new("something")
	Regexp.compile("something")
	%r{something}
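As a quick check of that equivalence, the sketch below compares the pattern source and option bits of each form. The variable names are illustrative. It also shows one practical reason to reach for %r{}: patterns that contain slashes need no escaping.

```ruby
forms = [
  /something/,
  Regexp.new("something"),
  Regexp.compile("something"),
  %r{something}
]

# Every form carries the same pattern text and no option bits.
same_source  = forms.all? { |re| re.source == "something" }   # => true
same_options = forms.all? { |re| re.options == 0 }            # => true

# %r{...} avoids backslash-escaping slashes inside the pattern.
slash_literal   = (/\/usr\/local/ =~ "/usr/local/bin")    # => 0
percent_literal = (%r{/usr/local} =~ "/usr/local/bin")    # => 0
```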

The following modifiers are also of note.

Table 1-1.

Constant             Flag   Effect
Regexp::IGNORECASE   i      Makes matches case-insensitive.
Regexp::MULTILINE    m      Normally, the . metacharacter does not match a line break. This modifier makes . match line breaks like any other character.
Regexp::EXTENDED     x      Lets you space out a regular expression with whitespace and comments, making it more legible.

Here's how to use these modifiers to create regular expressions:

	/something/mxi
	Regexp.new('something',
	 Regexp::EXTENDED + Regexp::IGNORECASE + Regexp::MULTILINE)
	%r{something}mxi
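The Regexp:: constants are bit flags, so they are usually combined with | (adding them gives the same number here because each flag is a distinct bit). A minimal sketch of inspecting the flags on a Regexp follows; the variable names are illustrative:

```ruby
mask = Regexp::EXTENDED | Regexp::IGNORECASE | Regexp::MULTILINE

from_new = Regexp.new('something', mask)
literal  = /something/mxi

# Both objects have all three option bits set.
new_has_all     = (from_new.options & mask) == mask   # => true
literal_has_all = (literal.options & mask) == mask    # => true

# Individual flags can be tested the same way.
ignores_case = literal.casefold?                             # => true
plain_is_m   = (/something/.options & Regexp::MULTILINE) != 0   # => false
```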

Here's how the modifiers work:

	case_insensitive = /mangy/i
	case_insensitive =~ "I'm mangy!" # => 4
	case_insensitive =~ "Mangy Jones, at your service." # => 0

	multiline = /a.b/m
	multiline =~ "banana\nbanana" # => 5
	/a.b/ =~ "banana\nbanana" # => nil
	# But note:
	/a\nb/ =~ "banana\nbanana" # => 5

	extended = %r{ \ was # Match " was"
	 \s # Match one whitespace character
	 a # Match "a" }xi
	extended =~ "What was Alfred doing here?" # => 4
	extended =~ "My, that was a yummy mango." # => 8
	extended =~ "It was\n\n\na fool's errand" # => nil
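Because the x modifier makes the regexp engine ignore unescaped whitespace in the pattern, a literal space has to be escaped (or written as \s) if you want to match one. A short sketch with made-up strings:

```ruby
# Under /x, the space in the pattern is ignored, so this matches "ab".
ignored_hit  = (/a b/x =~ "ab")      # => 0
ignored_miss = (/a b/x =~ "a b")     # => nil

# Escape the space, or use \s, to match actual whitespace.
escaped_hit    = (/a\ b/x =~ "a b")    # => 0
whitespace_hit = (/a\sb/x =~ "a\tb")   # => 0
```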

 

See Also

  • Mastering Regular Expressions by Jeffrey Friedl (O'Reilly) gives a concise introduction to regular expressions, with many real-world examples
  • RegExLib.com provides a searchable database of regular expressions (http://regexlib.com/default.aspx)
  • A Ruby-centric regular expression tutorial (http://www.regular-expressions.info/ruby.html)
  • ri Regexp
  • Recipe 1.19, "Validating an Email Address"


Ruby Cookbook
Ruby Cookbook (Cookbooks (OReilly))
ISBN: 0596523696