Suppose you e profiling a program such as the one in Recipe 17.12, and the profiler says that the top culprit is Array#each. That is, your program spends more time iterating over arrays than doing any one other thing:
% cumulative self self total time seconds seconds calls ms/call ms/call name 12.19 2.74 2.74 4930 0.56 0.77 Array#each
This points you in the right direction, but where do you go from here? Most programs are full of calls to Array#each. To optimize your program, you need to know which lines of code are responsible for most of the Array#each calls. Rubys profiler can give tell you which line of code called a problem method, but its easy to write a different profiler that can.
The heart of any Ruby profiler is a Proc object passed into the Kernel#set_trace_func method. This is a hook into the Ruby interpreter itself: if you set a trace function, its called every time the Ruby interpreter does something interesting like call a method.
Heres the start of a CallTracker class. It initializes a hash-based data structure that tracks "interesting" classes and methods. It assumes that we pass a method tally_calls into set_trace_func; well define tally_calls a little later.
class CallTracker # Initialize and start the trace. def initialize(show_stack_depth=1) @show_stack_depth = show_stack_depth @to_trace = Hash.new { |h,k| h[k] = {} } start at_exit { stop } end # Register a class/method combination as being interesting. Subsequent calls # to the method will be tallied by tally_calls. def register(klass, method_symbol) @to_trace[klass][method_symbol] = {} end # Tells the Ruby interpreter to call tally_calls whenever its about to # do anything interesting. def start set_trace_func method(:tally_calls).to_proc end # Stops the profiler, and prints a report of the interesting calls made # while it was running. def stop(out=$stderr) set_trace_func nil report(out) end
Now lets define the missing methods tally_calls and report. The Proc object passed into set_trace_func needs to take six arguments, but this analyzer only cares about three of them:
event
Lets us know what the interpreter is doing. We only care about "call" and "c-call" events, which let us know that the interpreter is calling a Ruby method or a C method.
klass
The Class object that defines the method being called.
symbol
The name of the method as a Symbol.
The tally_calls method looks up the class and name of the method being called to see if its one of the methods being tracked. If so, it grabs the current call stack with Kernel#caller, and notes where in the execution path the method was called:
# If the interpreter is about to call a method we find interesting, # increment the count for that method. def tally_calls(event, file, line, symbol, binding, klass) if @to_trace[klass] and @to_trace[klass][symbol] and (event == call or event ==c-call) stack = caller stack = stack[1..(@show_stack_depth ? @show_stack_depth : stack.size)] @to_trace[klass][symbol][stack] ||= 0 @to_trace[klass][symbol][stack] += 1 end end
All thats left is the method that prints the report. It sorts the results by execution path (as indicated by the stack traces), so the more often a method is called from a certain line of code, the higher in the report that line of code will show up:
# Prints a report of the lines of code that called interesting # methods, sorted so that the the most active lines of code show up # first. def report(out=$stderr) first = true @to_trace.each do |klass, symbols| symbols.each do |symbol, calls| total = calls.inject(0) { |sum, ct| sum + ct[1] } padding = total.to_s.size separator = (klass.is_a? Class) ? # : . plural = (total == 1) ? \ : s stack_join = " " + ( * (padding+2)) first ? first = false : out.puts out.puts "#{total} call#{plural} to #{klass}#{separator}#{symbol}" (calls.sort_by { |caller, times| -times }).each do |caller, times| out.puts " %#{padding}.d #{caller.join(stack_join)}" % times end end end end end
Heres the analyzer in action. It analyses my use of the Rubyful Soup HTML parser (which I was working on optimizing) to see which lines of code are responsible for calling Array#each. It shows three main places to look for optimizations:
require ubygems require ubyful_soup tracker = CallTracker.new tracker.register(Array, :each) BeautifulSoup.new(open( est.html) { |f| f.read }) tracker.stop($stdout) # 4930 calls to Array#each # 1671 ./rubyful_soup.rb:715:in pop_to_tag # 1631 ./rubyful_soup.rb:567:in unknown_starttag # 1627 ./rubyful_soup.rb:751:in smart_pop # 1 ./rubyful_soup.rb:510:in feed
By default, the CallTracker shows only the single line of code that called the "interesting" method. You can get more of the call stack by passing a larger show_stack_depth into the CallTracker initializer.
Strings
Numbers
Date and Time
Arrays
Hashes
Files and Directories
Code Blocks and Iteration
Objects and Classes8
Modules and Namespaces
Reflection and Metaprogramming
XML and HTML
Graphics and Other File Formats
Databases and Persistence
Internet Services
Web Development Ruby on Rails
Web Services and Distributed Programming
Testing, Debugging, Optimizing, and Documenting
Packaging and Distributing Software
Automating Tasks with Rake
Multitasking and Multithreading
User Interface
Extending Ruby with Other Languages
System Administration