Flylib.com

Books Software

 
 
 

Recipe 17.15. Who s Calling That Method? A Call Graph Analyzer


Recipe 17.15. Who's Calling That Method? A Call Graph Analyzer

Suppose you're profiling a program such as the one in Recipe 17.12, and the profiler says that the top culprit is Array#each . That is, your program spends more time iterating over arrays than doing any one other thing:

%    cumulative    self               self      total
	time    seconds    seconds   calls    ms/call   ms/call  name
	12.19      2.74       2.74    4930       0.56      0.77  Array#each

This points you in the right direction, but where do you go from here? Most programs are full of calls to Array#each . To optimize your program, you need to know which lines of code are responsible for most of the Array#each calls. Ruby's profiler can't give tell you which line of code called a problem method, but it's easy to write a different profiler that can.

The heart of any Ruby profiler is a Proc object passed into the Kernel#set_trace_func method. This is a hook into the Ruby interpreter itself: if you set a trace function, it's called every time the Ruby interpreter does something interesting like call a method.

Here's the start of a CallTracker class. It initializes a hash-based data structure that tracks "interesting" classes and methods . It assumes that we pass a method tally_calls into set_trace_func; we'll define tally_calls a little later.

class  
CallTracker

	  # Initialize and start the trace.
	  def initialize(show_stack_depth=1)
	    @show_stack_depth = show_stack_depth
	    @to_trace = Hash.new { h,k h[k] = {} }
	    start
	    at_exit { stop }
	end

	# Register a class/method combination as being interesting. Subsequent calls
	# to the method will be tallied by tally_calls.
	def register(klass, method_symbol)
	  @to_trace[klass][method_symbol] = {}
	end

	# Tells the Ruby interpreter to call tally_calls whenever it's about to
	# do anything interesting.
	def start
	  set_trace_func method(:tally_calls).to_proc
	end

	# Stops the profiler, and prints a report of the interesting calls made
	# while it was running.
	def stop(out=$stderr)
	  set_trace_func nil
	  report(out)
	end

Now let's define the missing methods tally_calls and report. The Proc object passed into set_trace_func needs to take six arguments, but this analyzer only cares about three of them:



event

Lets us know what the interpreter is doing. We only care about "call" and "c-call" events, which let us know that the interpreter is calling a Ruby method or a C method.



klass

The Class object that defines the method being called.



symbol

The name of the method as a Symbol .

The tally_calls method looks up the class and name of the method being called to see if it's one of the methods being tracked. If so, it grabs the current call stack with Kernel#caller , and notes where in the execution path the method was called:

# If the interpreter is about to call a method we find interesting,
	# increment the count for that method.
	def tally_calls(event, file, line, symbol, binding, klass)
	  if @to_trace[klass] and @to_trace[klass][symbol] and
	  (event == 'call' or event =='c-call')
	    stack = caller
	    stack = stack[1..(@show_stack_depth ? @show_stack_depth : stack.size)]
	    @to_trace[klass][symbol][stack] = 0
	    @to_trace[klass][symbol][stack] += 1
	  end
	end

All that's left is the method that prints the report. It sorts the results by execution path (as indicated by the stack traces), so the more often a method is called from a certain line of code, the higher in the report that line of code will show up:

# Prints a report of the lines of code that called interesting
	  # methods, sorted so that the the most active lines of code show up
	  # first.
	  def report(out=$stderr)
	    first = true
	    @to_trace.each do klass, symbols
	      symbols.each do symbol, calls
	        total = calls.inject(0) { sum, ct sum + ct[1] }
	        padding = total.to_s.size
	        separator = (klass.is_a? Class) ? '#' : '.'
	        plural = (total == 1) ? '' : 's'
	        stack_join = "\n" + (' ' * (padding+2))
	        first ? first = false : out.puts
	        out.puts "#{total} call#{plural} to #{klass}#{separator}#{symbol}"
	        (calls.sort_by { caller, times -times }).each do caller, times
	          out.puts " %#{padding}.d #{caller.join(stack_join)}" % times
	        end
	      end
	    end
	  end
	end

Here's the analyzer in action. It analyses my use of the Rubyful Soup HTML parser (which I was working on optimizing) to see which lines of code are responsible for calling Array#each . It shows three main places to look for optimizations:

require 'rubygems'
	require 'rubyful_soup'
	tracker = CallTracker.new
	tracker.register(Array, :each)

	BeautifulSoup.new(open('test.html') { f f.read })
	tracker.stop($stdout)
	# 4930 calls to Array#each
	# 1671 ./rubyful_soup.rb:715:in 'pop_to_tag'
	# 1631 ./rubyful_soup.rb:567:in 'unknown_starttag'
	# 1627 ./rubyful_soup.rb:751:in 'smart_pop'
	#    1 ./rubyful_soup.rb:510:in 'feed'

By default, the CallTracker shows only the single line of code that called the "interesting" method. You can get more of the call stack by passing a larger show_stack_depth into the CallTracker initializer.

See Also

  • Recipe 17.6, "Creating and Understanding Tracebacks"

  • Recipe 17.12, "Profiling Your Application"