Who s Calling That Method? A Call Graph Analyzer

Table of contents:

See Also

Suppose you e profiling a program such as the one in Recipe 17.12, and the profiler says that the top culprit is Array#each. That is, your program spends more time iterating over arrays than doing any one other thing:

	 % cumulative self self total
	time seconds seconds calls ms/call ms/call name
	12.19 2.74 2.74 4930 0.56 0.77 Array#each

This points you in the right direction, but where do you go from here? Most programs are full of calls to Array#each. To optimize your program, you need to know which lines of code are responsible for most of the Array#each calls. Rubys profiler can give tell you which line of code called a problem method, but its easy to write a different profiler that can.

The heart of any Ruby profiler is a Proc object passed into the Kernel#set_trace_func method. This is a hook into the Ruby interpreter itself: if you set a trace function, its called every time the Ruby interpreter does something interesting like call a method.

Heres the start of a CallTracker class. It initializes a hash-based data structure that tracks "interesting" classes and methods. It assumes that we pass a method tally_calls into set_trace_func; well define tally_calls a little later.

	class 
CallTracker

	 # Initialize and start the trace.
	 def initialize(show_stack_depth=1)
	 @show_stack_depth = show_stack_depth
	 @to_trace = Hash.new { |h,k| h[k] = {} }
	 start
	 at_exit { stop }
	end

	# Register a class/method combination as being interesting. Subsequent calls
	# to the method will be tallied by tally_calls.
	def register(klass, method_symbol)
	 @to_trace[klass][method_symbol] = {}
	end

	# Tells the Ruby interpreter to call tally_calls whenever its about to
	# do anything interesting.
	def start
	 set_trace_func method(:tally_calls).to_proc
	end

	# Stops the profiler, and prints a report of the interesting calls made
	# while it was running.
	def stop(out=$stderr)
	 set_trace_func nil
	 report(out)
	end

Now lets define the missing methods tally_calls and report. The Proc object passed into set_trace_func needs to take six arguments, but this analyzer only cares about three of them:

event: Lets us know what the interpreter is doing. We only care about "call" and "c-call" events, which let us know that the interpreter is calling a Ruby method or a C method.
klass: The Class object that defines the method being called.
symbol: The name of the method as a Symbol.

The tally_calls method looks up the class and name of the method being called to see if its one of the methods being tracked. If so, it grabs the current call stack with Kernel#caller, and notes where in the execution path the method was called:

	# If the interpreter is about to call a method we find interesting,
	# increment the count for that method.
	def tally_calls(event, file, line, symbol, binding, klass)
	 if @to_trace[klass] and @to_trace[klass][symbol] and
	 (event == call or event ==c-call)
	 stack = caller
	 stack = stack[1..(@show_stack_depth ? @show_stack_depth : stack.size)]
	 @to_trace[klass][symbol][stack] ||= 0
	 @to_trace[klass][symbol][stack] += 1
	 end
	end

All thats left is the method that prints the report. It sorts the results by execution path (as indicated by the stack traces), so the more often a method is called from a certain line of code, the higher in the report that line of code will show up:

	 # Prints a report of the lines of code that called interesting
	 # methods, sorted so that the the most active lines of code show up
	 # first.
	 def report(out=$stderr)
	 first = true
	 @to_trace.each do |klass, symbols|
	 symbols.each do |symbol, calls|
	 total = calls.inject(0) { |sum, ct| sum + ct[1] }
	 padding = total.to_s.size
	 separator = (klass.is_a? Class) ? # : .
	 plural = (total == 1) ? \ : s
	 stack_join = "
" + (  * (padding+2))
	 first ? first = false : out.puts
	 out.puts "#{total} call#{plural} to #{klass}#{separator}#{symbol}"
	 (calls.sort_by { |caller, times| -times }).each do |caller, times|
	 out.puts " %#{padding}.d #{caller.join(stack_join)}" % times
	 end
	 end
	 end
	 end
	end

Heres the analyzer in action. It analyses my use of the Rubyful Soup HTML parser (which I was working on optimizing) to see which lines of code are responsible for calling Array#each. It shows three main places to look for optimizations:

	require 
ubygems
	require 
ubyful_soup
	tracker = CallTracker.new
	tracker.register(Array, :each)

	BeautifulSoup.new(open(	est.html) { |f| f.read })
	tracker.stop($stdout)
	# 4930 calls to Array#each
	# 1671 ./rubyful_soup.rb:715:in pop_to_tag
	# 1631 ./rubyful_soup.rb:567:in unknown_starttag
	# 1627 ./rubyful_soup.rb:751:in smart_pop
	# 1 ./rubyful_soup.rb:510:in feed

By default, the CallTracker shows only the single line of code that called the "interesting" method. You can get more of the call stack by passing a larger show_stack_depth into the CallTracker initializer.

See Also