My Profile Photo

My personal blog. Software, devops, security, music, and general nerditry.

Debugging memory leaks in Ruby with GDB, round 2.

In part 1, I described how I located leaky Sets in MongoMapper by diffing the Ruby ObjectSpace with GDB. Today, I’m going to show you how to solve the problems that those sorts of diffs can reveal. In today’s example, we’re tracking leaky sets. In particular, a set is holding onto class references. We are going to:

  1. Set up object creation tracking on the Set class
  2. Find the leaky instance of our set using GDB
  3. Locate what code created that Set instance

First things first. Check out Sean Bradly’s GCTrack module. I’m using a modified version of it to include object IDs in the tracker struct, so that we can perform reverse-lookups on an instance.

# Sean Bradly (rhythmx at gmail) - 2010
# Modified by Chris Heald ( - August 2010
module GCTrack
  # Class to track an object's lifecycle and where it was created
  class Tracepoint <, :call, :insp, :id)
    include Comparable
    # make it so the obj.num works as a simple key
    def <=>(obj)
      if obj.class <= self.class
        self.num <=> obj.num
        self.num <=> obj

  def self.included(klass)

    # Hook initialize to track this object
    klass.class_eval do
      alias_method :old_init, :initialize

      def initialize(*args)
        # Call original init
        # trace creation and setup finalizer callback
        final_proc = self.class.gct_setup_finalizer(self, caller, inspect)
        # define the finalizer
        ObjectSpace.define_finalizer(self, final_proc)

    # Add some metaclass foo to track all the objects.
    # You cant track in an instance w/o creating circular
    # references and breaking Ruby's GC

    class << klass;
    end).class_eval do
      def gct_setup_finalizer(obj, caller, insp)
        # create/log a tracepoint for this object
        num = gct_objnum
        tpoint =, caller, insp, obj.object_id)
        # call option init cb
        if self.respond_to?(:created)
          self.send(:created, tpoint)
        obj = nil # don't let the proc track a ref to 'obj'
        # this callback happens at GC time
        final_proc = do
          # call optional final cb
          if self.respond_to?(:deleted)
            self.send(:deleted, tpoint)

      def gct_objnum
        @num ||= -1
        @num  +=  1

      def gct_addlog
        @addlog ||= []

      def gct_dellog
        @dellog ||= []

      def gct_add(num)
        gct_addlog << num

      def gct_del(num)
        gct_dellog << num

      def gct_active
        gct_addlog - gct_dellog

      def gct_origin(id)
        x = {|g| == id }.first
        x.nil? ? nil :"\n")

      def gct_orphan_report
        # Get the leaked objects
        rpt     = ""
        leaked  = gct_active
        callers = { |o| }.sort.uniq
        # Iterate over source lines that have leaked objs
        callers.each do |calla|
          leaked_here = leaked.find_all { |o| == calla }
          rpt << "==== #{leaked_here.size} leaked objects from:\n\n"
          calla.each { |l| rpt << "    #{l}\n" }
          rpt << "\n"
          rpt << "    == object data:\n\n"

          leaked_here.each do |o|
            rpt << "    num: #{o.num}, inspect: #{o.insp}\n"
          rpt << "\n"


Additionally, you’ll want to patch the object you’re interested in tracking.

class Set
  include GCTrack

You’ll want to include both of those at the top of your config.rb, before your initializer block. If you’re tracking an object defined in your Rails app, just include GCTrack in the object definition, rather than in a monkeypatch.

Next, you need to be sure that you have your GDB macros set up properly. /root/.gdbinit should have the following:

define eval

define redirect_stdout
  call rb_eval_string("$_old_stdout, $stdout = $stdout,'/tmp/ruby-debug.' +, 'a'); $stdout.sync = true")

(re)start your application, hit the leaky action a few times, and find its PID using ps ax or top. Once you have that, attach to the process with gdb, redirect your process’s stdout to the tmp file with redirect_stdout and dump all Sets in your object space.

(gdb) attach 12345
(gdb) redirect_stdout
$8 = 2
(gdb) eval "GC.start"
(gdb) eval "ObjectSpace.each_object {|o| puts \"#{}, #{o.inspect} -- #{o.object_id}\" if o.is_a?(Set) }; puts '----'"

Now that we have that, let’s look at the tmp file, and locate the leaked set instance.

Set, #<Set: {Achievement, Achievement, EmbeddedComment, Achievement, EmbeddedComment, EmbeddedComment}> -- 78376680

There, at the end, is our object ID. Now back over to gdb.

(gdb) eval "puts {|g| == 78376680}\"\n\")"

Your tmp file should now have a stack trace for that object’s allocation now. Track it down and beat it into submission.

/opt/ruby-enterprise-1.8.7-2010.02/lib/ruby/gems/1.8/gems/mongo_mapper-0.8.3/lib/mongo_mapper/support/descendant_appends.rb:15:in `new'

You can do this for just about any object you’d like. Just mix in your GCTrack and easily find where in code a particular instance of an object was created. Just like that, debugging memory leaks goes from hunting guppies in the Atlantic to shooting fish in a barrel.

Have fun!

comments powered by Disqus