using papertrail to find weird problems

The problem

Recently I had a problem where thousands of customer line items in the HeavyInk.com e-commerce app were getting cancelled.

The code that cancelled line items was (a) a mess, and (b) spread all over the code base. The result was that there were several places where the bug could be hiding.

Some refactoring

The first thing I did was cleaned up the code, by replacing code in other modules like this:

  class LineItem < ActiveRecord::Base
    attr_protected       # <-- total access

      PENDING   = 1
      SHIPPED   = 2
      CANCELLED = 3
  end

  class Foo
   def Bar
     self.line_items.each { |li| li.update_attributes(:state => LineItem::CANCELLED) }
   end
 end

with

 class LineItem < ActiveRecord::Base

    attr_protected :state # <-- not as secure as it should be, but better

      PENDING   = 1
      SHIPPED   = 2
      CANCELLED = 3

    private_constant : PENDING
    private_constant :SHIPPED 
    private_constant :CANCELLED 

  def cancel
    update_attribute(:state, USER_CANCELLED)
  end

  end

  class Foo
   def Bar
     self.line_items.each(&:cancel)
   end
 end

this is much more hygenic in three different ways:

1) it stops knowledge of internal constants from leaking out
2) it stops direct manipulation of internal state
3) it gives us a single point of access on LineItem cancellations and allows us to do sanity checking there

Sadly, it was not enough, on its own, to fix the bug.

Enter paper_trail

While watching an unrelated railscast I saw a reference to paper_trail in passing.

Paper trail, for those not in the know, is a phenomenal gem that allows one to instrument an ActiveRecord class to automatically store checkpointed versions just before each edit.

Go read the docs!

Installation is simple. Add this to your Gemfile


gem 'paper_trail', '>= 3.0.0.beta1'

and run a migration.

…but I wanted a bit more. I needed to know what piece of code was disrupting my data.

For this I didn’t just want to see what the line items looked like before they got squashed; I wanted to see who was responsible for squashing them.

I added a migration


class AddMetadataToPapertrail < ActiveRecord::Migration
  def change
    add_column     :versions, :stack, :text, :null => true
    add_column     :versions, :server, :string, :null => true
    add_column     :versions, :uri,    :string, :null => true
  end
end

(why record the server? Because the same code base runs on multiple machines, all sharing one database).

and an initializer


module PaperTrail
  class Version < ActiveRecord::Base

    # TJIC extension of papertrail - I want to store stack traces (at least for now)
    # 27 Sep 2013
    #
    attr_accessible :stack, :uri, :server

    def get_stack() 
      YAML::load(stack)
    end

    def print_stack()
      puts "server = #{server}"
      puts "uri    = #{uri}"

      get_stack.each do |entry|
        puts "  * #{entry}"
      end
    end


  end
end

tweaked application_controller.rb a bit:


  # http://stackoverflow.com/questions/6307138/how-do-i-get-request-uri-in-model-in-rails
  #
  # set a global variable that can be used later, inside models, etc.
  #
  before_filter :set_request_as_global
  def set_request_as_global
     $request = request
  end  

and then instrumented my LineItem class thusly:


class LineItem < ActiveRecord::Base

  #----------
  # papertrail debugging 
  #----------
  has_paper_trail :meta => { :stack => :get_stack,
                             :uri   => :get_uri,
                             :server=> :get_server,
   }
  def get_uri()  
    # $request being set depends on a before_filter in 
    # application_controller.rb
    "#{$request.andand.protocol}#{$request.andand.host_with_port}#{$request.andand.fullpath}"
  end
  
  def get_server()
    # potentially faster than file-system based caching
    #
    $server ||= `hostname`.chomp
  end

  def get_stack() 
    trace = caller(0)
    trace.reject! { |ii| ii.match(/active|action|railties/) }
    YAML::dump(trace)
  end


end

The end result was that I deployed the code to a live environment, then waited for the problem to strike again.

When it did, I went into the console


rails console production

bad_li = LineItem...
bad_li.versions.last.print_stack

which printed out


  server = 'host3'
  uri    = "http://heavyink.com/me/subscription...."
  * /home/tjic/bus/hi/src/hiw/app/models/line_item.rb:202:in `get_stack'
  * /usr/local/rvm/gems/ruby-1.9.3-p429/gems/paper_trail-3.0.0.beta1/lib/paper_trail/has_paper_trail.rb:253:in `block in merge_metadata'
  * /usr/local/rvm/gems/ruby-1.9.3-p429/gems/paper_trail-3.0.0.beta1/lib/paper_trail/has_paper_trail.rb:244:in `each'
  * /usr/local/rvm/gems/ruby-1.9.3-p429/gems/paper_trail-3.0.0.beta1/lib/paper_trail/has_paper_trail.rb:244:in `merge_metadata'
  * /usr/local/rvm/gems/ruby-1.9.3-p429/gems/paper_trail-3.0.0.beta1/lib/paper_trail/has_paper_trail.rb:205:in `record_create'
  * /home/tjic/bus/hi/src/hiw/app/models/order.rb:274:in `block in add_products_to_existing'
  * /home/tjic/bus/hi/src/hiw/app/models/order.rb:270:in `each'
  * /home/tjic/bus/hi/src/hiw/app/models/order.rb:270:in `add_products_to_existing'
  * /home/tjic/bus/hi/src/hiw/app/models/order.rb:207:in `block in create_for_products'
  * /home/tjic/bus/hi/src/hiw/app/models/order.rb:201:in `create_for_products'
  * /home/tjic/bus/hi/src/hiw/app/models/subscription.rb:365:in `update_line_items'
  * /home/tjic/bus/hi/src/hiw/app/models/subscription.rb:626:in `factory_create'
  * /home/tjic/bus/hi/src/hiw/app/models/subscription_list.rb:57:in `add_subscribable'

and BAM – there it is. Line 274 of order.rb. Thousands of lines of unit, functional, and integration tests hadn’t caught it, but there was a weird confluence of almost-right scopes (some using lambdas) chained together in such a way that if a variable that we never expected to be null, we’d do a mammoth select, and then cancel far too many line items.

Problem solved.

…and I never could have done it without paper_trail !

This entry was posted in Uncategorized. Bookmark the permalink.