Fixing Your Test Suite: Class Methods and Memoization

Jared Norman on February 17, 2018

Posted in Ruby

This is the first post in a series about common mistakes that lead to unreliable test suites in Ruby and how to fix them. Stay tuned for more.

Memoization is a helpful tool for optimization, but Rubyists use it for more than that. Take this class, for example:

class UserNotification
  def initialize(user)
    @user = user
  end

  def account_frozen
    return unless user.phone_number

    api_client.send(
      user.phone_number,
      "Your account has been frozen due to suspicious activity."
    )
  end

  private

  attr_reader :user

  def api_client
    @api_client ||= NotificationService::Client.new(
      key: ENV.fetch('NOTIFICATION_SERVICE_API_KEY')
    )
  end
end

In the class above, the #api_client private method is memoized so that it instantiates our notification service client when it’s called the first time and always returns the same client on subsequent calls.

Memoizing the setup for the API client isn’t strictly necessary. It probably isn’t very expensive to perform this task multiple times, so we could just remove the memoization altogether and reinstantiate the client on every call.

Alternatively, we could just set up the API client in the constructor. We’d be setting it up even if we didn’t end up using it, but the penalty for doing so is insignificant.

I regularly see Rubyists memoize things when it isn’t strictly necessary, and I like it. In situations like this it keeps the constructor clean, isolating the boring implementation details in a private method at the very bottom of the class definition. A good class tells a story about how to use it, and this class tells you up front about what it does, and allows you to keep reading if you need to know how it does it.

There are very few situations where you’ll consider using memoization and be wrong to do so. Memoization is reasonably sensible as long as these two criteria are met:

The method is called a variable number of times over the life of an instance.
The method should always return the same result.

If your instinct is to compute some value lazily in your class, it’s probably just fine to do that. Some languages lazily evaluate pretty much everything. Just remember that memoizing a falsy value (nil or false) with ||= won’t do anything; the value will be recomputed on every call. (You can get around this.)
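One common way around the falsy-value problem is to check whether the instance variable has been assigned at all, rather than whether it is truthy. The sketch below is only an illustration: the FeatureFlags class, its settings hash, and the lookups counter are hypothetical names that do not come from the original post.

```ruby
# Hypothetical class used only to illustrate memoizing a falsy value.
class FeatureFlags
  attr_reader :lookups

  def initialize(settings)
    @settings = settings
    @lookups = 0 # counts how many times the underlying lookup runs
  end

  # Naive version: ||= repeats the lookup whenever the stored value is false or nil.
  def beta_enabled_naive
    @beta_enabled_naive ||= lookup(:beta)
  end

  # Guarded version: defined? is true once the variable has been assigned,
  # even when the assigned value is false or nil.
  def beta_enabled
    return @beta_enabled if defined?(@beta_enabled)

    @beta_enabled = lookup(:beta)
  end

  private

  def lookup(key)
    @lookups += 1
    @settings.fetch(key, false)
  end
end

naive = FeatureFlags.new({ beta: false })
3.times { naive.beta_enabled_naive }
naive.lookups # 3, because false was never kept

guarded = FeatureFlags.new({ beta: false })
3.times { guarded.beta_enabled }
guarded.lookups # 1, because false was stored after the first call
```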

The Lifecycle of an Object

Forgetting about “the life of the object” is how we introduce the first test suite issue. Like everything in Ruby, classes are objects too. Class methods are no different than methods on any other instance. When you define a class method it looks like this:

class Example
  def self.foo
    3
  end

  # or equivalently:

  class << self
    def foo
      3
    end
  end
end

When it comes to memoization the important difference between class and instance methods is how long the object in question lives. Your application will create and throw away thousands or millions of instances of most classes over the course of one HTTP request/response cycle. On the other hand, the classes themselves live for the length of the Ruby process. This means memoization in class methods saves the memoized value until your app is restarted.
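To make the difference in lifetimes concrete, here is a minimal sketch. The Session class and its token methods are hypothetical and exist only to compare where the memoized value is stored.

```ruby
# Hypothetical class: the same ||= pattern at instance level and at class level.
class Session
  # Memoized on an instance: discarded along with that instance.
  def token
    @token ||= Object.new
  end

  # Memoized on the class object: kept until the process exits.
  def self.token
    @token ||= Object.new
  end
end

Session.new.token.equal?(Session.new.token) # false: each instance memoizes its own value
Session.token.equal?(Session.token)         # true: the class object holds a single value
```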

When It Works

Memoizing class methods is sometimes okay. Ignoring potential thread safety issues (a topic for another day), sometimes you do want to save a computed value for the length of your Ruby process.

For example, if you build a lookup table for tax rates from a CSV then there’s no reason not to keep it around. You probably don’t want to load it from disk and parse it every time you need to look up a tax rate, and if you’re loading this data from a CSV then it’s probably safe to assume that it doesn’t change very often.
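A lookup table like that might be sketched as follows. The TaxRates class, the rate_for method, and the inline CSV data are all assumptions made for illustration; a real application would presumably read the file from disk instead of a heredoc.

```ruby
require 'csv'

# Hypothetical tax rate lookup, memoized on the class for the life of the process.
class TaxRates
  # Made-up data standing in for a CSV file on disk.
  CSV_DATA = <<~DATA
    region,rate
    north,0.12
    south,0.05
  DATA

  class << self
    def rate_for(region)
      table.fetch(region)
    end

    private

    # Parsed once on first use, then reused by every later call.
    def table
      @table ||= CSV.parse(CSV_DATA, headers: true).each_with_object({}) do |row, rates|
        rates[row['region']] = Float(row['rate'])
      end
    end
  end
end

TaxRates.rate_for('north') # parses the CSV, then looks up the rate
TaxRates.rate_for('south') # reuses the already parsed table
```

This is safe only because the data is fixed for the life of the process; if the rates could change while the app is running, the same pattern would serve stale values.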

When It Doesn’t

Where you get into trouble is when you memoize something that does change. A common one I see is memoizing some value from the database that changes infrequently. Not only will this almost assuredly break your test suite, but it will probably cause bugs in your application. Take a look:

module UserQueries
  class << self
    def active_with_subscription
      @active_and_subscribed ||=
        User.active.joins(:subscription).distinct
    end
  end
end

It’s an innocent-looking piece of code, and in my experience people often end up with something like this by refactoring the query out of another class.

The issue is that once this query gets executed the result gets saved forever (until the Ruby process terminates). The first time this method gets called in your test suite or application process the value will be recorded and returned for all subsequent calls, no matter how much the database changes. Consider this example RSpec spec:

require 'rails_helper'

RSpec.describe UserQueries do
  describe ".active_with_subscription" do
    subject { described_class.active_with_subscription }

    context "when there are active users with subscriptions" do
      let!(:user_one) { create :user, :active, :with_subscription }
      let!(:user_two) { create :user, :active, :with_subscription }
      it { is_expected.to contain_exactly(user_one, user_two) }
    end

    context "when there are no active users with subscriptions" do
      before do
        create :user, :active
        create :user, :inactive, :with_subscription
      end
      it { is_expected.to be_empty }
    end
  end
end

The example that runs second will always fail when run against the code above. The value returned by ActiveRecord will be saved and reused for the lifetime of the process. If the first example runs first then the method will return the two active users we created for that test when it gets called in the second example, even though they’ve likely been scrubbed from the database. If the second example runs first then the method will return an empty collection when the first example gets run.
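The same failure can be reproduced without Rails or RSpec. In the sketch below a plain array stands in for the users table; FAKE_USERS, MemoizedUserQueries, and active_names are hypothetical names chosen for this illustration, and the steps only imitate what two spec examples and a database cleanup would do.

```ruby
# A plain array standing in for the users table, so the sketch runs without Rails.
FAKE_USERS = []

module MemoizedUserQueries
  class << self
    # Same mistake as above: the result is stored on the module itself.
    def active_names
      @active_names ||= FAKE_USERS.select { |user| user[:active] }
                                  .map { |user| user[:name] }
    end
  end
end

# "First example": one active user exists when the query first runs.
FAKE_USERS << { name: 'ada', active: true }
MemoizedUserQueries.active_names # ["ada"]

# "Database cleanup", followed by the second example's setup.
FAKE_USERS.clear
FAKE_USERS << { name: 'bob', active: false }

# The stale result from the first example is still returned.
MemoizedUserQueries.active_names # ["ada"]
```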

You’ll likely never want to memoize the result of a query for the life of a process. Databases contain application state, and the point of state is that it changes.

How To Fix It

This is one of the easiest common mistakes to fix. Once you’ve determined that you’ve got some memoization where you shouldn’t, you simply remove the memoization.

# Before:
def active_with_subscription
  @active_and_subscribed ||=
    User.active.joins(:subscription).distinct
end

# After:
def active_with_subscription
  User.active.joins(:subscription).distinct
end

Be careful, though: you’ll need to performance test your code after you make a change like this. If you have some code that calls this a thousand times in one request then you’re going to need to find a way to cache the result at another level.
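One possible "other level" is an object that lives only as long as a single request, so the memoized value is thrown away along with it. This is just a sketch under stated assumptions: UserLookup stands in for the unmemoized query, RequestContext is a hypothetical per-request object, and neither appears in the code above.

```ruby
# Hypothetical stand-in for the unmemoized query; it counts how often it runs.
module UserLookup
  class << self
    attr_reader :calls

    def active_with_subscription
      @calls = calls.to_i + 1
      %w[user_one user_two]
    end
  end
end

# Hypothetical per-request object: build one when a request starts and
# discard it when the request ends, so the cache never outlives the request.
class RequestContext
  def active_subscribers
    @active_subscribers ||= UserLookup.active_with_subscription
  end
end

context = RequestContext.new
1_000.times { context.active_subscribers } # the query runs once for this context

RequestContext.new.active_subscribers # a new request gets a fresh result
```

Because the memoization happens on an instance again, it meets the two criteria from earlier: the value is requested a variable number of times, and it should not change over the life of that one object.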

Memoization is a great technique for writing more readable and faster code. Don’t hesitate to use it, but pay attention while refactoring and don’t pull memoized logic out into the wrong context.
