Don't Step on a Rake, Use Rake::DSL

Jared Norman on July 17, 2018

Posted in Ruby

Over the years I’ve noticed a common mistake where developers do some refactoring to remove duplication and make their Rake tasks more readable, but end up causing some unintended side effects. Let’s take a look at what’s going on, and how we can use a built-in feature of Rake to fix the problem.

Rake

Rake is a general purpose make-like task runner for Ruby. Almost all Ruby projects use it as a task runner, and it comes baked into new Rails projects. Most applications will eventually need their own custom tasks.

In a Rails app you might find some custom Rake tasks for running scheduled jobs like snapshots or nightly processing tasks. It’s also usually preferred to perform data migration in Rake tasks rather than Rails migrations.

In a Ruby on Rails application, you put your new tasks in lib/tasks. (You also have to use the .rake extension instead of .rb, which is extremely easy to forget.) Let’s say we need some tasks for generating example data to develop against in a music management app. We might write something like this:

# lib/tasks/dev_data.rake

namespace :dev_data do
  desc "Create some randomly generated music albums"
  task :generate_albums => :environment do
    # Task code goes here.
  end

  desc "Create some randomly generated music artists"
  task :generate_artists => :environment do
    # Task code goes here.
  end

  desc "Create some randomly generated music labels"
  task :generate_labels => :environment do
    # Task code goes here.
  end
end

This defines three tasks in a namespace called dev_data, named generate_albums, generate_artists, and generate_labels respectively. Each task depends on another task called environment, which Rake runs before the custom task. The environment task is provided by Rails and loads our application code, so we can access our models and other code in these tasks.
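To see the dependency ordering outside of Rails, here is a minimal sketch. The environment task below is only a stand-in for the one Rails provides, and RUN_ORDER is a name invented for this example.

```ruby
require "rake"

# Records the order in which task bodies run (name invented for this sketch).
RUN_ORDER = []

# Stand-in for the :environment task that Rails would normally provide.
task :environment do
  RUN_ORDER << :environment
end

namespace :dev_data do
  # `:generate_albums => :environment` declares :environment as a prerequisite.
  task :generate_albums => :environment do
    RUN_ORDER << :generate_albums
  end
end

# Invoking a task runs its prerequisites first.
Rake::Task["dev_data:generate_albums"].invoke
p RUN_ORDER # prints [:environment, :generate_albums]
```

Requiring rake in a plain Ruby script makes task and namespace available at the top level, which is why this runs without a Rakefile.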

Using Rake tasks like this is a little better than throwing scripts in the bin folder because they’re easier to discover for new developers, as you can ask Rake for all the tasks that have descriptions. You’ll get all the tasks from Rails, any gems providing tasks, and your custom ones.

$ bundle exec rake -T
[...]
rake dev_data:generate_albums   # Create some randomly generated music albums
rake dev_data:generate_artists  # Create some randomly generated music artists
rake dev_data:generate_labels   # Create some randomly generated music labels
[...]

Refactoring Test Setup

The problem I’ve been seeing happens when developers apply a common refactoring in RSpec or Minitest to their Rake tasks. Let’s say we have a class called Bicycle and to get it ready to ride, it needs some assembly or setup. In order to write some tests we’re going to need to do that assembly. In RSpec, the specs might look something like this:

RSpec.describe "Bicycle" do
  let(:bicycle) { Bicycle.new }

  before do
    # Do some stuff to set up the bicycle.
  end

  it "has two wheels" do
    # ...
  end

  it "has brakes" do
    # ...
  end

  it "has handlebars" do
    # ...
  end
end

Now if we introduce some tests that don’t need to do the setup, then we have some decisions to make. We can use contexts, but for simplicity let’s just leave our spec flat and pull out the shared setup logic into a method.

RSpec.describe "Bicycle" do
  let(:bicycle) { Bicycle.new }

  it "comes disassembled" do
    # ...
  end

  it "has two wheels" do
    put_the_bicycle_together
    # ...
  end

  it "has brakes" do
    put_the_bicycle_together
    # ...
  end

  it "has handlebars" do
    put_the_bicycle_together
    # ...
  end

  def put_the_bicycle_together
    # Do some stuff to set up the bicycle.
  end
end

Now this refactoring is fine, and I see it used relatively often. It works because if you’re using minitest/spec or RSpec, those describe calls are actually creating classes under the hood, and the block we provided is run in the context of those classes using class_eval or a close relative such as module_exec. Try running something like this:

RSpec.describe "Bicycle" do
  puts self
end
# It prints out:
RSpec::ExampleGroups::Bicycle
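To make the mechanism concrete, here is a toy version of describe. It is a sketch of the general idea only, not how RSpec or Minitest are actually implemented, and tiny_describe is a made-up name.

```ruby
# A toy stand-in for a test framework's `describe` (hypothetical).
def tiny_describe(_name, &block)
  group = Class.new         # each group is backed by a fresh class
  group.class_eval(&block)  # inside the block, `def` now targets `group`
  group
end

group = tiny_describe("Bicycle") do
  def put_the_bicycle_together
    :assembled
  end
end

# The helper lives on the group class...
p group.instance_methods(false)                               # prints [:put_the_bicycle_together]
# ...and ordinary objects never see it, even as a private method.
p Object.new.respond_to?(:put_the_bicycle_together, true)     # prints false
```

Because the block is evaluated with class_eval, the helper method is scoped to the group and nothing leaks out to the rest of the program.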

Rake Doesn’t Do That

The refactoring we just covered kept our tests clean, and gave a descriptive name to some shared test setup logic, so we might be inclined to do something similar in our Rake tasks. I’ve done it before, and I see people do it all the time. Unfortunately, it’s not a very safe thing to do, and could have unintended side effects.

Let’s take a look. If Rake behaved the same way RSpec behaves, we should be able to write something like this:

# lib/tasks/bicycle.rake

namespace :bicycle do
  task :assemble do
    bicycle = Bicycle.new

    # Assemble the bicycle:
    attach_wheels(bicycle)
    attach_handlebars(bicycle)
    attach_brakes(bicycle)
  end

  def attach_wheels(bicycle)
    # ...
  end

  def attach_handlebars(bicycle)
    # ...
  end

  def attach_brakes(bicycle)
    # ...
  end
end

If you do try that out, you’ll find that it works. You might, satisfied with your new bicycle assembly Rake task, commit this and move on to more pressing matters.

Unfortunately, Rake doesn’t do anything to change the context that these blocks are executed in, as we saw minitest/spec and RSpec do. You just defined a bunch of private methods on the Object class.
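You can check this the same way we inspected self in RSpec. The sketch below runs as a plain Ruby script rather than a .rake file, and TOP_LEVEL_SELF and SEEN are names invented for the example.

```ruby
require "rake"

# Remember what `self` is at the top level of the file.
TOP_LEVEL_SELF = self
SEEN = {}

namespace :bicycle do
  SEEN[:namespace] = self   # self inside the namespace block

  task :assemble do
    SEEN[:task] = self      # self inside the task block, captured when it runs
  end
end

Rake::Task["bicycle:assemble"].invoke

# Both blocks ran with the very same object as the top level of the file.
p SEEN[:namespace].equal?(TOP_LEVEL_SELF) # prints true
p SEEN[:task].equal?(TOP_LEVEL_SELF)      # prints true
```

Since self never changes, the current class doesn’t change either, so any def inside these blocks behaves exactly like a def at the top of the file.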

[Image: Sideshow Bob getting hit in the face with a rake repeatedly]

There are two important pieces to what makes up a given context in Ruby: the value of self and the “current class”. When you use def in Ruby, methods get defined on the current class.

At the root context of a Ruby program the value of self is main (a special instance of the Object class) and the current class is Object. That means any methods we define there end up on Object, which nearly all classes inherit from. Specifically, they’ll end up as private methods on Object.

namespace :bicycle do
  task :assemble do
    # ...
  end

  def attach_handlebars(bicycle)
    # ...
  end
end

# These now all work:
bicycle = Bicycle.new
attach_handlebars(bicycle)
"a string".send(:attach_handlebars, bicycle)
1337.send(:attach_handlebars, bicycle)
Class.send(:attach_handlebars, bicycle)

That’s correct: every instance of almost every class, including the classes and modules themselves, now has a private attach_handlebars method. Polluting almost every object in your system with unnecessary methods is a bad practice and could have a variety of consequences.

One problem you could run into without even doing any metaprogramming would be a naming collision. If you happened to be assembling some motorcycles as well as bicycles, and those motorcycles needed their handlebars attached too, suddenly you might be trying to attach motorcycle handlebars to your bicycles or vice versa, because the attach_handlebars method that got defined second would overwrite the first.
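Here is a small sketch of that collision. Both namespaces sit in one script for brevity, but the effect should be the same when they live in separate .rake files. The method bodies and ASSEMBLY_LOG are invented for the example.

```ruby
require "rake"

# Collects what each task did (name invented for this sketch).
ASSEMBLY_LOG = []

# Imagine this block lives in lib/tasks/bicycle.rake...
namespace :bicycle do
  def attach_handlebars(vehicle)
    "bicycle handlebars on #{vehicle}"
  end

  task :assemble do
    ASSEMBLY_LOG << attach_handlebars("bicycle")
  end
end

# ...and this one in lib/tasks/motorcycle.rake, loaded afterwards.
namespace :motorcycle do
  # Same name, so this replaces the definition above on Object.
  def attach_handlebars(vehicle)
    "motorcycle handlebars on #{vehicle}"
  end

  task :assemble do
    ASSEMBLY_LOG << attach_handlebars("motorcycle")
  end
end

Rake::Task["bicycle:assemble"].invoke
p ASSEMBLY_LOG # prints ["motorcycle handlebars on bicycle"]
```

The bicycle task silently picks up the motorcycle version, and Ruby only warns about the redefinition if warnings are enabled.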

Use Service Classes Instead

Generally speaking, it’s good practice to pull the logic of your Rake tasks out into a class. This allows you to do all the normal refactoring you’d do in any other object, like extracting out methods without accidentally polluting the global scope.

Additionally, it’ll be easier to write tests for the class, and even pull it into application code if you one day need to. Even if you’re just writing a throwaway task that you’re going to delete in a week, you can define the class inline in the .rake file for easy deletion.
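As a rough sketch of what that could look like, the task below does nothing but hand off to a class. BicycleAssembler is a hypothetical name, and the Bicycle class with its parts list is a stand-in so the example runs on its own.

```ruby
require "rake"

# Stand-in for the article's Bicycle class; `parts` is invented for this sketch.
class Bicycle
  attr_reader :parts

  def initialize
    @parts = []
  end
end

# Hypothetical service class holding the logic that used to live in the task.
class BicycleAssembler
  def initialize(bicycle)
    @bicycle = bicycle
  end

  def call
    attach_wheels
    attach_handlebars
    attach_brakes
    @bicycle
  end

  private

  def attach_wheels
    @bicycle.parts << :wheels
  end

  def attach_handlebars
    @bicycle.parts << :handlebars
  end

  def attach_brakes
    @bicycle.parts << :brakes
  end
end

namespace :bicycle do
  desc "Assemble a bicycle"
  task :assemble do
    # The task stays a thin wrapper around the class.
    BicycleAssembler.new(Bicycle.new).call
  end
end

p BicycleAssembler.new(Bicycle.new).call.parts # prints [:wheels, :handlebars, :brakes]
```

In a Rails app the task would usually also depend on :environment, and the class could move into app/ or lib/ once it earns its keep.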

Better Yet: Rake::DSL

Fortunately for us, Rake actually provides a built-in facility for changing the scope of our tasks! It lives in the Rake::DSL module. From the docs:

DSL is a module that provides task, desc, namespace, etc. Use this when you’d like to use rake outside the top level scope.

Using Rake outside of top-level scope is exactly what we need to do. We can create a class for our extra methods to live on, and instantiating that class can define our tasks for us.

# lib/tasks/bicycle.rake

class BicycleTasks
  include Rake::DSL

  def initialize
    namespace :bicycle do
      task :assemble do
        bicycle = Bicycle.new

        # Assemble the bicycle:
        attach_wheels(bicycle)
        attach_handlebars(bicycle)
        attach_brakes(bicycle)
      end
    end
  end

  private

  def attach_wheels(bicycle)
    # ...
  end

  def attach_handlebars(bicycle)
    # ...
  end

  def attach_brakes(bicycle)
    # ...
  end
end

# Instantiate the class to define the tasks:
BicycleTasks.new

Now we can define any methods we want, include other mixins, and do anything else we would normally do with a class, all without polluting Object.

Rake::DSL is your friend! It’s good for more than just isolating some methods, too. You can use it for more advanced techniques like dynamically generating tasks, making customizable tasks, and more. If you maintain any gems that provide Rake tasks to their users, consider making sure you’re not accidentally exporting methods on Object, and use Rake::DSL to fix it if you are!
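As one possible take on dynamically generated tasks, the sketch below rebuilds the dev_data tasks from earlier out of a list of names. DevDataTasks and generate_records are made-up names, and the body of generate_records is a placeholder.

```ruby
require "rake"

# Hypothetical task class: one task is generated per entry in `kinds`.
class DevDataTasks
  include Rake::DSL

  def initialize(kinds)
    namespace :dev_data do
      kinds.each do |kind|
        desc "Create some randomly generated music #{kind}"
        task "generate_#{kind}" do
          generate_records(kind)
        end
      end
    end
  end

  private

  # Placeholder for the real generation logic.
  def generate_records(kind)
    puts "Generating #{kind}..."
  end
end

# Instantiate the class to define the tasks:
DevDataTasks.new(%w[albums artists labels])

Rake::Task["dev_data:generate_labels"].invoke # prints "Generating labels..."
```

Passing the list into the constructor is also a simple way to make the tasks customizable, since callers decide which tasks get defined.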
