
Seeding Your Rails Database With Reference Data

Migrations: Not Guaranteed To Work

Migrations are an excellent way to evolve your Rails app’s database schema in step with the models. However, if you fall more than one migration behind, there is no guarantee that you will be able to execute the outstanding ones successfully.

This is because a migration may instantiate the corresponding model (update: only if you ask it to). Since models are not versioned as migrations are, you will have the latest model running against the pre-migration schema. The migration will fail if the two are incompatible.

This has caused consternation in some quarters.

Consequently the recommended way to create a new database at any time is to load the schema (db/schema.rb). A Rake task is thoughtfully provided to do just this: rake db:schema:load.

What Are Reference Data?

Most applications need some reference data (a.k.a. basic data). You can think of this as background knowledge for your app. It’s generally not the data your users create, though they may update it from time to time; it’s data on which their data depends.

I worked at one of Europe’s largest hedge funds for four years building heaps of software. Throughout that time, indeed to this day, the one project that never seemed to be finished was the Reference Data Project. Currencies, exchanges, financial instruments, counterparties, holidays, contract notice dates… they’re all background facts that the systems need to know in order to trade.

Loading Reference Data

Until recently, when I noticed that migrations aren’t guaranteed to run all the way through, I used my migrations to create my reference data. A migration defining a countries table would then execute a few Country.create calls. I was combining data definition with data loading.

Now that we know we should use a different mechanism for data definition (loading the schema), we need to separate out the data loading. Ideally I’d like to write fixtures for my reference data and load those into the database, having first loaded the schema.
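As a rough illustration of what a reference-data fixture might hold, here is a minimal sketch using only Ruby's standard YAML library. The file name countries.yml, the column names (iso_code, name) and the row labels are hypothetical, chosen for the example rather than taken from a real app; the label-to-attributes layout is the one Rails' YAML fixtures use.

```ruby
require "yaml"

# Hypothetical contents of db/basic_data/countries.yml.
# Each top-level label names one row; the nested keys are column values.
COUNTRIES_YAML = <<~YAML
  united_kingdom:
    iso_code: GB
    name: United Kingdom
  germany:
    iso_code: DE
    name: Germany
YAML

# Parse fixture text into a hash of label => attributes.
# An empty document parses to nil, so fall back to an empty hash.
def parse_fixture(yaml_text)
  YAML.safe_load(yaml_text) || {}
end

rows = parse_fixture(COUNTRIES_YAML)
rows.each do |label, attributes|
  puts "#{label}: #{attributes.inspect}"
end
```

Keeping these files in their own directory, apart from test/fixtures, is what stops reference data and test data from getting mixed up.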

Although Rails does provide a way to load fixtures — rake db:fixtures:load — it loads your test fixtures. This isn’t what we want: test data and reference data are different kettles of ball games (thanks Dr Brown!) and should not be conflated.

Surprisingly, Mr Google thought I was the only person in the world with this problem, so I wrote my own Rake task to load fixtures from the db/basic_data directory. Invoke it like this:

$ rake db:fixtures:basic_data

Optionally you can pass FIXTURES=x,y to specify which fixtures to load, just like Rails' rake db:fixtures:load.
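The selection logic behind such a task can be sketched roughly as follows. This is not the task itself, only an outline in plain Ruby: the method name basic_data_fixtures is made up for illustration, and it assumes the fixtures are .yml files sitting directly in the given directory. A real task would then hand the resulting names to Rails' fixture loader.

```ruby
# Return the fixture names to load from +dir+.
# +only+ mimics the FIXTURES=x,y variable: a comma-separated list of names.
# When no list is given (nil or blank), every .yml file in +dir+ is selected.
def basic_data_fixtures(dir, only = nil)
  if only && !only.strip.empty?
    only.split(",").map(&:strip).reject(&:empty?)
  else
    Dir.glob(File.join(dir, "*.yml"))
       .map { |path| File.basename(path, ".yml") }
       .sort
  end
end

# Inside a Rake task this might be called as:
#   basic_data_fixtures("db/basic_data", ENV["FIXTURES"])
```

Sorting the names keeps the load order predictable across machines, since the order Dir.glob returns files in is not something to rely on.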

Prior Art

The Manage Fixtures plugin doesn’t load reference data the way I have just described but it’s jolly useful in the right circumstances. A few months ago I helped somebody move their Rails app off a shared host; the host didn’t allow one to run mysqldump to dump the database — but Manage Fixtures would have got the data out.

[12th February 2008] I just found another article proposing the same solution. Great minds think alike.

Subsequent Art

The problem with the method proposed above is the lack of validation. This alternative resolves that.


Migrations don’t load your AR models unless you explicitly try to use the models in the migration. In general (for the reasons you pointed out), you should never use outside code inside your migrations. The right way to do this is shown in the Rails Recipes book: redefine your AR models at the top of your migrations.

Tammer Saleh • 26 January 2008

Tammer, thanks for correcting my statement on model instantiation within a migration. I’ve updated the text. Thanks too for pointing out that Rails Recipe (no. 30) — I had forgotten it.

Andy Stewart • 26 January 2008

Your Rake task link is outdated.

Stefan Schüßler • 11 October 2008

Thanks, Stefan. I’ve updated the article.

Andy Stewart • 11 October 2008

Andrew Stewart • 25 January 2008 • Rails