sstarr / trucker

Helper for migrating legacy data

Trucker

Trucker is a gem that helps migrate legacy data into a Rails app.

Installation

  1. Install the trucker gem

    sudo gem install trucker
    
  2. Add trucker to your config.gem block in environment.rb

    config.gem "trucker"
    
  3. Generate the basic trucker files

    rails generate truck
    
    This will do the following things:
    * Add legacy adapter to database.yml
    * Add legacy base class
    * Add legacy sub classes for all existing models
    * Add app/models/legacy to autoload_paths in Rails Initializer config block
    * Generate sample migration task (using pluralized model names)
  4. Update the legacy database adapter in database.yml with your legacy database info

    legacy:
      adapter: mysql
      encoding: utf8
      database: app_legacy
      username: root
      password:
    
    (By convention, we recommend naming your legacy database APP_NAME_legacy.)
  5. If the legacy database doesn’t already exist, add it.

    rake db:create:all
  6. Import your legacy data into the legacy database

    mysql -u root app_legacy < old_database.sql
  7. Update set_table_name in each of your legacy models as needed

    class LegacyPost < LegacyBase
      set_table_name "LEGACY_TABLE_NAME_GOES_HERE"
    end
    
  8. Update legacy model field mappings as needed

    class LegacyPost < LegacyBase
      set_table_name "YOUR LEGACY TABLE NAME GOES HERE"
    
      def map
        {
          :headline => self.title.squish,
          :body => self.long_text.squish
        }
      end
    end
    
    New model attributes on the left side.
    Legacy model attributes on the right side.
    (aka :new_field => legacy_field)
    
  9. Need to tweak some data? Just add some core Ruby methods or a helper method.

    class LegacyPost < LegacyBase
      set_table_name "YOUR LEGACY TABLE NAME GOES HERE"
    
      def map
        {
          :headline => self.title.squish.capitalize, # <= Added capitalize method
          :body => tweak_body(self.long_text.squish) # <= Added tweak_body method
        }
      end
    
      # Insert helper methods as needed
      def tweak_body(body)
        body = body.gsub(/<br \/>/,"\n") # <= Convert break tags into normal line breaks
        body = body.gsub(/teh/, "the")  # <= Fix common typos
      end
    end
    
  10. Start migrating!

    rake db:migrate:posts
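
To make the mapping in steps 8 and 9 concrete, here is a minimal plain-Ruby sketch that runs without Rails. FakeLegacyPost and the local squish helper are hypothetical stand-ins (the real squish comes from ActiveSupport, and real legacy models inherit from LegacyBase); only the shape of map follows the steps above.

```ruby
# Hypothetical stand-in for ActiveSupport's String#squish:
# trims the ends and collapses runs of whitespace into one space.
def squish(text)
  text.strip.gsub(/\s+/, " ")
end

# Hypothetical stand-in for a legacy model; a real one would
# inherit from LegacyBase and read its columns from the database.
FakeLegacyPost = Struct.new(:title, :long_text) do
  # New attribute names on the left, legacy values on the right.
  def map
    {
      :headline => squish(title).capitalize,
      :body     => tweak_body(squish(long_text))
    }
  end

  # Converts break tags to line breaks and fixes a common typo.
  def tweak_body(body)
    body.gsub(/<br \/>/, "\n").gsub(/teh/, "the")
  end
end

legacy = FakeLegacyPost.new("  hello   world ", "teh first<br />line")
new_attributes = legacy.map
puts new_attributes.inspect
```

Note that squish runs before tweak_body here, as in step 9; the other way round, squish would collapse the freshly inserted line breaks back into spaces.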

Migration command line options

Trucker supports a few command line options when migrating records:

rake db:migrate:posts limit=100 # migrates 100 records
rake db:migrate:posts limit=100 offset=100 # migrates 100 records, but skips first 100 records
rake db:migrate:users limit=1 where=":username => 'gilesgoatboy'" # migrates just one record

You can pass just about anything to a where clause, but checking it in the console ahead of time is strongly recommended. Most obvious use case: migrating just one record to verify that the migration behaves the way you want it to.
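
Rake turns name=value arguments into environment variables, so options like these can be read from ENV inside a task. The sketch below shows one way such values might be converted into finder options; finder_options is a hypothetical helper written for illustration, not Trucker's actual code.

```ruby
# Hypothetical helper: builds finder options from an ENV-like hash.
# Values that are missing or not positive integers are ignored.
def finder_options(env)
  options = {}
  limit  = env["limit"].to_i
  offset = env["offset"].to_i
  options[:limit]  = limit  if limit > 0
  options[:offset] = offset if offset > 0
  options
end

# Equivalent to running: rake db:migrate:posts limit=100 offset=100
puts finder_options("limit" => "100", "offset" => "100").inspect

# With no options set, nothing restricts the set of records.
puts finder_options({}).inspect
```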

Custom migration labels

You can tweak the default migration output generated by Trucker by passing the :label option to Trucker.migrate in your rake task.

Trucker.migrate :posts
=> Migrating posts

Trucker.migrate :posts, :label => "blog posts"
=> Migrating blog posts
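
As a rough illustration of what the label changes, the sketch below builds a status line from a task name and an optional label. status_line is a hypothetical function, and the exact wording Trucker prints may differ.

```ruby
# Hypothetical function: the label, when given, replaces the
# task name in the status line.
def status_line(name, options = {})
  "Migrating #{options[:label] || name}"
end

puts status_line(:posts)                          # without a label
puts status_line(:posts, :label => "blog posts")  # with a label
```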

Custom helpers

Trucker is intended for migrating data from fairly simple web apps that started life on PHP, Perl, etc. So, if you’re migrating data from an enterprise system, this may not be your best choice.

That said, if you need to pull off a complex migration for a model, you can use a custom helper method to override Trucker’s default migrate method in your rake task.

namespace :db do
  namespace :migrate do

    ...

    desc 'Migrate pain_in_the_ass model'
    task :pain_in_the_ass => :environment do
      Trucker.migrate :pain_in_the_ass, :helper => :pain_in_the_ass_migration
    end

  end
end

def pain_in_the_ass_migration
  # Custom code goes here
end

Then just copy the migrate method from lib/trucker.rb and tweak accordingly.
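
The helper is referred to by name rather than called on the spot, and that distinction is easy to miss. The following sketch uses a hypothetical run_migration function (not Trucker's real implementation) to show the general pattern: the helper is named with a symbol and only invoked when the migration runs.

```ruby
# Records what ran, so the example is easy to follow.
CALLS = []

# Hypothetical custom helper for a model that needs special handling.
def custom_migration
  CALLS << :custom
end

# Hypothetical dispatcher: runs the named helper if one is given,
# otherwise falls back to a default routine.
def run_migration(name, options = {})
  if options[:helper]
    send(options[:helper])   # look the helper up by name, then call it
  else
    CALLS << :default
  end
end

run_migration(:posts)
run_migration(:tricky_model, :helper => :custom_migration)
puts CALLS.inspect
```

Writing :helper => custom_migration (without the leading colon) would instead call the helper immediately and pass along whatever it returns.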

As an example, here’s a custom helper used to migrate join tables on a bunch of models.

namespace :db do
  namespace :migrate do

    desc 'Migrates join tables'
    task :joins => :environment do
      migrate :joins, :helper => :migrate_joins  
    end

  end
end

def migrate_joins
  puts "Migrating #{number_of_records || "all"} joins #{"after #{offset_for_records}" if offset_for_records}"

  ["chain", "firm", "function", "style", "website"].each do |model|

    # Start migration
    puts "Migrating theaters_#{model.pluralize}"

    # Delete existing joins
    ActiveRecord::Base.connection.execute("TRUNCATE table theaters_#{model.pluralize}")

    # Tweak model ids and foreign keys to match model syntax
    if model == 'website'
      model_id = "url_id"
      send_foreign_key = "url_id".to_sym
    else
      model_id = "#{model}_id"
      send_foreign_key = "#{model}_id".to_sym
    end

    # Create join object class
    join = Object.const_set("Theaters#{model.classify}", Class.new(ActiveRecord::Base))

    # Set model foreign key
    model_foreign_key = "#{model}_id".to_sym

    # Migrate join (unless duplicate)
    "LegacyTheater#{model.classify}".constantize.find(:all, with(:order => model_id)).each do |record|

      unless join.find(:first, :conditions => {:theater_id => record.theater_id, model_foreign_key => record.send(send_foreign_key)})
        attributes = {
          model_foreign_key => record.send(send_foreign_key),
          :theater_id => record.theater_id
        }

        # Check if theater chain is current
        attributes[:is_current] = {'Yes' => 1, 'No' => 0, '' => 0}[record.current] if model == 'chain'

        # Migrate join
        join.create(attributes)
      end
    end
  end
end
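
One detail in the helper above is worth isolating: the hash lookup that turns the legacy current value into a flag returns nil for anything other than 'Yes', 'No', or an empty string. Here is a small sketch of the same lookup with an explicit default. The name current_flag and the choice of 0 as the fallback are assumptions made for illustration.

```ruby
CURRENT_FLAGS = { "Yes" => 1, "No" => 0, "" => 0 }.freeze

# Hypothetical helper: unknown or missing values fall back to 0
# instead of nil.
def current_flag(value)
  CURRENT_FLAGS.fetch(value.to_s, 0)
end

puts current_flag("Yes")    # known value
puts current_flag(nil)      # missing value
puts current_flag("maybe")  # unexpected value
```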

ProTip: Catching Validation Errors

Here’s the standard `migrate` method in `LegacyBase`:

def migrate
  new_record = self.class.to_s.gsub(/Legacy/,'::').constantize.new(map)
  new_record[:id] = self.id
  new_record.save
end

And here’s a handy variant:

def migrate
  new_record = self.class.to_s.gsub(/Legacy/,'::').constantize.new(map)
  new_record[:id] = self.id
  begin
    new_record.save!
  rescue StandardError => e # StandardError, so Ctrl-C and other system-level exceptions still get through
    $stderr.puts "error saving #{new_record.class} #{new_record.id}!"
    $stderr.puts e.inspect
  end
end

Note that this version responds to a failed save by simply logging it to `stderr` and going on to the next one like Jay-Z. In practice, this might not be what you want. Your mileage may vary, so season to taste.
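
If printing and moving on is not enough, one option is to collect the failures and report them at the end. The sketch below is plain Ruby with hypothetical stand-ins (FakeRecord, migrate_all); in a real app the error would typically be ActiveRecord::RecordInvalid raised by save!.

```ruby
# Hypothetical stand-in for a new record whose save! can fail.
FakeRecord = Struct.new(:id, :valid) do
  def save!
    raise ArgumentError, "record #{id} is invalid" unless valid
    true
  end
end

# Hypothetical helper: saves each record, collecting failures
# instead of stopping at the first one.
def migrate_all(records)
  failures = []
  records.each do |record|
    begin
      record.save!
    rescue StandardError => e
      failures << [record.id, e.message]
    end
  end
  failures
end

records = [FakeRecord.new(1, true), FakeRecord.new(2, false), FakeRecord.new(3, true)]
failures = migrate_all(records)
failures.each { |id, message| warn "error saving record #{id}: #{message}" }
```

Keeping the failures in a list makes it possible to retry them or write them to a file once the run finishes.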

Sample application

Check out the Trucker sample app for a working example of Trucker-based legacy data migration.

Rails 3.1 Compatibility

Trucker itself, but not its generators, has been adapted for Rails 3.1. Further work remains to bring the generators up to date. If you use Trucker with the current (June 2011) Rails 3.1 release candidate and MySQL, make sure you use the latest mysql2 gem (currently 0.3.2); other versions, or the original mysql gem, will cause explosive pants-shitting failure. (Postgres and SQLite are fine.)

Rails 3.1 compatible code lives on the rails31 branch. The functionality’s refactored to an object-oriented model in the refactor branch.

Background

Trucker is based on a migration technique using legacy models pioneered by Dave Thomas: pragdave.blogs.pragprog.com/pragdave/2006/01/sharing_externa.html

Note on patches/pull requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so we don’t break a future version unintentionally.

  • Commit your changes, but do not mess with the rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a commit by itself so we can ignore it when we pull.)

  • Send a pull request. Bonus points for topic branches.

Contributors

  • Patrick Crowley / mokolabs

  • Rob Kaufman / notch8

  • Jordan Fowler / TheBreeze

  • Roel Bondoc / roelbondoc

  • Giles Bowkett / gilesbowkett (GitHub) / gilesgoatboy (Twitter)

Copyright © 2010 Patrick Crowley and Rob Kaufman. See LICENSE for details.

License: MIT License