At ShowNearby we are in the middle of a very large migration from PHP to RoR 3.1, and we are facing several problems that some of you may have solved before.

We have large amounts of data, so we decided to split our DB into several DBs that we can handle separately. For example, our accounts, places, logs and others are split across several databases.

We need to get migrations, fixtures and models to play nicely together, and so far it has been quite messy. For a solution to be acceptable, it has to meet these requirements:

  • each model should map to one table in one of the databases.
  • rake db:drop - should drop all the database environments we specify in database.yml
  • rake db:create - should create all the database environments we specify in database.yml
  • rake db:migrate - should run migrations against the various databases
  • rake db:test - should load fixtures into the various databases and run the unit/functional/etc. tests

We are considering setting up a separate Rails project for each database and connecting them with ActiveResource, but we feel this is not very efficient. Have any of you dealt with a similar problem before?


Solution 1:

To add to Wukerplank's answer: you can also put the connection details in database.yml as usual, under a name like this:

log_database_production:
  adapter: mysql
  host: other_host
  username: logmein
  password: supersecret
  database: logs

Then in your special model:

class AccessLog < ActiveRecord::Base
  establish_connection "log_database_#{Rails.env}".to_sym
end

This keeps those pesky credentials out of your application code.
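The name passed to establish_connection above is simply a database.yml entry name built from a prefix and the current environment. Here is a minimal plain-Ruby sketch of that lookup, with no Rails involved. The inline YAML, the extra `production` entry, and the `connection_settings` helper are hypothetical stand-ins for illustration, not part of Rails:

```ruby
require "yaml"

# Hypothetical stand-in for config/database.yml. Entry names follow the
# "<prefix>_<environment>" convention used in the answer.
DATABASE_YML = <<~YAML
  production:
    adapter: mysql
    database: app
  log_database_production:
    adapter: mysql
    host: other_host
    database: logs
YAML

# Hypothetical helper: builds the entry name and fetches its settings.
# Hash#fetch raises KeyError when the entry is missing, so a typo in the
# name fails loudly instead of quietly falling back to another database.
def connection_settings(configurations, prefix, env)
  configurations.fetch("#{prefix}_#{env}")
end

configurations = YAML.safe_load(DATABASE_YML)
settings = connection_settings(configurations, "log_database", "production")
puts settings["database"]
```

In a real app the environment string would come from Rails.env, and the credentials stay in database.yml where the rest of your connection settings already live.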

Edit: If you want to reuse this connection in multiple models, you should create a new abstract class and inherit from it. Connections are tightly coupled to classes (as explained here, here, and here), so otherwise a new connection will be created for each class.

If that is the case, set things up like so:

class LogDatabase < ActiveRecord::Base
  self.abstract_class = true
  establish_connection "log_database_#{Rails.env}".to_sym
end

class AccessLog < LogDatabase
end

class CheckoutLog < LogDatabase
end
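To see why the shared parent class helps, here is a toy model in plain Ruby of "look the connection up by class, then by superclass". It is only an illustration of the idea and is not ActiveRecord's real implementation; every `Toy*` name is hypothetical:

```ruby
# Toy model only: one registry entry per class that called
# establish_connection, and a lookup that walks up the class hierarchy.
class ToyBase
  CONNECTIONS = {} # class => connection name

  # Registers a connection name for exactly the class it is called on.
  def self.establish_connection(name)
    CONNECTIONS[self] = name
  end

  # Climbs the superclass chain until a class with its own entry is found;
  # falls back to "default" when the chain reaches ToyBase without a match.
  def self.connection_name
    klass = self
    klass = klass.superclass until CONNECTIONS.key?(klass) || klass == ToyBase
    CONNECTIONS.fetch(klass, "default")
  end
end

class ToyLogDatabase < ToyBase
  establish_connection "log_database_production"
end

class ToyAccessLog < ToyLogDatabase; end
class ToyCheckoutLog < ToyLogDatabase; end
class ToyPerson < ToyBase; end

puts ToyAccessLog.connection_name
puts ToyCheckoutLog.connection_name
puts ToyPerson.connection_name
puts ToyBase::CONNECTIONS.size
```

In this toy, both log models resolve to the single entry registered by their parent, while a model that inherits directly from the base class keeps the default.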

Solution 2:

Connecting to different databases is quite easy:

# model in the "default" database from database.yml
class Person < ActiveRecord::Base

  # ... your stuff here

end

# model in a different database
class Place < ActiveRecord::Base

  establish_connection(
    :adapter  => "mysql",
    :host     => "other_host",
    :username => "username",
    :password => "password",
    :database => "other_db"
  )

end

I would be wary of setting up multiple Rails projects as you will add a lot of overhead to data retrieval for your controllers, which could make things slow.

As for your questions about migrations, fixtures, models etc.: I don't think there will be an easy way, so please post separate questions and be as specific as you can.

Is consolidating the DBs into one not an option? It would make your life a lot easier!

Solution 3:

I found a great post that points to the right way of doing this: http://blog.bitmelt.com/2008/10/connecting-to-multiple-database-in-ruby.html

Set it up something like this:

database.yml (db config file)

support_development:
    adapter: blah
    database: blah
    username: blah
    password: blah

support_base.rb (a model file)

class SupportBase < ActiveRecord::Base
    self.abstract_class = true #important!
    establish_connection("support_development")
end

tst_test.rb (a model file)

class TstTest < SupportBase
    # inheriting from SupportBase, not ActiveRecord::Base, is important!

    self.table_name = 'tst_test'

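    def self.get_test_name(id)
        # nil? compares; a single = here would assign nil to id instead
        return '' if id.nil?

        # quote the value rather than interpolating it raw into the SQL
        query = "select tst_name from tst_test where tst_id = #{connection.quote(id)}"
        tst = connection.select_all(query) #select_all is important!
        tst[0].fetch('tst_name')
    end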
end
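One thing to watch with this pattern: `tst[0]` is nil when the query matches no rows, so the `fetch` call raises. Below is a small plain-Ruby sketch of a safer read, assuming the rows come back as an array-like collection of hashes keyed by column name (which is what `tst[0].fetch('tst_name')` implies). The `first_value` helper is hypothetical:

```ruby
# Hypothetical helper: returns the named column from the first row,
# or a default when the query matched nothing.
def first_value(rows, column, default = "")
  row = rows.first
  row ? row.fetch(column) : default
end

# Sample data standing in for a query result.
rows = [{ "tst_name" => "smoke test" }]
puts first_value(rows, "tst_name")
puts first_value([], "tst_name").inspect
```

Returning the same empty string that the method already uses for a nil id keeps the two "nothing found" cases consistent for callers.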

PS: this doesn't really cover migrations. I don't think you can run migrations on more than one DB with rake (although I'm not sure that is a hard 'cannot do'; it may be possible). This was just a great way to connect to and query other DBs that you don't control.
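If you did write a custom rake task for this, one ingredient would be the list of database.yml entries that belong to the current environment. Here is a rough plain-Ruby sketch of that step only, assuming the `<name>_<environment>` naming used above. The `entries_for` helper and the sample configuration are hypothetical, and the actual create/drop/migrate calls are deliberately left out:

```ruby
# Hypothetical stand-in for the parsed contents of database.yml.
CONFIGURATIONS = {
  "development"             => { "database" => "app_dev" },
  "production"              => { "database" => "app" },
  "log_database_production" => { "database" => "logs" },
  "support_development"     => { "database" => "support_dev" }
}

# Hypothetical helper: picks the default entry for the environment
# (e.g. "production") plus every entry whose name ends in "_production".
def entries_for(configurations, env)
  configurations.select { |name, _| name == env || name.end_with?("_#{env}") }
end

entries_for(CONFIGURATIONS, "production").each do |name, settings|
  # A custom task would connect with `settings` here and then act on
  # that database; those calls are not part of this sketch.
  puts "#{name} -> #{settings['database']}"
end
```

This only shows how to enumerate the databases. Whether each one can then be migrated cleanly from a single task is the part I have not verified.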