Rails 3: Get Random Record
So, I've found several examples for finding a random record in Rails 2 -- the preferred method seems to be:
Thing.find :first, :offset => rand(Thing.count)
Being something of a newbie I'm not sure how this could be constructed using the new find syntax in Rails 3.
So, what's the "Rails 3 Way" to find a random record?
Solution 1:
Thing.first(:order => "RANDOM()") # For MySQL :order => "RAND()", - thanx, @DanSingerman
# Rails 3
Thing.order("RANDOM()").first
or
Thing.first(:offset => rand(Thing.count))
# Rails 3
Thing.offset(rand(Thing.count)).first
Actually, in Rails 3 all examples will work. But using order RANDOM
is quite slow for big tables but more sql-style
UPD. You can use the following trick on an indexed column (PostgreSQL syntax):
select *
from my_table
where id >= trunc(
random() * (select max(id) from my_table) + 1
)
order by id
limit 1;
Solution 2:
I am working on a project (Rails 3.0.15, ruby 1.9.3-p125-perf) where the db is in localhost and users table has a bit more than 100K records.
Using
order by RAND()
is quite slow
User.order("RAND(id)").first
becomes
SELECT
users
.* FROMusers
ORDER BY RAND(id) LIMIT 1
and takes from 8 to 12 seconds to respond!!
Rails log:
User Load (11030.8ms) SELECT
users
.* FROMusers
ORDER BY RAND() LIMIT 1
from mysql's explain
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
| 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 110165 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
You can see that no index is used (possible_keys = NULL), a temporary table is created and an extra pass is required to fetch the desired value (extra = Using temporary; Using filesort).
On the other hand, by splitting the query in two parts and using Ruby, we have a reasonable improvement in response time.
users = User.scoped.select(:id);nil
User.find( users.first( Random.rand( users.length )).last )
(;nil for console use)
Rails log:
User Load (25.2ms) SELECT id FROM
users
User Load (0.2ms) SELECTusers
.* FROMusers
WHEREusers
.id
= 106854 LIMIT 1
and mysql's explain proves why:
+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+
| 1 | SIMPLE | users | index | NULL | index_users_on_user_type | 2 | NULL | 110165 | Using index |
+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | users | const | PRIMARY | PRIMARY | 4 | const | 1 | |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
we can now use only indexes and the primary key and do the job about 500 times faster!
UPDATE:
as pointed out by icantbecool in comments the above solution has a flaw if there are deleted records in the table.
A workaround in that can be
users_count = User.count
User.scoped.limit(1).offset(rand(users_count)).first
which translates to two queries
SELECT COUNT(*) FROM `users`
SELECT `users`.* FROM `users` LIMIT 1 OFFSET 148794
and runs in about 500ms.
Solution 3:
If using Postgres
User.limit(5).order("RANDOM()")
If using MySQL
User.limit(5).order("RAND()")
In both instances you're selecting 5 records randomly from the Users table. Here is the actual SQL query in displayed in the console.
SELECT * FROM users ORDER BY RANDOM() LIMIT 5
Solution 4:
I made a rails 3 gem for doing this that performs better on large tables and allows you to chain relations and scopes:
https://github.com/spilliton/randumb
(edit): The default behavior of my gem basically uses the same approach as above now, but you have the option to use the old way if you want :)
Solution 5:
Many of the answers posted actually won't perform well on rather large tables (1+ million rows). Random ordering quickly takes a few seconds, and doing a count on the table also takes quite long.
A solution that works well for me in this situation is to use RANDOM()
with a where condition:
Thing.where('RANDOM() >= 0.9').take
On a table with over a million rows, this query generally takes less than 2ms.