MySQL select 10 random rows from 600K rows fast

How can I best write a query that selects 10 rows randomly from a total of 600k?


A great post handling several cases, from simple, to gaps, to non-uniform with gaps.

http://jan.kneschke.de/projects/mysql/order-by-rand/

For the most general case, here is how you do it:

SELECT name
  FROM random AS r1 JOIN
       (SELECT CEIL(RAND() *
                     (SELECT MAX(id)
                        FROM random)) AS id)
        AS r2
 WHERE r1.id >= r2.id
 ORDER BY r1.id ASC
 LIMIT 1

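This assumes that the ids are distributed evenly. Gaps in the id list are allowed, but a row that follows a large gap is picked more often than its neighbours. Note that the query returns a single row (LIMIT 1). See the article for more advanced examples.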
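To make the effect of gaps concrete, here is a minimal sketch in Python (not MySQL) that computes the exact chance of each row being returned by the `r1.id >= r2.id ... LIMIT 1` query. The function name `pick_probabilities` and the sample id list are made up for illustration, and the sketch ignores the rare case where RAND() returns exactly 0.

```python
from bisect import bisect_left
from fractions import Fraction


def pick_probabilities(ids):
    """Exact chance that each id is returned by the `id >= random id` query."""
    ids = sorted(ids)
    max_id = ids[-1]
    probs = {i: Fraction(0) for i in ids}
    # CEIL(RAND() * MAX(id)) yields each integer 1..max_id with equal chance.
    for r in range(1, max_id + 1):
        chosen = ids[bisect_left(ids, r)]  # smallest existing id >= r
        probs[chosen] += Fraction(1, max_id)
    return probs


# Hypothetical table with ids 1, 2, 3 and then a gap up to 10.
probs = pick_probabilities([1, 2, 3, 10])
for row_id, p in probs.items():
    print(row_id, p)
```

With this sample list, the row with id 10 absorbs every random value that falls into the gap 4..10, so it is returned 7 times out of 10 while the other rows get 1 in 10 each.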


SELECT column FROM table
ORDER BY RAND()
LIMIT 10

Not the most efficient solution, since it generates a random value for every row and sorts the whole table, but it works.


Simple query that has excellent performance and works with gaps:

SELECT * FROM tbl AS t1 JOIN (SELECT id FROM tbl ORDER BY RAND() LIMIT 10) as t2 ON t1.id=t2.id

This query on a 200K table takes 0.08s and the normal version (SELECT * FROM tbl ORDER BY RAND() LIMIT 10) takes 0.35s on my machine.

This is fast because the sort phase only uses the indexed ID column. You can see this behaviour in the explain:

SELECT * FROM tbl ORDER BY RAND() LIMIT 10:
[image: EXPLAIN output, "Simple Explain"]

SELECT * FROM tbl AS t1 JOIN (SELECT id FROM tbl ORDER BY RAND() LIMIT 10) as t2 ON t1.id=t2.id:
[image: EXPLAIN output]
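The join pattern can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module, so it is only an approximation of the answer's query: SQLite spells the function RANDOM() instead of RAND(), and the table contents (every third id, a `payload` column) are invented for the example. It shows the shape of the technique, picking 10 random ids in the subquery and then joining back for the full rows, and says nothing about MySQL timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, payload TEXT)")
# Only every third id exists, so the id sequence has gaps.
conn.executemany(
    "INSERT INTO tbl (id, payload) VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 3001, 3)],
)

# The subquery sorts ids only; the outer join fetches the full rows.
rows = conn.execute(
    """
    SELECT t1.id, t1.payload
      FROM tbl AS t1
      JOIN (SELECT id FROM tbl ORDER BY RANDOM() LIMIT 10) AS t2
        ON t1.id = t2.id
    """
).fetchall()

print(rows)
```

Because the random sort runs over existing ids rather than over a numeric range, gaps do not bias the result and the 10 rows are always distinct.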

Weighted Version: https://stackoverflow.com/a/41577458/893432


I am getting fast queries (around 0.5 seconds) on a slow CPU when selecting 10 random rows from a non-cached 2 GB MySQL database with 400K records. See my code here: Fast selection of random rows in MySQL

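// microtime_float(), BD_Ejecutar() and logger() are helpers that are not shown
// here. The mysql_* functions used below were removed in PHP 7.
$time = microtime_float();

// Count the rows so that the prefilter below keeps about 20 of them.
$sql = 'SELECT COUNT(*) FROM pages';
$rquery = BD_Ejecutar($sql);
list($num_records) = mysql_fetch_row($rquery);
mysql_free_result($rquery);

// Each row passes the WHERE with probability 20/$num_records, so only
// about 20 rows reach the ORDER BY RAND() sort.
$sql = "SELECT id FROM pages WHERE RAND()*$num_records<20
   ORDER BY RAND() LIMIT 0,10";
$rquery = BD_Ejecutar($sql);
$ids = array();
while (list($id) = mysql_fetch_row($rquery)) {
    $ids[] = $id;
}
mysql_free_result($rquery);
$id_in = implode(',', $ids);

// Fetch the full rows. Skip the query if the prefilter kept nothing,
// because IN() with an empty list is a syntax error.
if ($id_in !== '') {
    $sql = "SELECT id,url FROM pages WHERE id IN($id_in)";
    $rquery = BD_Ejecutar($sql);
    while (list($id, $url) = mysql_fetch_row($rquery)) {
        logger("$id, $url", 1);
    }
    mysql_free_result($rquery);
}

$time = microtime_float() - $time;

logger("num_records=$num_records", 1);
logger("$id_in", 1);
logger("Time elapsed: <b>$time seconds</b>", 1);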
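One thing worth checking with this approach is how often the prefilter `RAND()*$num_records<20` keeps fewer than 10 rows, in which case the query returns fewer than 10 results. A rough sketch in Python, assuming each row is kept independently with probability 20/num_records (a binomial model); the helper name `prob_fewer_than` is made up for this example:

```python
from math import comb


def prob_fewer_than(k, n, p):
    """P(X < k) for X ~ Binomial(n, p): chance the prefilter keeps fewer than k rows."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


num_records = 400_000      # table size quoted in the answer
keep = 20 / num_records    # chance that a single row passes the WHERE clause

shortfall = prob_fewer_than(10, num_records, keep)
print(f"expected rows kept: {num_records * keep:.0f}")
print(f"P(fewer than 10 rows): {shortfall:.4f}")
```

Under this model the shortfall is rare but not impossible, so a caller that needs exactly 10 rows could raise the constant 20 or rerun the query when too few ids come back.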