Best way to choose a random file from a directory

Solution 1:

import os, random
random.choice(os.listdir("C:\\"))  # replace "C:\\" with the directory you want

Regarding your edited question: first, I assume you know the risks of using dircache, as well as the fact that it has been deprecated since Python 2.6 and was removed in 3.0.

Second, I don't see where any race condition exists here. Your dircache object is basically immutable (once the directory listing is cached, the directory is never read again), so there is no harm in concurrent reads from it.

Other than that, I do not understand why you see any problem with this solution. It is fine.
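
Since dircache is gone in Python 3, one way to get a similar effect is to read the listing once and reuse it. This is only a minimal sketch, assuming the directory contents do not need to be refreshed; `make_random_picker` is a hypothetical helper name, not something from the standard library:

```python
import os
import random


def make_random_picker(directory):
    """Return a function that picks a random entry from a one-time snapshot.

    Hypothetical helper: the listing is read once and stored in a tuple,
    so later calls never touch the directory again.
    """
    entries = tuple(os.listdir(directory))  # immutable snapshot of the names
    if not entries:
        raise ValueError("directory is empty: %r" % directory)

    def pick():
        # Only reads the tuple, so concurrent callers cannot interfere.
        return random.choice(entries)

    return pick
```

Calling `pick = make_random_picker("C:\\")` once and then `pick()` as often as needed avoids re-reading the directory, at the cost of not noticing files added or removed later.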

Solution 2:

If you want directories included, use Yuval A's answer. Otherwise:

import os, random

random.choice([x for x in os.listdir("C:\\") if os.path.isfile(os.path.join("C:\\", x))])
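
On Python 3.6 or later, the same filter can probably be written with `os.scandir`, which usually avoids a separate stat call per entry. A sketch under that assumption; `random_file` is a name made up for illustration:

```python
import os
import random


def random_file(directory):
    """Pick a random regular file (not a subdirectory) from `directory`."""
    with os.scandir(directory) as entries:
        # is_file() follows symlinks by default, like os.path.isfile().
        files = [entry.name for entry in entries if entry.is_file()]
    if not files:
        raise ValueError("no regular files in %r" % directory)
    return random.choice(files)
```

Like the one-liner above, this returns just the file name; join it with the directory if you need the full path.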

Solution 3:

The problem with most of the solutions given is that they load the whole listing into memory, which can become a problem for large inputs or hierarchies. Here's a solution adapted from the Perl Cookbook by Tom Christiansen and Nat Torkington. To get a random file anywhere beneath a directory:

#!/usr/bin/env python
import os, random

n = 0
rfile = None  # stays None if no files are found
for root, dirs, files in os.walk('/tmp/foo'):
    for name in files:
        n += 1
        # Replace the current pick with probability 1/n
        if random.uniform(0, n) < 1:
            rfile = os.path.join(root, name)
print(rfile)

Generalizing a bit makes a handy script:

$ cat /tmp/randy.py
#!/usr/bin/env python
import sys, random

n = 1
rline = ''  # stays empty if stdin has no lines
for line in sys.stdin:
    # Keep the n-th line with probability 1/n
    if random.uniform(0, n) < 1:
        rline = line
    n += 1
sys.stdout.write(rline)

$ /tmp/randy.py < /usr/share/dict/words 
chrysochlore

$ find /tmp/foo -type f | /tmp/randy.py
/tmp/foo/bar
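
Both scripts use the same trick: the k-th item replaces the current pick with probability 1/k. It then survives the remaining items with probability k/(k+1) × ... × (n-1)/n = k/n, so every item ends up chosen with probability 1/n. The idea can be sketched as a reusable function. The names `reservoir_choice` and `iter_files` are made up for illustration, and `randrange` is swapped in for the `uniform(0, n) < 1` test above, which has the same 1/n probability:

```python
import os
import random


def reservoir_choice(iterable, rng=random):
    """Return one item chosen uniformly at random from `iterable`.

    Only the current pick is kept in memory, so this also works for
    generators whose length is not known in advance.
    """
    chosen = None
    count = 0
    for count, item in enumerate(iterable, start=1):
        # Replace the current pick with probability 1/count.
        if rng.randrange(count) == 0:
            chosen = item
    if count == 0:
        raise ValueError("cannot choose from an empty iterable")
    return chosen


def iter_files(top):
    """Lazily yield the path of every file beneath `top`."""
    for root, _dirs, files in os.walk(top):
        for name in files:
            yield os.path.join(root, name)
```

With these, `reservoir_choice(iter_files('/tmp/foo'))` plays the role of the first script and `reservoir_choice(sys.stdin)` the role of randy.py.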