Generating a list of random numbers, summing to 1
Solution 1:
The simplest solution is indeed to take N random values and divide by the sum.
A more generic solution is to use the Dirichlet distribution which is available in numpy.
By changing the parameters of the distribution you can change the "randomness" of the individual numbers:
>>> import numpy as np
>>> print(np.random.dirichlet(np.ones(10), size=1))
[[ 0.01779975 0.14165316 0.01029262 0.168136 0.03061161 0.09046587
0.19987289 0.13398581 0.03119906 0.17598322]]
>>> print(np.random.dirichlet(np.ones(10)/1000., size=1))
[[ 2.63435230e-115 4.31961290e-209 1.41369771e-212 1.42417285e-188
0.00000000e+000 5.79841280e-143 0.00000000e+000 9.85329725e-005
9.99901467e-001 8.37460207e-246]]
>>> print(np.random.dirichlet(np.ones(10)*1000., size=1))
[[ 0.09967689 0.10151585 0.10077575 0.09875282 0.09935606 0.10093678
0.09517132 0.09891358 0.10206595 0.10283501]]
Depending on the concentration parameter, the Dirichlet distribution gives vectors whose values are all close to 1./N (where N is the length of the vector), vectors where most of the values are ~0 and a single value is close to 1, or something in between those possibilities.
EDIT (5 years after the original answer): Another useful fact about the Dirichlet distribution is that you get it naturally if you generate a set of independent Gamma-distributed random variables and then divide them by their sum.
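The Gamma construction mentioned in the edit can be sketched as follows. This is a minimal illustration, not code from the original answer: the helper name dirichlet_from_gamma and the seed are made up for the example, and it assumes a NumPy version that provides np.random.default_rng.

```python
import numpy as np

def dirichlet_from_gamma(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing Gamma draws.

    Hypothetical helper for illustration only.
    """
    alpha = np.asarray(alpha, dtype=float)
    # One independent Gamma(shape=alpha_i, scale=1) variate per component.
    g = rng.gamma(shape=alpha, scale=1.0)
    # Dividing by the sum puts the vector on the simplex.
    return g / g.sum()

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
sample = dirichlet_from_gamma(np.ones(10), rng)
print(sample, sample.sum())
```

With alpha set to np.ones(10) this should behave like the first np.random.dirichlet call above; scaling alpha down or up reproduces the "spiky" and "flat" regimes.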
Solution 2:
The best way to do this is simply to make a list of as many random numbers as you wish, then divide them all by their sum. Each value is random, and the list sums to 1 (up to floating-point rounding).
import random as ran
r = [ran.random() for i in range(100)]
s = sum(r)
r = [i / s for i in r]
or, as suggested by @TomKealy, accumulate the sum in the same loop that creates the numbers:
rs = []
s = 0
for i in range(100):
    r = ran.random()
    s += r
    rs.append(r)
rs = [r / s for r in rs]  # normalize so the values sum to 1
For the fastest performance, use numpy:
import numpy as np
a = np.random.random(100)
a /= a.sum()
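And you can draw the raw numbers from any distribution you want. Note that for the result to be a valid probability distribution the raw values must be non-negative, which a normal distribution like the one below does not guarantee: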
a = np.random.normal(size=100)
a /= a.sum()
---- Timing ----
In [52]: %%timeit
...: r = [ran.random() for i in range(1,100)]
...: s = sum(r)
...: r = [ i/s for i in r ]
....:
1000 loops, best of 3: 231 µs per loop
In [53]: %%timeit
....: rs = []
....: s = 0
....: for i in range(100):
....:     r = ran.random()
....:     s += r
....:     rs.append(r)
....:
10000 loops, best of 3: 39.9 µs per loop
In [54]: %%timeit
....: a = np.random.random(100)
....: a /= a.sum()
....:
10000 loops, best of 3: 21.8 µs per loop
Solution 3:
Dividing each number by the total may not give you the distribution you want. For example, with two numbers, the pair x,y = random.random(), random.random() picks a point uniformly on the square 0<=x<1, 0<=y<1. Dividing by the sum "projects" that point (x,y) onto the line x+y=1 along the line from (x,y) to the origin. Points near (0.5,0.5) will be much more likely than points near (0.1,0.9).
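A quick simulation makes the bias visible. This is only a sketch under arbitrary assumptions (the seed, the sample size and the two window widths are chosen for illustration): it counts how often the normalized first coordinate x/(x+y) lands near 0.5 versus near 0.1.

```python
import random

def normalized_first(rng):
    """Return x / (x + y) for two independent uniform draws."""
    x, y = rng.random(), rng.random()
    return x / (x + y)

rng = random.Random(12345)  # arbitrary seed, only for reproducibility
draws = [normalized_first(rng) for _ in range(100_000)]

# Two windows of equal width; a uniform distribution on the segment
# would fill them about equally.
near_middle = sum(0.45 <= t <= 0.55 for t in draws)
near_edge = sum(0.05 <= t <= 0.15 for t in draws)
print(near_middle, near_edge)
```

The window around 0.5 should collect roughly three times as many points as the window around 0.1, which is the non-uniformity described above.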
For two variables, then, x = random.random(), y=1-x gives a uniform distribution along the geometrical line segment.
With 3 variables, you are picking a random point in a cube and projecting it (radially, through the origin) onto the triangle where the plane x+y+z=1 meets the positive octant. Points near the center of the triangle will be more likely than points near the vertices. If you need an unbiased choice of points in that triangle, scaling is no good.
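One standard way to get an unbiased point on that triangle (or on the analogous simplex for any n) is to cut the interval [0, 1] at n-1 uniform random places and use the lengths of the pieces. This is a sketch of that technique, not something proposed in the answer itself; the function name uniform_simplex and the seed are illustrative.

```python
import random

def uniform_simplex(n, rng):
    """Return n non-negative floats summing to 1, uniform on the simplex."""
    # n-1 uniform cut points split [0, 1] into n pieces.
    cuts = sorted(rng.random() for _ in range(n - 1))
    edges = [0.0] + cuts + [1.0]
    # The piece lengths are the coordinates of the sampled point.
    return [b - a for a, b in zip(edges, edges[1:])]

rng = random.Random(2024)  # arbitrary seed, only for reproducibility
point = uniform_simplex(3, rng)
print(point, sum(point))
```

For n=2 this reduces to x = random.random(), y = 1-x from the previous paragraph, and in general it has the same distribution as a Dirichlet draw with all parameters equal to 1.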
The problem gets complicated in n dimensions, but you can get a low-precision (but high accuracy, for all you laboratory science fans!) estimate by picking uniformly from the set of all n-tuples of non-negative integers adding up to N, and then dividing each of them by N.
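As a rough illustration of the integer approach, the sketch below uses the textbook "stars and bars" bijection to pick such an n-tuple uniformly. It is not necessarily the algorithm from the linked answer; the function name random_composition, the seed and the small n and N are assumptions made for the example.

```python
import random

def random_composition(n, total, rng):
    """Pick n non-negative integers summing to total, uniformly at random."""
    # Stars and bars: choose n-1 bar positions among total + n - 1 slots.
    bars = sorted(rng.sample(range(total + n - 1), n - 1))
    edges = [-1] + bars + [total + n - 1]
    # The number of stars between consecutive bars is one part.
    return [b - a - 1 for a, b in zip(edges, edges[1:])]

rng = random.Random(7)  # arbitrary seed, only for reproducibility
N = 1000
parts = random_composition(5, N, rng)
fractions = [p / N for p in parts]  # values with about 3 digits of precision
print(parts, fractions)
```

Because every choice of bar positions corresponds to exactly one tuple, each tuple is equally likely; larger N gives more digits of precision in the resulting fractions.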
I recently came up with an algorithm to do that for modest-sized n, N. It should work for n=100 and N = 1,000,000 to give you 6-digit randoms. See my answer at: "Create constrained random numbers?"