Weighted random selection from array
I would like to randomly select one element from an array, but each element has a known probability of selection.
All chances together (within the array) sums to 1.
What algorithm would you suggest as the fastest and most suitable for huge calculations?
Example:
id => chance
array[
0 => 0.8
1 => 0.2
]
for this pseudocode, the algorithm in question should on multiple calls statistically return four elements on id 0
for one element on id 1
.
Solution 1:
Compute the discrete cumulative density function (CDF) of your list -- or in simple terms the array of cumulative sums of the weights. Then generate a random number in the range between 0 and the sum of all weights (might be 1 in your case), do a binary search to find this random number in your discrete CDF array and get the value corresponding to this entry -- this is your weighted random number.
Solution 2:
The algorithm is straight forward
rand_no = rand(0,1)
for each element in array
if(rand_num < element.probablity)
select and break
rand_num = rand_num - element.probability
Solution 3:
I have found this article to be the most useful at understanding this problem fully. This stackoverflow question may also be what you're looking for.
I believe the optimal solution is to use the Alias Method (wikipedia). It requires O(n) time to initialize, O(1) time to make a selection, and O(n) memory.
Here is the algorithm for generating the result of rolling a weighted n-sided die (from here it is trivial to select an element from a length-n array) as take from this article.
The author assumes you have functions for rolling a fair die (floor(random() * n)
) and flipping a biased coin (random() < p
).
Algorithm: Vose's Alias Method
Initialization:
- Create arrays Alias and Prob, each of size n.
- Create two worklists, Small and Large.
- Multiply each probability by n.
- For each scaled probability pi:
- If pi < 1, add i to Small.
- Otherwise (pi ≥ 1), add i to Large.
- While Small and Large are not empty: (Large might be emptied first)
- Remove the first element from Small; call it l.
- Remove the first element from Large; call it g.
- Set Prob[l]=pl.
- Set Alias[l]=g.
- Set pg := (pg+pl)−1. (This is a more numerically stable option.)
- If pg<1, add g to Small.
- Otherwise (pg ≥ 1), add g to Large.
- While Large is not empty:
- Remove the first element from Large; call it g.
- Set Prob[g] = 1.
- While Small is not empty: This is only possible due to numerical instability.
- Remove the first element from Small; call it l.
- Set Prob[l] = 1.
Generation:
- Generate a fair die roll from an n-sided die; call the side i.
- Flip a biased coin that comes up heads with probability Prob[i].
- If the coin comes up "heads," return i.
- Otherwise, return Alias[i].