"Normalize" values to sum 1 but keeping their weights
I am not really sure what this operation might be called, but I have some numbers, for example:
40
10
I need to scale these numbers so that they sum to 1, but they should keep their "weight".
In this specific case, 40 would become 0.8 and 10 would become 0.2.
But if I have more values (like 40, 10, 25, 5 for example), I am really lost because I don't know the formula.
If anybody can help, please reply in words (for example: "Sum up all values, then divide by...") and not in a formula. I am really not good at reading formulas at all.
Thank you so much!
Why not just divide each number in your sample by the sum of all the numbers in your sample?
The answers provided here won't work in cases where the set contains negative numbers.
The function you're looking for is called the softmax function. The softmax function is often used in the final layer of a neural network-based classifier. Softmax is defined as:
$$f_i(x) = \frac{e^{x_i}}{\sum_j e^{x_j}}$$
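As a rough sketch, the formula above could be written in Python as follows. The function name `softmax` and the subtraction of the maximum (a common trick to avoid overflow for large inputs) are my own assumptions, not something the question asked for.

```python
import math

def softmax(values):
    """Return e^(x_i) / sum_j e^(x_j) for each x_i in values."""
    # Assumption: shift by the maximum to avoid overflow in exp();
    # the shift cancels out and does not change the result.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Negative inputs still give positive outputs that sum to 1.
print(softmax([-1.0, 0.0, 2.0]))

# The question's example: the result is very close to [1, 0], not [0.8, 0.2].
print(softmax([40, 10]))
```

Keep in mind that softmax handles negative numbers but does not keep the ratio between the inputs, so for the question's example it gives almost all of the weight to 40.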
From the text description, it seems this is what you want:
- calculate the sum of all elements
- divide each element by the sum
Note, however, that your example $[40, 10]$ then normalises to $[0.8, 0.2]$, not $[0.75, 0.25]$. The latter doesn't preserve the ratio between the two elements.
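For the longer list from the question, here is a minimal Python sketch of those two steps. The function name `normalize_to_sum_one` and the check for a zero sum are illustrative assumptions on my part.

```python
def normalize_to_sum_one(values):
    """Divide each element by the sum of all elements."""
    total = sum(values)
    # Assumption: treat a zero sum as an error, since we cannot divide by it.
    if total == 0:
        raise ValueError("the values sum to zero, so they cannot be normalised")
    return [v / total for v in values]

print(normalize_to_sum_one([40, 10]))         # [0.8, 0.2]
print(normalize_to_sum_one([40, 10, 25, 5]))  # [0.5, 0.125, 0.3125, 0.0625]
```

In words: the four values sum to 80, so 40 becomes 40/80 = 0.5, 10 becomes 0.125, 25 becomes 0.3125, and 5 becomes 0.0625. Each result is still in the same proportion to the others as the original numbers were.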
I'm currently learning probability from an online course.
It's teaching me to use this amazingly simple method, and so far it has worked flawlessly!
Where e is an element in the list of numbers to be normalized:
Calculate a normalizer (multiplier) like so:
normalizer = 1 / (e1 + e2 + ... + en)
Next, multiply every element in the list by the normalizer:
((e1 * normalizer) + (e2 * normalizer) + ... + (en * normalizer)) == 1.0
... and they will add up to 1.0.
So taking your example of numbers 10 and 40:
normalizer = 1 / (10 + 40) = 0.02
(10 * 0.02) = 0.2
(40 * 0.02) = 0.8
(0.2 + 0.8) = 1.0
Hence we get our 1.0.
I've gone ahead and written a simple Python script that you can run in almost any Python interpreter. Play around with the parameters and check the results.
# python script
import random as rnd
# number of items in list, change this to as huge a list as you want
itemsInList = 5
# specify min and max value bounds for randomly generated values
# change these to play around with different value ranges
minVal = 8
maxVal = 20
# creates a list of random values between minVal and maxVal, and sort them
numList = sorted( [rnd.randint(minVal, maxVal) for x in range(itemsInList)] )
print('initial list is\n{}\n'.format(numList))
# calculate the normalizer, using: (1 / (sum_of_all_items_in_list))
normalizer = 1 / float( sum(numList) )
# multiply each item by the normalizer
numListNormalized = [x * normalizer for x in numList]
print('Normalized list is\n{}\n'.format(numListNormalized))
print('Sum of all items in numListNormalized is {}'.format(sum(numListNormalized)))
Running the code gives this sample output:
initial list is
[9, 12, 15, 16, 19]
Normalized list is
[0.1267605633802817, 0.16901408450704225, 0.2112676056338028, 0.22535211267605634, 0.26760563380281693]
Sum of all items in numListNormalized is 1.0
I hope this helps!