Do you want to choose one or more elements from a list at random, with a different probability for each element? Do you want to write a weighted version of random.choice()? If so, the examples in this article will help you make weighted random choices in Python.
Let’s take the following example for a better understanding of the requirement.
import random

sampleList = [10, 20, 30, 40]
x = random.choice(sampleList)
print(x)
If you execute the above code, it gives you 10, 20, 30, or 40 with equal probability. But what if you want to pick elements from the list with different probabilities? For example, you may want to choose a k-sized list of elements from a sequence in such a way that each element has a different probability of being selected.
There are two ways to make weighted random choices in Python:
- If you are using Python 3.6 or above, use random.choices()
- Otherwise, use numpy.random.choice()
We will see how to use both, one by one.
Python 3.6 introduced a new function, choices(), in the random module. Using random.choices(), we can make a weighted random choice with replacement. You can also call it a weighted random sample with replacement. Let’s look at the syntax of this function.
random.choices(population, weights=None, *, cum_weights=None, k=1)
random.choices() returns a k-sized list of elements chosen from the population with replacement.
weights or cum_weights are used to define the selection probability for each element.
- If a weights sequence is specified, random selections are made according to the relative weights
- Alternatively, if a cum_weights sequence is given, the random selections are made according to the cumulative weights
- Note: You cannot specify both weights and cumulative weights.
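To make the idea of selecting by cumulative weights concrete, here is a minimal sketch of a hand-written weighted choice. The helper name weighted_choice is hypothetical and not part of the random module; it only illustrates the general technique of drawing a point along the running totals and locating it with bisect.

```python
import bisect
import itertools
import random


def weighted_choice(population, weights):
    """Pick one element (hypothetical helper, not part of the random module)."""
    # Running totals, e.g. weights (10, 20, 30) -> cumulative [10, 30, 60]
    cum_weights = list(itertools.accumulate(weights))
    total = cum_weights[-1]
    # Draw a point in [0, total) and find the first running total above it
    point = random.random() * total
    index = bisect.bisect(cum_weights, point, 0, len(cum_weights) - 1)
    return population[index]


random.seed(42)  # fixed seed only so the run is repeatable
picks = [weighted_choice(["a", "b", "c"], (10, 20, 30)) for _ in range(6000)]
print({letter: picks.count(letter) for letter in "abc"})
```

Over many draws, "c" should come up most often and "a" least often, in roughly a 1:2:3 ratio.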
First, define the probability for each element. As mentioned above, we can define the weights sequence in the following two ways:
- Relative weights
- Cumulative weights
Relative weights to choose elements from the list with different probability
If you specify relative weights, the selections are made according to them. You can specify relative weights using the weights parameter.
Example: Choose 5 elements from the list with different probability
import random

numberList = [111, 222, 333, 444, 555]
print(random.choices(numberList, weights=(10, 20, 30, 40, 50), k=5))
[555, 222, 555, 222, 555]
- As you can see in the output, we got 555 three times. It has the highest weight, so it has the highest probability of being selected (50 out of a total weight of 150, or about 33%), although the exact counts vary from run to run.
- We specified k=5 to choose 5 elements. You can specify any number you want; for example, to choose only 1 element, specify k=1.
- The weights do not sum to 100 because they are relative weights, not percentages.
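A short run of five draws says little about the underlying probabilities. As a rough check, the sketch below draws many samples with the same list and weights and compares the observed share of each element with its weight divided by the total weight. The seed and the number of draws are arbitrary choices for illustration.

```python
import random
from collections import Counter

numberList = [111, 222, 333, 444, 555]
weights = (10, 20, 30, 40, 50)

random.seed(1)  # fixed seed only so the run is repeatable
draws = 100_000
counts = Counter(random.choices(numberList, weights=weights, k=draws))

total_weight = sum(weights)
for number, weight in zip(numberList, weights):
    expected = weight / total_weight
    observed = counts[number] / draws
    print(number, "expected", round(expected, 3), "observed", round(observed, 3))
```

The observed shares should land close to 10/150, 20/150, and so on, which is what "relative weights" means in practice.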
Cumulative weights to choose elements from the list with different probability
Alternatively, if a cum_weights sequence is given, the selections are made according to the cumulative weights. You can specify cumulative weights using the cum_weights parameter.
Note: Python converts the relative weights to cumulative weights before making selections, so passing cumulative weights directly saves that extra work.
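The relationship between the two kinds of weights can be sketched with itertools.accumulate. The relative weights (5, 10, 15, 20) used here are an assumption chosen so that their running totals come out as (5, 15, 30, 50). Reseeding the generator before each call is only a way to compare the two calls on the same random stream; the identical results rely on how CPython implements random.choices().

```python
import random
from itertools import accumulate

nameList = ["Kelly", "Scott", "Emma", "Jon"]
weights = (5, 10, 15, 20)
# Running totals of the relative weights
cum_weights = list(accumulate(weights))

random.seed(7)
by_weights = random.choices(nameList, weights=weights, k=4)
random.seed(7)
by_cum_weights = random.choices(nameList, cum_weights=cum_weights, k=4)

print(cum_weights)
print(by_weights == by_cum_weights)
```

Each cumulative weight is the sum of all relative weights up to and including that position, so the last cumulative weight equals the total weight.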
Example: Choose 4 elements from the list with different probability
import random

nameList = ["Kelly", "Scott", "Emma", "Jon"]
print(random.choices(nameList, cum_weights=(5, 15, 30, 50), k=4))
['Jon', 'Kelly', 'Jon', 'Scott']
Choose a single element from the list with different probability
import random

nameList = ["Kelly", "Scott", "Emma", "Jon"]
for i in range(5):
    # choices() always returns a list, so take the first (and only) item
    randomItem = random.choices(nameList, cum_weights=(5, 15, 30, 50), k=1)
    print("Iteration:", i, "Random choice is", randomItem[0])
Iteration: 0 Random choice is Jon
Iteration: 1 Random choice is Jon
Iteration: 2 Random choice is Scott
Iteration: 3 Random choice is Scott
Iteration: 4 Random choice is Emma
Note: “Jon” has the highest probability of being selected (a weight of 20 out of 50), and here it came up twice. With only five draws, other names can come up just as often, as “Scott” did.
Probability of getting 6 or more heads from 10 spins
import random

# The two faces of the coin, heads and tails, as a string
coin = "HT"

# Run 3 times; with heads weighted at 0.61, most runs of 10 spins
# contain 6 or more heads, but this is not guaranteed
for i in range(3):
    print(random.choices(coin, cum_weights=(0.61, 1.00), k=10))
['H', 'H', 'H', 'H', 'H', 'H', 'H', 'T', 'H', 'T']
['H', 'T', 'H', 'H', 'H', 'T', 'H', 'H', 'H', 'H']
['H', 'T', 'T', 'T', 'H', 'T', 'H', 'H', 'H', 'H']
Note: We used cumulative weights, so the probability of getting heads is 0.61 and the probability of getting tails is 0.39 (1 - 0.61 = 0.39).
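Three runs cannot tell us how likely 6 or more heads actually is. One way to estimate that probability is to repeat the 10-spin experiment many times and count the runs that qualify. The sketch below does this and, for comparison, computes the exact binomial probability. The trial count, the seed, and the helper name six_or_more_heads are arbitrary choices for illustration, and math.comb assumes Python 3.8 or later.

```python
import random
from math import comb  # requires Python 3.8 or later

P_HEAD = 0.61
SPINS = 10
TRIALS = 20_000

random.seed(3)  # fixed seed only so the run is repeatable


def six_or_more_heads():
    """Spin the biased coin SPINS times and report whether heads >= 6."""
    spins = random.choices("HT", cum_weights=(P_HEAD, 1.00), k=SPINS)
    return spins.count("H") >= 6


# Share of trials that produced 6 or more heads
estimate = sum(six_or_more_heads() for _ in range(TRIALS)) / TRIALS

# Exact binomial probability of 6, 7, 8, 9, or 10 heads
exact = sum(
    comb(SPINS, heads) * P_HEAD**heads * (1 - P_HEAD) ** (SPINS - heads)
    for heads in range(6, SPINS + 1)
)
print("simulated:", round(estimate, 3), "exact:", round(exact, 3))
```

The simulated value should approach the exact one as the number of trials grows.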
Generate weighted random numbers
Given a range of integers, we want to generate six random numbers based on the weight. We need to specify the probability/weight for each number to be selected. Let’s see how to generate random numbers with a given (numerical) distribution.
import random

# range(10, 40, 5) generates 6 numbers, so specify a weight for each of the 6.
# These values are relative weights; a cum_weights sequence would have to be non-decreasing.
randomList = random.choices(range(10, 40, 5), weights=(5, 15, 10, 25, 40, 65), k=6)
print("random numbers with a weighted probability")
print(randomList)
random numbers with a weighted probability
[35, 15, 30, 15, 35, 30]
Points to remember before implementing weighted random choices
- If you don’t specify the relative or cumulative weight, random.choices() will choose elements with equal probability.
- The specified weights sequence must be of the same length as the population sequence.
- Don’t specify relative weights and cumulative weights at the same time; doing so raises a TypeError (TypeError: Cannot specify both weights and cumulative weights).
- The weights or cum_weights can be integers, floats, or fractions, but not decimals.
- Weights must be non-negative.
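Several of the points above can be checked directly. The following sketch calls random.choices() with a few invalid and valid arguments and reports which exception, if any, is raised. The helper error_name is hypothetical and exists only to keep the example short.

```python
import random
from decimal import Decimal
from fractions import Fraction

population = ["a", "b", "c"]


def error_name(**kwargs):
    """Return the name of the exception random.choices() raises, or None."""
    try:
        random.choices(population, **kwargs)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


# Both weights and cum_weights given
print(error_name(weights=(1, 2, 3), cum_weights=(1, 3, 6)))
# Two weights for a population of three elements
print(error_name(weights=(1, 2)))
# Decimal weights do not mix with the floats produced by random()
print(error_name(weights=(Decimal("1"), Decimal("2"), Decimal("3"))))
# Integers, floats, and fractions are accepted together
print(error_name(weights=(1, 2.5, Fraction(1, 2))))
```

Specifying both kinds of weights or using decimals leads to a TypeError, a length mismatch leads to a ValueError, and the mixed int, float, and Fraction weights are accepted.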
Numpy’s random.choice() to choose elements from the list with different probability
If you are using a Python version older than 3.6, you can use the NumPy library to make weighted random choices.
pip install numpy
Using numpy.random.choice(), you can specify the probability distribution.
numpy.random.choice(a, size=None, replace=True, p=None)
- a is the population from which you want to choose elements, for example, a list.
- size is the number of elements you want to choose.
- p specifies the probability of each element being selected. The probabilities must sum to 1.
from numpy.random import choice

numberList = [100, 200, 300, 400]
# NumPy's random.choice() to choose elements with different probabilities
sampleNumbers = choice(numberList, 4, p=[0.10, 0.20, 0.30, 0.40])
print(sampleNumbers)
[400 400 300 200]
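Unlike random.choices(), numpy.random.choice() expects probabilities rather than relative weights. A minimal sketch of converting one into the other is shown below. The relative weights are made up for illustration, and replace=False is used only to show that NumPy can also draw a weighted sample without repeating elements.

```python
import numpy as np

numberList = [100, 200, 300, 400]
relative_weights = [10, 20, 30, 40]  # illustrative values, not from a real dataset

# Convert relative weights into probabilities that sum to 1
total = sum(relative_weights)
probabilities = [w / total for w in relative_weights]

np.random.seed(0)  # fixed seed only so the run is repeatable
# replace=False draws without replacement, so no element repeats
sample = np.random.choice(numberList, size=3, replace=False, p=probabilities)
print(probabilities)
print(sample)
```

Dividing each weight by the total is enough to satisfy the requirement on p.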