Do you want to choose one or more elements from a list **randomly, with a different probability for each element**? Do you want to write a weighted version of `random.choice()`? If so, the examples in this article will help you make weighted random choices in Python.

Let’s take the following example for a better understanding of the requirement.

```
import random
sampleList = [10, 20, 30, 40]
x = random.choice(sampleList)
print(x)
```

If you execute the above code, it will give you either 10, 20, 30, or 40 with equal probability. But what if you want to pick elements from the list with different probabilities? That is, choose a k-sized list of elements from any sequence in such a way that each element of the sequence has a different probability of being selected.

In other words, choose 4 elements from the list randomly with different probabilities. For example:

Choose 10 – 10% of the time

Choose 20 – 25% of the time

Choose 30 – 50% of the time

Choose 40 – 15% of the time
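As a quick preview, the distribution above could be expressed with `random.choices()`, which is covered in detail below. This is only a sketch: the variable names and the number of draws are arbitrary choices made for illustration, and the observed shares will differ slightly on every run.

```python
import random
from collections import Counter

sampleList = [10, 20, 30, 40]
percentages = [10, 25, 50, 15]  # desired share of picks for each element

# Draw many times so the observed shares settle near the targets.
draws = 100_000  # arbitrary sample size for this check
counts = Counter(random.choices(sampleList, weights=percentages, k=draws))

for item, target in zip(sampleList, percentages):
    observed = 100 * counts[item] / draws
    print(f"{item}: target {target}%, observed {observed:.1f}%")
```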

**There are 2 ways to make weighted random choices in Python**

- If you are using Python 3.6 or above, use `random.choices()`
- Otherwise, use `numpy.random.choice()`

We will see how to use both, one by one.

**random.choices()**

Python 3.6 introduced a new function, `choices()`, in the random module. By using `random.choices()` we can make a weighted random choice with replacement. You can also call it a weighted random sample with replacement. Let’s have a look at the syntax of this function.

`random.choices(population, weights=None, *, cum_weights=None, k=1)`

- `random.choices()` returns a `k`-sized list of elements chosen from the `population` with replacement.
- `weights` or `cum_weights` define the selection probability for each element.
- If a `weights` sequence is specified, random selections are made according to the relative weights.
- Alternatively, if a `cum_weights` sequence is given, the random selections are made according to the cumulative weights.
- Note: You cannot specify both `weights` and `cum_weights`.
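The points above can be checked with a small sketch. The population and the weights here are made up for illustration. Seeding the generator with the same value before each call lets us compare the two forms directly.

```python
import random

population = ["a", "b", "c"]

# Relative weights 1:2:7 correspond to cumulative weights 1, 3, 10.
random.seed(42)  # fixed seed so both calls consume the same random stream
by_weights = random.choices(population, weights=[1, 2, 7], k=5)

random.seed(42)
by_cum_weights = random.choices(population, cum_weights=[1, 3, 10], k=5)

# Both forms describe the same distribution, so the picks match.
print(by_weights == by_cum_weights)

# Passing both arguments at once is rejected.
try:
    random.choices(population, weights=[1, 2, 7], cum_weights=[1, 3, 10])
except TypeError as err:
    print("TypeError:", err)
```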

First, define the probability for each element. As mentioned above, we can define the weights sequence in the following two ways:

- **Relative weights**
- **Cumulative weights**

**Relative weights to choose elements from the list with different probability**

If you specify relative weights, the selections are made according to those relative weights. You can specify relative weights using the **weights** parameter.

**Example**: **Choose 5 elements from the list with different probability**

```
import random
numberList = [111, 222, 333, 444, 555]
print(random.choices(numberList, weights=(10, 20, 30, 40, 50), k=5))
```

Output:

[555, 222, 555, 222, 555]

- In this run we got *555* three times. It has the highest weight, so it has the highest probability of being selected, but the output is random and will differ from run to run.
- We specified `k=5` to choose 5 elements. You can specify any number you want; for example, to choose only 1 element, specify `k=1`.
- The weights do not sum to 100 because they are relative weights, not percentages.

The weighted probability of selecting each element is determined by the following rule.

**Probability = element_weight/ sum of all weights**

In the above example, the probability of each element being selected is determined as follows.

- The total weight is 10+20+30+40+50 = 150
- The list is [111, 222, 333, 444, 555]
- It returns 111 with probability 10/150 ≈ 0.07
- It returns 222 with probability 20/150 ≈ 0.13
- It returns 333 with probability 30/150 = 0.20
- It returns 444 with probability 40/150 ≈ 0.27
- It returns 555 with probability 50/150 ≈ 0.33
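The same arithmetic can be done in a few lines of Python. This is a minimal sketch of the rule above; the `probabilities` dictionary is just an illustrative name.

```python
numberList = [111, 222, 333, 444, 555]
weights = [10, 20, 30, 40, 50]

# Probability of each element = its own weight / sum of all weights.
total = sum(weights)
probabilities = {item: w / total for item, w in zip(numberList, weights)}

for item, p in probabilities.items():
    print(item, round(p, 2))
```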

**Cumulative weights to choose elements from the list with different probability**

Alternatively, if a cum_weights sequence is given, the selections are made according to the cumulative weights. You can specify cumulative weights using the **cum_weights** parameter.

**Note:** Python converts the relative weights to cumulative weights before making selections, so passing cumulative weights directly saves that extra work.

The cumulative weight of each element is determined by using the following formula.

**cum_weight** = cumulative weight of the previous element + own weight.

For example, the relative weights [5, 10, 15, 20] are equivalent to the cumulative weights [5, 15, 30, 50].
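If you already have relative weights, one way to derive the cumulative ones is `itertools.accumulate()` from the standard library. The sketch below converts the weights from the example above and then converts them back as a check; the variable names are illustrative.

```python
from itertools import accumulate

relative_weights = [5, 10, 15, 20]

# Each cumulative weight is the running total of the relative weights so far.
cumulative_weights = list(accumulate(relative_weights))
print(cumulative_weights)  # [5, 15, 30, 50]

# Going back: subtract each cumulative weight from the one that follows it.
recovered = [cumulative_weights[0]] + [
    b - a for a, b in zip(cumulative_weights, cumulative_weights[1:])
]
print(recovered)  # [5, 10, 15, 20]
```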

**Example: Choose 4 elements from the list with different probability**

```
import random
nameList = ["Kelly", "Scott", "Emma", "Jon"]
print(random.choices(nameList, cum_weights=(5, 15, 30, 50), k=4))
```

Output:

['Jon', 'Kelly', 'Jon', 'Scott']

### Choose a single element from a list with different probability

```
import random

nameList = ["Kelly", "Scott", "Emma", "Jon"]
for i in range(5):
    # k=1 still returns a list, so take its first (and only) element
    randomItem = random.choices(nameList, cum_weights=(5, 15, 30, 50), k=1)
    print("Iteration:", i, "Random choice is", randomItem[0])
```

Output:

Iteration: 0 Random choice is Jon

Iteration: 1 Random choice is Jon

Iteration: 2 Random choice is Scott

Iteration: 3 Random choice is Scott

Iteration: 4 Random choice is Emma

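**Note**: “Jon” has the highest probability of being selected (a relative weight of 20 out of 50), so it tends to appear most often. In this short run it appeared twice, as did “Scott”; the exact counts vary from run to run.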

### Probability of getting 6 or more heads from 10 spins

```
import random

# The two faces of the coin, head and tail, as a string
coin = "HT"
# Spin a biased coin (61% heads) 10 times, and repeat the experiment 3 times.
# Getting 6 or more heads per run is likely but not guaranteed.
for i in range(3):
    print(random.choices(coin, cum_weights=(0.61, 1.00), k=10))
```

Output:

['H', 'H', 'H', 'H', 'H', 'H', 'H', 'T', 'H', 'T']

['H', 'T', 'H', 'H', 'H', 'T', 'H', 'H', 'H', 'H']

['H', 'T', 'T', 'T', 'H', 'T', 'H', 'H', 'H', 'H']

**Note**: We used cumulative weights. So the probability of getting a head is 0.61, and the probability of getting a tail is 0.39 (1 – 0.61 = 0.39).
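To put a number on "6 or more heads from 10 spins", here is a rough sketch that computes the exact binomial probability and compares it with a simulation. It assumes Python 3.8 or later for `math.comb()`, and the number of trials is an arbitrary choice.

```python
import random
from math import comb  # available in Python 3.8 and later

p_head = 0.61  # probability of a head, as in the example above
spins = 10

# Exact probability of 6 or more heads: sum the binomial terms for 6..10 heads.
exact = sum(
    comb(spins, heads) * p_head**heads * (1 - p_head) ** (spins - heads)
    for heads in range(6, spins + 1)
)

# Simulated estimate: repeat the 10-spin experiment many times and count
# the runs that contain at least 6 heads.
trials = 50_000  # arbitrary number of repetitions
hits = sum(
    random.choices("HT", cum_weights=(p_head, 1.00), k=spins).count("H") >= 6
    for _ in range(trials)
)

print(round(exact, 3), round(hits / trials, 3))
```

The two numbers should be close to each other. Both are well below 1, which is why a single run of 10 spins can still produce fewer than 6 heads.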

## Generate weighted random numbers

Given a range of integers, we want to generate six random numbers based on the weight. We need to specify the probability/weight for each number to be selected. Let’s see how to generate random numbers with a given (numerical) distribution.

```
import random

# range(10, 40, 5) yields 6 numbers: 10, 15, 20, 25, 30, 35.
# Give one relative weight per number (cumulative weights would have to be non-decreasing).
randomList = random.choices(range(10, 40, 5), weights=(5, 15, 10, 25, 40, 65), k=6)
print("random numbers with a weighted probability")
print(randomList)
```

Output:

random numbers with a weighted probability

[35, 15, 30, 15, 35, 30]

## Points to remember before implementing weighted random choices

- If you don’t specify the relative or cumulative weight, random.choices() will choose elements with equal probability.
- The specified weights sequence must be of the same length as the population sequence.
- Don’t specify relative weights and cumulative weights at the same time, or you will get a `TypeError` (TypeError: Cannot specify both weights and cumulative weights).
- The weights or cum_weights can be integers, floats, or fractions, but not decimals.
- Weights must be non-negative.
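Several of these points can be observed directly. The sketch below uses a small hypothetical helper, `try_choices()`, that calls `random.choices()` and reports which exception, if any, was raised. The population and weights are made up for illustration.

```python
import random
from decimal import Decimal

population = [10, 20, 30]

def try_choices(**kwargs):
    """Call random.choices() and return the exception class name, or 'ok'."""
    try:
        random.choices(population, **kwargs)
        return "ok"
    except (TypeError, ValueError) as err:
        return type(err).__name__

# No weights at all: elements are chosen with equal probability.
no_weights = try_choices(k=2)
# Two weights for three elements: the lengths do not match.
wrong_length = try_choices(weights=[1, 2])
# Relative and cumulative weights given together.
both_given = try_choices(weights=[1, 2, 3], cum_weights=[1, 3, 6])
# Decimal weights do not interoperate with floats.
decimal_weights = try_choices(weights=[Decimal("1"), Decimal("2"), Decimal("3")])

print(no_weights, wrong_length, both_given, decimal_weights)
```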

## NumPy’s random.choice() to choose elements from the list with different probability

If you are using a Python version older than 3.6, you can use the NumPy library to make weighted random choices.

`pip install numpy`

Using `numpy.random.choice()`, you can specify the probability distribution.

`numpy.random.choice(a, size=None, replace=True, p=None)`

- `a` is the population from which you want to choose elements, for example, a list.
- `size` is the number of elements you want to choose.
- `p` specifies the probability of each element being selected.

**Note**: The probabilities must sum to 1, i.e., when we specify a probability for each element, the sum of all those numbers must be 1.

**Example**:

```
from numpy.random import choice
numberList = [100, 200, 300, 400]
# Numpy's random.choice() to choose elements with different probabilities
sampleNumbers = choice(numberList, 4, p=[0.10, 0.20, 0.30, 0.40])
print(sampleNumbers)
```

Output:

[400 400 300 200]

## Next Steps

- Solve the Python Random data generation Exercise to practice and master the random data generation techniques.
- Solve our Python Random data generation Quiz to test your random data generation concepts.