**Problem Formulation**

**Challenge:** Given a list of numbers, how do you randomly select an element according to a given probability distribution?

When you select a number from a list according to a probability distribution, each number is chosen with a probability proportional to its assigned weight. Let's visualize this with the help of an example.

**Example:**

Given:

```python
numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]
```

Expected output: choose elements randomly from the given list and display 5 elements in the output list, for example:

```
[30, 10, 20, 30, 30]
```

Note: The output can vary.

In this sample output, the number 30 appears three times, which is plausible because it has the highest weight/probability. The relative weights assigned are 0.3, 0.2 and 0.5, respectively. This means:

- Chances of selecting 10 are 30%.
- Chances of selecting 20 are 20%.
- Chances of selecting 30 are 50%.
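To see these percentages emerge in practice, one option is to draw a large sample and count how often each number appears. The following is a minimal sketch using the standard library's `random.choices()` and `collections.Counter`; the sample size of 100,000 and the seed value are arbitrary choices made here for illustration.

```python
import random
from collections import Counter

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

random.seed(0)          # arbitrary seed, only to make the run repeatable
sample_size = 100_000   # arbitrary; larger samples track the weights more closely

sample = random.choices(numbers, weights=distributions, k=sample_size)
counts = Counter(sample)

# Observed relative frequency of each number in the sample
frequencies = {n: counts[n] / sample_size for n in numbers}
for n, weight in zip(numbers, distributions):
    print(f"{n}: expected {weight:.2f}, observed {frequencies[n]:.3f}")
```

The observed frequencies should land close to 0.3, 0.2 and 0.5, although any small sample (such as the five elements above) can deviate noticeably.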

**Note:** We will first look at several ways of solving this problem and then work through a couple of exercises for further clarity.


**Method 1:** Using random.choices

- `choices()` is a method of the `random` module in Python that returns a list containing randomly selected items from the specified sequence. This sequence can be a list, tuple, string, or any other kind of sequence.
- The likelihood of each item being picked can be specified using the `weights` or the `cum_weights` parameter.

Syntax:

```python
random.choices(sequence, weights=None, cum_weights=None, k=1)
```

Parameter | Description |
---|---|
sequence | Mandatory. A sequence such as a range of numbers, a list, a tuple, etc. |
weights | Optional. A list of relative weights, one for each value. By default, it is `None`. |
cum_weights | Optional. A list of weights, one for each value, where the weights are accumulated. For example, the normal weights `[2, 3, 5]` are equivalent to the cum_weights `[2, 5, 10]`. By default, it is `None`. |
k | Optional. An integer that determines the length of the returned list. By default, it is 1. |
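The relationship between `weights` and `cum_weights` can be checked directly. The sketch below builds the cumulative weights with `itertools.accumulate` and draws twice from the same seeded random stream; the seed value is an arbitrary choice for illustration, and the identical results rely on how CPython implements `random.choices()`.

```python
import random
from itertools import accumulate

numbers = [10, 20, 30]
weights = [2, 3, 5]
cum_weights = list(accumulate(weights))  # running totals: [2, 5, 10]

random.seed(1)  # arbitrary seed
with_weights = random.choices(numbers, weights=weights, k=5)

random.seed(1)  # reset so both calls consume the same random stream
with_cum_weights = random.choices(numbers, cum_weights=cum_weights, k=5)

print(cum_weights)
# In CPython both calls are expected to return the same list
print(with_weights == with_cum_weights)
```

Passing `cum_weights` can save a little work when you sample repeatedly, because the running totals do not have to be recomputed on every call.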

**Approach:** Call the `random.choices()` function and pass in the given list and the weights/probability distribution as arguments.

**Code:**

```python
import random

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Draw 5 elements, each chosen according to the given weights
random_number = random.choices(numbers, distributions, k=5)
print(random_number)
```

**Output:**

`[10, 30, 30, 10, 20]`

**Caution:**

- If neither relative nor cumulative weights are specified, the `random.choices()` function selects elements with equal probability.
- The specified weights must always be of the same length as the specified sequence.
- If you specify relative weights as well as cumulative weights at the same time, you will get a TypeError (`TypeError: Cannot specify both weights and cumulative weights`). To avoid the error, specify only one of them.
- The `cum_weights` or `weights` can only be integers, floats, and fractions. They cannot be decimals. Also, you must ensure that the weights are non-negative.
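The first three cautions can be reproduced with a short experiment. This is only a sketch: the variable names `both_error` and `length_error` are introduced here for illustration, and the exact wording of the error messages may differ between Python versions.

```python
import random

numbers = [10, 20, 30]

# Passing both weights and cum_weights raises a TypeError
try:
    random.choices(numbers, weights=[2, 3, 5], cum_weights=[2, 5, 10])
except TypeError as err:
    both_error = err
    print("TypeError:", err)

# A weights list whose length differs from the sequence raises a ValueError
try:
    random.choices(numbers, weights=[0.5, 0.5])
except ValueError as err:
    length_error = err
    print("ValueError:", err)

# With no weights at all, every element is equally likely
uniform_sample = random.choices(numbers, k=5)
print(uniform_sample)
```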

**Method 2:** Using numpy.random.choice

Another way to sample a random number from a probability distribution is to use the `numpy.random.choice()` function.

`choice()` is a method of the `numpy.random` module that generates random values from a NumPy array. It accepts an array as a parameter and randomly returns one or more of the values from the array.

Syntax:

```python
numpy.random.choice(a, size=None, replace=True, p=None)
```

Parameter | Description |
---|---|
a | The array (or list) containing the values to sample from. |
size | An integer that determines the length of the returned array. By default, a single value is returned. |
replace | Whether the same value can be selected more than once. By default, it is `True`. |
p | The probability of each value of the given array. The probabilities must sum to 1. By default, all values are equally likely. |

**Approach:** Use the `numpy.random.choice(li, size, replace, weights)` function with `replace` set to `True` to return an array of the required `size` from the list `li`, sampled according to the corresponding list of probabilities `weights`.

**Code:**

```python
import numpy as np

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Arguments in order: values, size, replace, probabilities
random_number = np.random.choice(numbers, 5, True, distributions)
print(random_number)
```

**Output:**

`[30 20 30 10 30]`
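Unlike `random.choices()`, `numpy.random.choice()` expects probabilities that sum to 1 and raises a `ValueError` otherwise. If you start from unnormalized weights, one way to handle this is to divide by their total first. The sketch below assumes a hypothetical `raw_weights` list and uses keyword arguments to make each parameter explicit.

```python
import numpy as np

numbers = [10, 20, 30]
raw_weights = [3, 2, 5]  # hypothetical unnormalized weights

# Divide by the total so that the probabilities sum to 1
probabilities = np.array(raw_weights) / np.sum(raw_weights)

sample = np.random.choice(numbers, size=5, replace=True, p=probabilities)
print(probabilities)
print(sample)
```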


**Method 3: Using Scipy**

`Scipy` is another handy library for working with weighted random distributions.

`rv_discrete` is a base class that is used to construct specific distribution instances and classes for discrete random variables. It is also used to construct an arbitrary distribution defined by a list of support points and corresponding probabilities. [source: Official Documentation]

**Explanation:** In the following code snippet, `rv_discrete()` receives a `values` tuple whose first element is the sequence of integer values contained in the list `numbers` and whose second element is the list of probabilities/weights. Calling `rvs()` on the resulting distribution returns random values from the list based on their relative weights/probability distribution.

**Code:**

```python
from scipy.stats import rv_discrete

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Build a discrete distribution from (values, probabilities)
d = rv_discrete(values=(numbers, distributions))
print(d.rvs(size=5))
```

**Output:**

`[30 10 30 30 20]`
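Because `rv_discrete` returns a distribution object rather than just a sample, you can also query it. The following sketch inspects the probability of a single value and the expected value, and passes an arbitrary `random_state` seed to make the draw repeatable; the variable names `p_30` and `mean` are introduced here for illustration.

```python
from scipy.stats import rv_discrete

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]
d = rv_discrete(values=(numbers, distributions))

p_30 = d.pmf(30)  # probability assigned to the value 30
mean = d.mean()   # expected value: 10*0.3 + 20*0.2 + 30*0.5
sample = d.rvs(size=5, random_state=42)  # arbitrary seed for a repeatable draw

print(p_30, mean, sample)
```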

**Method 4: Using Lea**

Another effective Python library that helps us work with probability distributions is **Lea**. It is designed to let you model a broad range of random phenomena, like dice throwing, coin tossing, gambling results, weather forecasts, finance, etc.

**Note:** Since `lea` is an external library, you must install it before using it. Here's the command to install `lea` on your system: `pip install lea`

**Code:**

```python
import lea

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Pair each number with its probability
d = tuple(zip(numbers, distributions))
print(lea.pmf(d).random(5))
```

**Output:**

`(30, 30, 30, 10, 20)`

**Exercises**

**Question 1:** Our friend Harry has eight coloured crayons: `["red", "green", "blue", "yellow", "black", "white", "pink", "orange"]`. Harry's weighted preferences for selecting each color are `[1/24, 1/6, 1/6, 1/12, 1/12, 1/24, 1/8, 7/24]`. He is only allowed to select three colors at once. Find the combinations he might select in 10 attempts.

**Solution:**

```python
import random

colors = ["red", "green", "blue", "yellow", "black", "white", "pink", "orange"]
distributions = [1/24, 1/6, 1/6, 1/12, 1/12, 1/24, 1/8, 7/24]

# Ten attempts, three colors per attempt
for i in range(10):
    choices = random.choices(colors, distributions, k=3)
    print(choices)
```

**Output:**

```
['orange', 'pink', 'green']
['blue', 'yellow', 'yellow']
['orange', 'green', 'black']
['blue', 'red', 'blue']
['orange', 'orange', 'red']
['orange', 'green', 'blue']
['orange', 'black', 'blue']
['black', 'yellow', 'green']
['pink', 'orange', 'orange']
['blue', 'blue', 'white']
```

**Question 2:**

Given:

```python
cities = ["Frankfurt", "Stuttgart", "Freiburg", "München", "Zürich", "Hamburg"]
populations = [736000, 628000, 228000, 1450000, 409241, 1841179]
```

The probability of a particular city being chosen depends on its population: the larger the population of a city, the higher the probability of the city being chosen. Based on this condition, find the probability distribution of the cities and display the city that might be selected in each of 10 attempts.

**Solution:**

```python
import random

cities = ["Frankfurt", "Stuttgart", "Freiburg", "München", "Zürich", "Hamburg"]
populations = [736000, 628000, 228000, 1450000, 409241, 1841179]

# Each city's share of the total population, rounded to two decimals
distributions = [round(pop / sum(populations), 2) for pop in populations]
print(distributions)

for i in range(10):
    print(random.choices(cities, distributions)[0])
```

**Output:**

```
[0.14, 0.12, 0.04, 0.27, 0.08, 0.35]
Freiburg
Frankfurt
Zürich
Hamburg
Stuttgart
Frankfurt
München
Frankfurt
München
München
```
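Since `random.choices()` treats its weights as relative, the normalization step is needed only to display the distribution. As a variation, the sketch below passes the raw populations directly as weights and tallies a larger sample; the seed and the sample size of 10,000 are arbitrary choices for illustration.

```python
import random
from collections import Counter

cities = ["Frankfurt", "Stuttgart", "Freiburg", "München", "Zürich", "Hamburg"]
populations = [736000, 628000, 228000, 1450000, 409241, 1841179]

random.seed(7)  # arbitrary seed, only to make the run repeatable

# Raw populations act as relative weights; no rounding error is introduced
picks = random.choices(cities, weights=populations, k=10_000)
counts = Counter(picks)

# Cities ordered from most to least frequently chosen
for city, count in counts.most_common():
    print(city, count / len(picks))
```

Skipping the rounding also avoids the small distortion that two-decimal probabilities introduce for cities with similar populations.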

With that we come to the end of this tutorial. I hope it has helped you. Please **subscribe** and stay tuned for more interesting tutorials and solutions. Happy learning! 🙂

**Recommended Read: Python’s Random Module – Everything You Need to Know to Get Started**