# Sample a Random Number from a Probability Distribution in Python


## Problem Formulation

Challenge: Given a list of numbers, how will you select a number at random according to a given probability distribution?

When you select a number randomly from a list using a given probability distribution, each number is chosen according to its relative weight (probability). Let’s visualize this with an example.

Example:

```
Given:
numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

Expected Output: Choose the elements randomly from the given list and display 5 elements in the output list:
[30, 10, 20, 30, 30]

Note: The output can vary.
```

In this sample output, the number ’30’ appears three times, which is plausible since it has the highest weight/probability. The relative weights assigned are 0.3, 0.2 and 0.5, respectively. This means:

• Chances of selecting 10 are 30%.
• Chances of selecting 20 are 20%.
• Chances of selecting 30 are 50%.
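These percentages can be checked empirically. The following is a minimal sketch, not part of the original solution: it draws a large number of samples and compares the observed frequencies with the weights. The seed and the sample size of 100,000 are arbitrary assumptions chosen for illustration.

```python
import random
from collections import Counter

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

random.seed(0)  # assumption: fixed seed so the run is repeatable
draws = random.choices(numbers, weights=distributions, k=100_000)

# Count how often each number was drawn and convert to a relative frequency
counts = Counter(draws)
frequencies = {n: counts[n] / len(draws) for n in numbers}

for n, w in zip(numbers, distributions):
    print(f"{n}: expected {w:.2f}, observed {frequencies[n]:.3f}")
```

With this many draws, the observed frequencies should land close to 0.3, 0.2 and 0.5, although they will rarely match exactly.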

Note: We will first have a look at the numerous ways of solving the given question and then dive into a couple of exercises for further clarity. So without further delay, let’s dive into our mission-critical question and solve it.

## Method 1: Using random.choices

• `choices()` is a method of the `random` module in Python that returns a list containing randomly selected items from the specified sequence. This sequence can be a list, tuple, string, or any other kind of sequence.
• The likelihood of picking each item can be specified with either the `weights` parameter (relative weights) or the `cum_weights` parameter (cumulative weights).
`Syntax: random.choices(population, weights=None, *, cum_weights=None, k=1)`

Approach: Call the `random.choices()` function and feed in the given list and the weights/probability distributions as parameters.

Code:

```python
import random

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Draw 5 items (with replacement), weighted by `distributions`
random_number = random.choices(numbers, distributions, k=5)
print(random_number)
```

Output:

``[10, 30, 30, 10, 20]``

Caution:

• If the relative or cumulative weight is not specified, then the `random.choices()` function will automatically select elements with equal probability.
• The specified weights should always be of the same length as the specified sequence.
• If you specify relative weights as well as cumulative weight at the same time, you will get a TypeError (`TypeError: Cannot specify both weights and cumulative weights`). Hence, to avoid the error, do not specify both at the same time.
• The `cum_weights` or `weights` can only be integers, floats, and fractions. They cannot be decimals. Also, you must ensure that the weights are non-negative.
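The cautions above can be tried out directly. This is a small sketch under a few assumptions: the integer weights `[3, 2, 5]` stand in for the 0.3/0.2/0.5 distribution, and the seed value is arbitrary. It shows that `weights` and the matching `cum_weights` describe the same distribution, and that the two error cases raise the exceptions mentioned.

```python
import random

numbers = [10, 20, 30]

# Relative weights [3, 2, 5] correspond to cumulative weights [3, 5, 10].
# Re-seeding before each call makes both calls consume the same random values.
random.seed(1)
by_weights = random.choices(numbers, weights=[3, 2, 5], k=5)
random.seed(1)
by_cum_weights = random.choices(numbers, cum_weights=[3, 5, 10], k=5)
print(by_weights == by_cum_weights)

raised = []

# Passing both kinds of weights at once is rejected
try:
    random.choices(numbers, weights=[3, 2, 5], cum_weights=[3, 5, 10])
except TypeError as exc:
    raised.append("TypeError")
    print("TypeError:", exc)

# The number of weights must match the number of items
try:
    random.choices(numbers, weights=[3, 2])
except ValueError as exc:
    raised.append("ValueError")
    print("ValueError:", exc)
```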

## Method 2: Using numpy.random.choice

Another way to sample a random number from a probability distribution is to use the `numpy.random.choice()` function.

`choice()` is a function of the `numpy.random` module that generates a random sample from a one-dimensional array (or an array-like object such as a list). By default, it randomly returns a single value from the array.

```
Syntax:
numpy.random.choice(a, size=None, replace=True, p=None)
```

Approach: Call `numpy.random.choice(a, size, replace, p)` with `replace` set to `True` to draw `size` values from the list `a`, where `p` holds the corresponding probabilities. Unlike the weights in `random.choices()`, the values in `p` must sum to 1.

Code:

```python
import numpy as np

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Positional arguments: a=numbers, size=5, replace=True, p=distributions
random_number = np.random.choice(numbers, 5, True, distributions)
print(random_number)
```

Output:

``[30 20 30 10 30]``
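As an alternative sketch, the same draw can be written with NumPy's `Generator` interface and keyword arguments, which makes each parameter's role explicit. The seed of 42 and the `raw_weights` array are assumptions added for illustration; the second part shows one way to turn arbitrary non-negative weights into probabilities that sum to 1, as `p` requires.

```python
import numpy as np

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Assumption: a seeded generator, so repeated runs give the same sample
rng = np.random.default_rng(seed=42)
sample = rng.choice(numbers, size=5, replace=True, p=distributions)
print(sample)

# Hypothetical raw weights that do not sum to 1; normalize them before use
raw_weights = np.array([3, 2, 5], dtype=float)
p = raw_weights / raw_weights.sum()
sample_from_raw = rng.choice(numbers, size=5, p=p)
print(sample_from_raw)
```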


## Method 3: Using Scipy

`Scipy` is another handy library for working with weighted random distributions.

• `rv_discrete` is a base class that is used to construct specific distribution instances and classes for discrete random variables. It is also used to construct an arbitrary distribution defined by a list of support points and corresponding probabilities. [source: Official Documentation]

Explanation: In the following code snippet, `rv_discrete()` receives a tuple through its `values` argument: the sequence of integer values contained in the list `numbers` comes first, and the probability distributions/weights come second. The resulting object returns random values from the list based on their relative weights/probability distributions.

Code:

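```python
from scipy.stats import rv_discrete

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Build a discrete distribution from (support points, probabilities)
d = rv_discrete(values=(numbers, distributions))
print(d.rvs(size=5))  # draw 5 random variates
```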

Output:

``[30 10 30 30 20]``

## Method 4: Using Lea

Another effective Python library for working with probability distributions is Lea. It is designed to help you model a broad range of random phenomena, such as dice throwing, coin tossing, gambling results, weather forecasts, and finance.

Note: Since `lea` is an external library, you must install it before using it. Here’s the command to install `lea` in your system: `pip install lea`

Code:

```python
import lea

numbers = [10, 20, 30]
distributions = [0.3, 0.2, 0.5]

# Pair each value with its probability: ((10, 0.3), (20, 0.2), (30, 0.5))
d = tuple(zip(numbers, distributions))
print(lea.pmf(d).random(5))
```

Output:

``(30, 30, 30, 10, 20)``

## Exercises

Question 1: Our friend Harry has eight coloured crayons: [“red”, “green”, “blue”, “yellow”, “black”, “white”, “pink”, “orange”]. Harry has the weighted preference for selecting each color as: [1/24, 1/6, 1/6, 1/12, 1/12, 1/24, 1/8, 7/24]. He is only allowed to select three colors at once. Find the various combinations he can select in 10 attempts.

Solution:

```python
import random

colors = ["red", "green", "blue", "yellow", "black", "white", "pink", "orange"]
distributions = [1/24, 1/6, 1/6, 1/12, 1/12, 1/24, 1/8, 7/24]

# Each attempt draws 3 colors (with replacement, so repeats are possible)
for i in range(10):
    choices = random.choices(colors, distributions, k=3)
    print(choices)
```

Output:

```
['orange', 'pink', 'green']
['blue', 'yellow', 'yellow']
['orange', 'green', 'black']
['blue', 'red', 'blue']
['orange', 'orange', 'red']
['orange', 'green', 'blue']
['orange', 'black', 'blue']
['black', 'yellow', 'green']
['pink', 'orange', 'orange']
['blue', 'blue', 'white']
```
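Because `random.choices()` samples with replacement, the same color can show up more than once in a single attempt. If the three colors are meant to be distinct, one possible approach is to pick one item at a time and remove it from the pool before the next pick. The helper below is a hypothetical sketch of that idea (its name and the seed are assumptions), not a function provided by the `random` module.

```python
import random

colors = ["red", "green", "blue", "yellow", "black", "white", "pink", "orange"]
distributions = [1/24, 1/6, 1/6, 1/12, 1/12, 1/24, 1/8, 7/24]


def weighted_sample_without_replacement(items, weights, k):
    """Hypothetical helper: pick k distinct items, one weighted draw at a time."""
    if k > len(items):
        raise ValueError("k cannot exceed the number of items")
    pool = list(items)
    pool_weights = list(weights)
    picked = []
    for _ in range(k):
        # Choose an index according to the weights of the items still in the pool
        idx = random.choices(range(len(pool)), weights=pool_weights, k=1)[0]
        picked.append(pool.pop(idx))
        pool_weights.pop(idx)  # keep the weights aligned with the pool
    return picked


random.seed(3)  # assumption: fixed seed for a repeatable run
picked_colors = weighted_sample_without_replacement(colors, distributions, k=3)
print(picked_colors)
```

After each pick, the remaining weights no longer sum to 1, which is fine here because `random.choices()` treats them as relative weights.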

Question 2:

```
Given:
cities = ["Frankfurt", "Stuttgart", "Freiburg", "München", "Zürich", "Hamburg"]
populations = [736000, 628000, 228000, 1450000, 409241, 1841179]

The probability of a city being chosen depends on its population: the larger the population of a city, the higher the probability of the city being chosen. Based on this condition, find the probability distribution of the cities and display the city selected in each of 10 attempts.
```

Solution:

```python
import random

cities = ["Frankfurt", "Stuttgart", "Freiburg", "München", "Zürich", "Hamburg"]
populations = [736000, 628000, 228000, 1450000, 409241, 1841179]

# Each city's weight is its share of the total population
distributions = [round(pop / sum(populations), 2) for pop in populations]
print(distributions)
for i in range(10):
    # choices() returns a one-element list; [0] extracts the city name
    print(random.choices(cities, distributions)[0])
```

Output:

```
[0.14, 0.12, 0.04, 0.27, 0.08, 0.35]
Freiburg
Frankfurt
Zürich
Hamburg
Stuttgart
Frankfurt
München
Frankfurt
München
München
```
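Since `random.choices()` works with relative weights, the normalization step is optional when all you need is the sample. As a sketch of that variation (the seed is an arbitrary assumption), the raw populations can be passed as weights directly, which also avoids the small distortion introduced by rounding to two decimals.

```python
import random

cities = ["Frankfurt", "Stuttgart", "Freiburg", "München", "Zürich", "Hamburg"]
populations = [736000, 628000, 228000, 1450000, 409241, 1841179]

random.seed(7)  # assumption: fixed seed for a repeatable run

# Populations act as relative weights; no division by the total is needed
picks = random.choices(cities, weights=populations, k=10)
for city in picks:
    print(city)
```

Drawing all 10 cities in one call with `k=10` is equivalent to calling the function ten times in a loop, because each draw is independent.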

With that we come to the end of this tutorial. I hope it has helped you. Please subscribe and stay tuned for more interesting tutorials and solutions. Happy learning! 🙂

Recommended Read: Python’s Random Module – Everything You Need to Know to Get Started