In this tutorial, you’ll learn how to generate synthetic data that follows a power-law distribution, plot its cumulative distribution function (CDF), and fit a power-law curve to this CDF using Python. This process is useful for analyzing datasets that follow power-law distributions, which are common in natural and social phenomena.
Prerequisites
Ensure you have Python installed, along with the numpy, matplotlib, and scipy libraries. If not, you can install them using pip:
pip install numpy matplotlib scipy
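To confirm the installation before moving on, a small check along these lines may help. This is only a sketch: it uses the standard-library importlib.metadata module (available from Python 3.8), and the names required and status are illustrative, not part of the tutorial.

```python
from importlib import metadata

# Packages this tutorial relies on
required = ["numpy", "matplotlib", "scipy"]

# Look up the installed version of each package; None means it was not found
status = {}
for name in required:
    try:
        status[name] = metadata.version(name)
    except metadata.PackageNotFoundError:
        status[name] = None

for name, version in status.items():
    print(f"{name}: {version if version else 'not installed'}")
```

If any package is reported as not installed, rerun the pip command above.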
Step 1: Generate Power-law Distributed Data
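First, we’ll generate a dataset that follows a power-law distribution using numpy. Note that np.random.power draws values between 0 and 1 with probability density a·x^(a−1), so the CDF of the samples is x^a.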
import numpy as np
# Parameters
alpha = 3.0  # Exponent of the distribution
size = 1000  # Number of data points
# Generate power-law distributed data
data = np.random.power(a=alpha, size=size)
The data looks like this:
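The original output is not reproduced here. As a rough sketch, the sample could be inspected as follows. The seeded generator (np.random.default_rng(42)) is an assumption added for reproducibility; the tutorial itself uses the unseeded np.random.power, so your numbers will differ.

```python
import numpy as np

# Assumption: a fixed seed so the printed values are repeatable
rng = np.random.default_rng(42)

alpha = 3.0  # Exponent of the distribution
size = 1000  # Number of data points
data = rng.power(a=alpha, size=size)

# First few samples and the observed range (all values lie in [0, 1])
print(data[:5])
print("min:", data.min(), "max:", data.max())

# For the density a * x**(a - 1) on [0, 1], the mean is a / (a + 1)
print("sample mean:", data.mean())
print("theoretical mean:", alpha / (alpha + 1))
```

With alpha = 3.0, most samples cluster toward 1, and the sample mean should land near 0.75.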

Let’s make some sense out of it and plot it in 2D space:
Step 2: Plot the Cumulative Distribution Function (CDF)
Next, we’ll plot the CDF of the generated data on a log-log scale to visualize its power-law distribution.
import matplotlib.pyplot as plt
# Prepare data for the CDF plot
sorted_data = np.sort(data)
yvals = np.arange(1, len(sorted_data) + 1) / float(len(sorted_data))
# Plot the CDF
plt.plot(sorted_data, yvals, marker='.', linestyle='none', color='blue')
plt.xlabel('Value')
plt.ylabel('Cumulative Frequency')
plt.title('CDF of Power-law Distributed Data')
plt.xscale('log')
plt.yscale('log')
plt.grid(True, which="both", ls="--")
plt.show()
The plot:
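For samples drawn with np.random.power, the theoretical CDF is x^alpha, which appears on log-log axes as a straight line with slope alpha. The sketch below checks numerically how closely the empirical CDF follows that curve. The seed and the names empirical_cdf, theoretical_cdf, and max_gap are illustrative choices, not part of the tutorial.

```python
import numpy as np

# Assumption: a fixed seed for repeatability
rng = np.random.default_rng(0)
alpha = 3.0
data = rng.power(a=alpha, size=1000)

# Empirical CDF: the i-th smallest value has cumulative frequency i / n
sorted_data = np.sort(data)
empirical_cdf = np.arange(1, len(sorted_data) + 1) / len(sorted_data)

# Theoretical CDF of np.random.power samples: F(x) = x**alpha on [0, 1]
theoretical_cdf = sorted_data ** alpha

# Largest vertical distance between the two curves
max_gap = np.max(np.abs(empirical_cdf - theoretical_cdf))
print(f"Largest gap between empirical and theoretical CDF: {max_gap:.4f}")
```

A small gap (a few hundredths for 1000 points) suggests the sample behaves as expected. The gap generally shrinks as the sample size grows.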

Step 3: Fit a Power-law Curve to the CDF
To understand the underlying power-law distribution better, we fit a curve to the CDF using the curve_fit function from scipy.optimize.
from scipy.optimize import curve_fit
# Power-law fitting function
def power_law_fit(x, a, b):
    return a * np.power(x, b)
# Fit the power-law curve
params, covariance = curve_fit(power_law_fit, sorted_data, yvals)
# Generate fitted values
fitted_yvals = power_law_fit(sorted_data, *params)
Step 4: Plot the Fitted Curve with the CDF
Finally, we’ll overlay the fitted power-law curve on the original CDF plot to visually assess the fit.
# Plot the original CDF and the fitted power-law curve
plt.plot(sorted_data, yvals, marker='.', linestyle='none', color='blue', label='Original Data')
plt.plot(sorted_data, fitted_yvals, 'r-', label='Fitted Power-law Curve')
plt.xlabel('Value')
plt.ylabel('Cumulative Frequency')
plt.title('CDF with Fitted Power-law Curve')
plt.xscale('log')
plt.yscale('log')
plt.grid(True, which="both", ls="--")
plt.legend()
plt.show()
Voilà!

This visualization helps you assess how well the power-law model describes the distribution of the data.
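A visual check can be complemented with a few numbers. The sketch below repeats the fit on a seeded sample and reports the fitted parameters, their approximate standard errors from the covariance matrix, and an R² value. The seed and the names a_hat, b_hat, std_errors, and r_squared are assumptions added for illustration. The standard errors are only a rough indication, because neighboring points of an empirical CDF are not independent.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law_fit(x, a, b):
    """Power-law model a * x**b, as used in Step 3."""
    return a * np.power(x, b)

# Assumption: a fixed seed for repeatability
rng = np.random.default_rng(1)
alpha = 3.0
data = rng.power(a=alpha, size=1000)

# Empirical CDF of the sample
sorted_data = np.sort(data)
yvals = np.arange(1, len(sorted_data) + 1) / len(sorted_data)

# Fit the model and unpack the estimated parameters
params, covariance = curve_fit(power_law_fit, sorted_data, yvals)
a_hat, b_hat = params

# Approximate standard errors: square roots of the covariance diagonal
std_errors = np.sqrt(np.diag(covariance))

# Coefficient of determination between the fitted curve and the CDF
residuals = yvals - power_law_fit(sorted_data, *params)
r_squared = 1.0 - np.sum(residuals**2) / np.sum((yvals - yvals.mean()) ** 2)

print(f"a = {a_hat:.3f} +/- {std_errors[0]:.3f}")
print(f"b = {b_hat:.3f} +/- {std_errors[1]:.3f} (true alpha = {alpha})")
print(f"R^2 = {r_squared:.4f}")
```

Since the CDF of np.random.power samples is x^alpha, a good fit should give a close to 1 and b close to alpha.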