Statistics Archives - Be on the Right Side of Change

Your Degree Won’t Save You: 3 Painfully Simple Steps to Escape the Knowledge-Worker Collapse

October 23, 2025 by Chris

Humanoid robots are on the cusp of bringing real-world abundance. We’re past the sci-fi phase. Together, that’s capital, customers, and compute lining up — the usual ingredients before scale. Content already hit abundance — and it changed the value of each piece. You can see the flood in the metrics. In a world like this, … Read more

How to Fit a Curve to Power-law Distributed Data in Python

March 31, 2024 by Chris

In this tutorial, you’ll learn how to generate synthetic data that follows a power-law distribution, plot its cumulative distribution function (CDF), and fit a power-law curve to this CDF using Python. This process is useful for analyzing datasets that follow power-law distributions, which are common in natural and social phenomena. Prerequisites Ensure you have Python … Read more
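The workflow described in this excerpt (generate data, build the CDF, fit a curve) could be sketched roughly as follows. This is a minimal illustration, not the tutorial's own code: it assumes NumPy's power distribution on [0, 1], whose CDF is x^a, and uses scipy.optimize.curve_fit for the fit. The names true_a, power_cdf, and fitted_a are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

# Assumption: synthetic data from NumPy's power distribution on [0, 1].
rng = np.random.default_rng(seed=0)
true_a = 2.5
data = np.sort(rng.power(true_a, size=5000))

# Empirical CDF: fraction of samples less than or equal to each sorted value.
ecdf = np.arange(1, data.size + 1) / data.size


def power_cdf(x, a):
    """CDF of the power distribution on [0, 1] with exponent a."""
    return x ** a


# Least-squares fit of the model CDF to the empirical CDF.
popt, pcov = curve_fit(power_cdf, data, ecdf, p0=[1.0])
fitted_a = float(popt[0])
print(f"true a = {true_a}, fitted a = {fitted_a:.3f}")
```

Fitting the CDF rather than a histogram avoids having to choose bin widths, which is one reason this approach is often preferred for heavy-tailed or skewed data.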

How to Generate and Plot Random Samples from a Power-Law Distribution in Python?

March 30, 2024 by Chris

To generate random samples from a power-law distribution in Python, use the numpy library for numerical operations and matplotlib for visualization. Here’s a minimal code example to generate and visualize random samples from a power-law distribution: First, we import the necessary libraries: numpy for generating the power-law distributed samples and matplotlib.pyplot for plotting. The a … Read more
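A minimal sketch of the sampling step, assuming NumPy's Generator.power method and an illustrative exponent a = 3.0 (the value is hypothetical, not taken from the post). To keep the example free of plotting dependencies, it computes a normalized histogram with np.histogram; the resulting counts could then be drawn with matplotlib.

```python
import numpy as np

# Assumption: exponent and sample size are illustrative choices.
a = 3.0
rng = np.random.default_rng(seed=0)

# Samples lie in [0, 1] with density a * x**(a - 1).
samples = rng.power(a, size=10_000)

# Normalized histogram: bar heights approximate the density.
counts, bin_edges = np.histogram(samples, bins=20, range=(0.0, 1.0), density=True)

print("first bin density:", round(float(counts[0]), 4))
print("last bin density:", round(float(counts[-1]), 4))
```

For a > 1 the density rises toward 1, so the last bin should be far taller than the first.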

How I Created a Football Prediction App on Streamlit

February 5, 2023 by Jonathan Okah

This tutorial shows you how I created a model to predict football results using Poisson distribution. You’ll learn how I designed an interactive dashboard on Streamlit where our users can select a team and get to know the odds of a home win, draw, or away win. Here’s a live demo of using the app … Read more
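The idea of turning Poisson goal rates into match odds can be illustrated with a short sketch. This is not the author's model: it assumes that home and away goals are independent Poisson variables, and the function name match_outcome_probs and the rates 1.6 and 1.1 are hypothetical.

```python
import numpy as np
from scipy.stats import poisson


def match_outcome_probs(home_rate, away_rate, max_goals=10):
    """Return (home win, draw, away win) probabilities.

    Assumes independent Poisson goal counts, truncated at max_goals.
    """
    goals = np.arange(max_goals + 1)
    home_pmf = poisson.pmf(goals, home_rate)
    away_pmf = poisson.pmf(goals, away_rate)

    # score_matrix[i, j] = P(home scores i goals and away scores j goals)
    score_matrix = np.outer(home_pmf, away_pmf)

    home_win = np.tril(score_matrix, k=-1).sum()  # i > j, below the diagonal
    draw = np.trace(score_matrix)                 # i == j, on the diagonal
    away_win = np.triu(score_matrix, k=1).sum()   # i < j, above the diagonal
    return float(home_win), float(draw), float(away_win)


# Hypothetical expected goals per match for each side.
home_win, draw, away_win = match_outcome_probs(home_rate=1.6, away_rate=1.1)
print(f"home {home_win:.3f}, draw {draw:.3f}, away {away_win:.3f}")
```

Truncating at ten goals discards only a negligible tail for typical football scoring rates, so the three probabilities sum to almost exactly one.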

The Ultimate Guide to Bivariate Analysis with Python

December 3, 2022 by Rahul Basu

This article will review some of the critical techniques used in Exploratory Data Analysis, specifically for Bivariate Analysis. We will review some of the essential concepts, understand some of the math behind correlation coefficients and provide sufficient examples in Python for a well-rounded, comprehensive understanding. What is Bivariate Analysis? Exploratory Data Analysis, or EDA, is … Read more

Spearman Rank Correlation in Python

July 1, 2022 by Rebecca Nowack

A Pearson correlation requires normally distributed, metric data. If your data is not normally distributed, or you have ordinal variables (like grades, a Likert scale, or a variable ranked from “low” to “high”), you can still calculate a correlation with the Spearman rank correlation. This can be done … Read more
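As a minimal sketch of a rank correlation on ordinal data, the example below uses scipy.stats.spearmanr, which may or may not be the function the post goes on to recommend. The grades and satisfaction values are made up for illustration.

```python
from scipy.stats import spearmanr

# Hypothetical ordinal data: school grades (1 = best) and
# satisfaction on a 5-point Likert scale (5 = very satisfied).
grades = [1, 2, 2, 3, 4, 5, 5, 6]
satisfaction = [5, 5, 4, 4, 3, 2, 2, 1]

# spearmanr ranks both variables (ties get average ranks) and
# correlates the ranks.
rho, p_value = spearmanr(grades, satisfaction)
print(f"Spearman rho = {rho:.3f}, p = {p_value:.4f}")
```

Because only the ranks matter, the result is unchanged by any monotonic rescaling of either variable.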

Normal Distribution and Shapiro-Wilk Test in Python

June 4, 2022 by Rebecca Nowack

Normal distribution is a statistical prerequisite for parametric tests like Pearson’s correlation, t-tests, and regression. Testing for normal distribution can be done visually with sns.displot(x, kde=True). The quickest way to run the Shapiro-Wilk test for normality is pingouin’s pg.normality(x). 💡 Note: Several publications note that normal distribution is the least important prerequisite for parametric tests and … Read more
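For readers without pingouin installed, the same test is also available in SciPy. The sketch below swaps in scipy.stats.shapiro in place of pg.normality and runs it on two synthetic samples; the sample parameters and the 0.05 threshold are illustrative assumptions.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(seed=1)

# Hypothetical samples: one drawn from a normal distribution,
# one from a strongly right-skewed exponential distribution.
samples = {
    "normal": rng.normal(loc=50.0, scale=5.0, size=200),
    "skewed": rng.exponential(scale=5.0, size=200),
}

# Shapiro-Wilk: a small p-value is evidence against normality.
results = {name: shapiro(values) for name, values in samples.items()}

for name, result in results.items():
    verdict = "normality rejected" if result.pvalue < 0.05 else "no evidence against normality"
    print(f"{name}: W = {result.statistic:.3f}, p = {result.pvalue:.4g} ({verdict})")
```

A non-significant result does not prove normality; it only means the test found no evidence against it at the chosen sample size.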

Pearson Correlation in Python

June 4, 2022 by Rebecca Nowack

A good way to calculate Pearson’s r and the p-value (to report the significance of the correlation) in Python is scipy.stats.pearsonr(x, y). For a nice overview of the results, use pingouin’s pg.corr(x, y). What is Pearson’s “r” Measure? A statistical correlation with Pearson’s r measures the linear relationship between two numerical variables. The correlation coefficient r … Read more
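A minimal usage sketch of scipy.stats.pearsonr(x, y) on synthetic data. The linear relationship and noise level are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(seed=0)

# Hypothetical data: y depends linearly on x, plus some noise.
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(scale=0.5, size=100)

# pearsonr returns the correlation coefficient and a two-sided p-value.
r, p_value = pearsonr(x, y)
print(f"Pearson r = {r:.3f}, p = {p_value:.3g}")
```

The coefficient r ranges from -1 to 1; values near either end indicate a strong linear relationship, while values near 0 indicate a weak or absent one.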

How to Calculate z-scores in Python?

May 28, 2022 by Rebecca Nowack

Z-scores can be used to compare data measured on different scales and to normalize data for machine learning algorithms. 💡 Note: There are different methods to calculate the z-score. The quickest and easiest one is: scipy.stats.zscore(). What is the z-score? The z-score is used for normalization or standardization to make differently scaled … Read more
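A minimal sketch comparing scipy.stats.zscore() with the manual calculation, z = (x - mean) / standard deviation. The height values are made up. By default, zscore uses the population standard deviation (ddof=0), which matches NumPy's default for std().

```python
import numpy as np
from scipy.stats import zscore

# Hypothetical measurements in centimetres.
heights_cm = np.array([160.0, 165.0, 170.0, 175.0, 180.0])

# Library call (population standard deviation, ddof=0, by default).
z_scipy = zscore(heights_cm)

# Manual calculation for comparison.
z_manual = (heights_cm - heights_cm.mean()) / heights_cm.std()

print(np.round(z_scipy, 3))
print("methods agree:", np.allclose(z_scipy, z_manual))
```

After standardization the values have mean 0 and standard deviation 1, which is what makes differently scaled variables comparable.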