Data Science

How to Change the Figure Size for a Seaborn Plot?

Seaborn is a comprehensive data visualization library for plotting statistical graphs in Python. It provides attractive default styles and color schemes for statistical plots. Seaborn is built on top of the matplotlib library and integrates closely with data structures from pandas. How to change …

How to Select Multiple Columns in Pandas

The easiest way to select multiple columns in Pandas is to pass a list of column names into the standard square-bracket indexing scheme. For example, the expression df[['Col_1', 'Col_4', 'Col_7']] would access columns 'Col_1', 'Col_4', and 'Col_7'. This is the most flexible and concise way to select only a couple of columns. To learn about the best 3 ways …
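
A minimal sketch of the list-based selection described above. The DataFrame contents and the extra column 'Col_9' are made up for illustration; only the column names 'Col_1', 'Col_4', and 'Col_7' come from the excerpt.

```python
import pandas as pd

# Hypothetical example data; values are arbitrary.
df = pd.DataFrame({
    "Col_1": [1, 2, 3],
    "Col_4": [4, 5, 6],
    "Col_7": [7, 8, 9],
    "Col_9": [0, 0, 0],
})

# Passing a list of labels inside the square brackets returns a new
# DataFrame that contains only those columns, in the order listed.
subset = df[["Col_1", "Col_4", "Col_7"]]
print(subset)
```

Note the double brackets: the outer pair is the indexing operator and the inner pair is the Python list of column names.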

Python – Inverse of Normal Cumulative Distribution Function (CDF)

Problem Formulation: How to calculate the inverse of the normal cumulative distribution function (CDF) in Python? Method 1: scipy.stats.norm.ppf(). In Excel, NORMSINV is the inverse of the CDF of the standard normal distribution. In Python's SciPy library, the ppf() method of the scipy.stats.norm object is the percent point function, which is another name for the …
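
As a short illustration of Method 1, the sketch below calls scipy.stats.norm.ppf() on a probability and checks that norm.cdf() maps the result back. The probability 0.975 and the loc/scale values are arbitrary choices for this example.

```python
from scipy.stats import norm

p = 0.975

# ppf() is the inverse of cdf() for the standard normal distribution
# (mean 0, standard deviation 1) when no loc/scale is given.
z = norm.ppf(p)
print(z)

# Round trip: applying the CDF to z recovers the original probability.
p_back = norm.cdf(z)
print(p_back)

# For a non-standard normal, pass the mean as loc and the standard
# deviation as scale. The median (p = 0.5) equals the mean.
median = norm.ppf(0.5, loc=10, scale=2)
print(median)
```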

NumPy Broadcasting – A Simple Tutorial

Broadcasting describes how NumPy automatically brings two arrays with different shapes to a compatible shape during arithmetic operations. Generally, the smaller array is “repeated” multiple times until both arrays have the same shape. Broadcasting is memory-efficient as it doesn’t actually copy the smaller array multiple times. Here’s a minimal example: Let’s have a more gentle …
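
One way to see broadcasting in action is the sketch below. It is not the original article's example, which is cut off in the excerpt; the array values are arbitrary.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
b = np.array([10, 20, 30])     # shape (3,)

# NumPy treats b as if it were repeated for every row of a,
# without actually materializing the repeated copy.
c = a + b
print(c)
# [[11 22 33]
#  [14 25 36]]
```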

Logistic Regression in Python Scikit-Learn

Logistic regression is a popular algorithm for classification problems (despite its name indicating that it is a "regression" algorithm). It is one of the most important algorithms in the machine learning space. Linear Regression Background: Let's review linear regression. Given the training data, we compute a line that fits this training data so that …

How to Convert a Boolean Array to an Integer Array in Python?

Problem Formulation: Given a NumPy array consisting of Boolean values, how do you convert it to an integer array? Convert each True value to integer 1, and convert each False value to integer 0. Here's an example Boolean array: What you want is the following integer array: Let's examine some methods to accomplish this easily. Method …
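
The article's own methods are cut off in the excerpt, so the following is only one common approach, assumed here for illustration: NumPy's astype() method. The array values are made up.

```python
import numpy as np

# Hypothetical Boolean input array.
bool_arr = np.array([True, False, True, True])

# astype(int) returns a new array in which True becomes 1 and False becomes 0.
int_arr = bool_arr.astype(int)
print(int_arr)  # [1 0 1 1]
```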

Division in Python

The double-slash // operator performs integer (floor) division and the single-slash / operator performs float division. An example of integer division is 40//11 = 3. An example of float division is 40/11 = 3.6363636363636362. A crucial lesson you need to master as a programmer is "division in Python". What does it mean to divide in Python? …
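
The two operators can be compared directly in a few lines. The first two expressions come from the text above; the negative and float operands are extra cases added here to show that // rounds down rather than toward zero.

```python
# Integer (floor) division: the result is rounded down.
int_result = 40 // 11
print(int_result)        # 3

# Float division: always returns a float in Python 3.
float_result = 40 / 11
print(float_result)

# Extra cases (not from the text above):
negative_result = -40 // 11   # rounds toward negative infinity
print(negative_result)        # -4

mixed_result = 40.0 // 11     # a float operand gives a float result
print(mixed_result)           # 3.0
```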

[Tutorial] K-Means Clustering with SKLearn in One Line

If there is one clustering algorithm you need to know – whether you are a computer scientist, data scientist, or machine learning expert – it’s the K-Means algorithm. In this tutorial drawn from my book Python One-Liners, you’ll learn the general idea and when and how to use it in a single line of Python …

Smoothing Your Data with the Savitzky-Golay Filter and Python

This article deals with signal processing. More precisely, it shows how to smooth a data set that contains some fluctuations, in order to obtain a signal that is easier to understand and analyze. In order to smooth a data set, we need to use a filter, i.e. a mathematical procedure that allows …
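
A rough sketch of this kind of smoothing, assuming SciPy's scipy.signal.savgol_filter() is the filter in use. The test signal (a noisy sine wave), the window length of 21, and the polynomial order of 3 are arbitrary choices for illustration, not values from the article.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical data: a sine wave with added Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 200)
clean = np.sin(x)
noisy = clean + rng.normal(scale=0.2, size=x.size)

# Fit a cubic polynomial inside a sliding window of 21 samples and
# replace each point with the value of the fitted polynomial.
smoothed = savgol_filter(noisy, window_length=21, polyorder=3)

# Compare the mean squared error against the noise-free signal.
print(np.mean((noisy - clean) ** 2))
print(np.mean((smoothed - clean) ** 2))
```

A longer window smooths more aggressively, while a higher polynomial order follows the raw data more closely.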
