How to Use Slice Assignment in NumPy?

NumPy slice assignment allows you to use slicing on the left-hand side of an assignment operation to overwrite a specific subsequence of a NumPy array at once. The right side of the slice assignment operation provides the exact number of elements to replace the selected slice. For example, a[::2] = […] would overwrite every other … Read more
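As a minimal sketch of the idea in this excerpt (the array contents are made up for illustration), the following assigns to a strided slice and then to a contiguous slice. NumPy also accepts a single scalar on the right-hand side and broadcasts it across the slice.

```python
import numpy as np

# Hypothetical example data: the integers 0..9
a = np.arange(10)

# Overwrite every other element; the slice a[::2] selects 5 positions,
# so the right-hand side supplies exactly 5 values.
a[::2] = [10, 20, 30, 40, 50]
after_list = a.copy()

# A scalar is broadcast to every position in the slice (indices 1, 2, 3).
a[1:4] = 0
after_scalar = a.copy()

print(after_list)
print(after_scalar)
```

Slice assignment modifies the array in place, which is why the sketch copies the intermediate state before the second assignment.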

How to Rename Column Names in Pandas?

Problem Formulation Given a Pandas DataFrame with column labels and a list of new column names as strings, how do you replace the original column names with the new ones? Here’s an example using the following DataFrame: You want to rename the column names ['Col_A', 'Col_B', 'Col_C'] to ['a', 'b', 'c'] so that the resulting DataFrame … Read more
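Here is one way the renaming could look. The cell values are placeholders, since the excerpt does not show the original DataFrame. The sketch shows two common options: assigning a new list to `df.columns`, and calling `DataFrame.rename()` with an old-to-new mapping.

```python
import pandas as pd

# Placeholder data; only the column labels come from the excerpt.
df = pd.DataFrame({"Col_A": [1, 2], "Col_B": [3, 4], "Col_C": [5, 6]})
new_names = ["a", "b", "c"]

# Option 1: build an old -> new mapping and return a renamed copy.
mapping = dict(zip(df.columns, new_names))
renamed = df.rename(columns=mapping)

# Option 2: overwrite all labels in place (the list length must match).
df.columns = new_names

print(list(renamed.columns))
print(list(df.columns))
```

Option 1 leaves the original DataFrame untouched and also works when only some columns need new names. Option 2 is shorter but always replaces every label.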

NumPy Boolean Indexing

You can index specific values from a NumPy array using another NumPy array of Boolean values on one axis to specify the indices you want to access. For example, to access the second and third values of array a = np.array([4, 6, 8]), you can use the expression a[np.array([False, True, True])] using the Boolean array … Read more
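The example from this excerpt can be written out as a short sketch. The second part, which builds the mask from a comparison, is an added illustration and not taken from the excerpt.

```python
import numpy as np

a = np.array([4, 6, 8])

# Explicit Boolean mask: True marks the positions to keep.
mask = np.array([False, True, True])
selected = a[mask]

# Added illustration: a comparison produces the same kind of mask.
greater_than_five = a[a > 5]

print(selected)
print(greater_than_five)
```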

What’s the Best NumPy Book?

Fear of missing out in data science? Data science and machine learning are taking over. Data-driven decision making penetrates every single company nowadays. Data science is indeed the “sexiest job in the 21st century”! There is one Python library which is the basis of any data science related computation you can undertake as a Python … Read more

np.nonzero() – A Simple Guide with Video

This article first explains how the NumPy nonzero() function works. It then applies the function in a practical data science example: finding array elements with nonzero(). Syntax numpy.nonzero(a) The np.nonzero(arr) function returns the indices of the elements of an array or Python … Read more
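A small sketch of what `np.nonzero()` returns, using made-up arrays: the result is a tuple with one index array per dimension, and it can be used directly to index the input.

```python
import numpy as np

# 1D case: the result is a 1-tuple holding the indices of nonzero entries.
arr = np.array([0, 3, 0, 5, 7])
idx = np.nonzero(arr)
values = arr[idx]

# 2D case: one index array per axis (row indices, column indices).
m = np.array([[1, 0],
              [0, 2]])
rows, cols = np.nonzero(m)

print(idx[0], values)
print(rows, cols)
```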

How to Change the Figure Size for a Seaborn Plot?

Seaborn is a comprehensive data visualization library for plotting statistical graphs in Python. It provides attractive default styles and color schemes for statistical plots. Seaborn is built on top of the matplotlib library and is closely integrated with pandas data structures. How to change … Read more

How to Select Multiple Columns in Pandas

The easiest way to select multiple columns in Pandas is to pass a list into the standard square-bracket indexing scheme. For example, the expression df[['Col_1', 'Col_4', 'Col_7']] would access columns 'Col_1', 'Col_4', and 'Col_7'. This is the most flexible and concise way for only a couple of columns. To learn about the best 3 ways … Read more
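A minimal sketch of the list-based selection described here. The DataFrame contents and the extra column 'Col_9' are hypothetical and only serve to show that unselected columns are dropped.

```python
import pandas as pd

# Hypothetical data; 'Col_9' exists only to show it is left out.
df = pd.DataFrame({
    "Col_1": [1, 2],
    "Col_4": [3, 4],
    "Col_7": [5, 6],
    "Col_9": [7, 8],
})

# The outer brackets index the DataFrame; the inner brackets form the list.
subset = df[["Col_1", "Col_4", "Col_7"]]

print(subset)
```

Passing a list returns a DataFrame with the columns in the order given, even when the list contains only one name.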

Python – Inverse of Normal Cumulative Distribution Function (CDF)

Problem Formulation How to calculate the inverse of the normal cumulative distribution function (CDF) in Python? Method 1: scipy.stats.norm.ppf() In Excel, NORMSINV is the inverse of the CDF of the standard normal distribution. In Python’s SciPy library, the ppf() method of the scipy.stats.norm object is the percent point function, which is another name for the … Read more
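As a rough sketch of Method 1, assuming SciPy is installed: `norm.ppf()` maps a probability back to the corresponding quantile. The probability 0.975 and the parameters loc=100 and scale=15 are arbitrary illustration values.

```python
from scipy.stats import norm

# Standard normal: which z has 97.5% of the probability mass below it?
p = 0.975
z = norm.ppf(p)

# ppf is the inverse of cdf, so the round trip recovers p.
p_back = norm.cdf(z)

# Non-standard normal: loc is the mean, scale the standard deviation.
x = norm.ppf(p, loc=100, scale=15)

print(z, p_back, x)
```

For the standard normal, the z value for 0.975 is approximately 1.96, the familiar two-sided 95% critical value.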

NumPy Broadcasting – A Simple Tutorial

Broadcasting describes how NumPy automatically brings two arrays with different shapes to a compatible shape during arithmetic operations. Generally, the smaller array is “repeated” multiple times until both arrays have the same shape. Broadcasting is memory-efficient as it doesn’t actually copy the smaller array multiple times. Here’s a minimal example: Let’s have a more gentle … Read more
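The following sketch, with made-up arrays, shows broadcasting along each axis of a 2D array: a 1D array of shape (3,) is stretched across the rows, and a column of shape (2, 1) is stretched across the columns.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)

# Shape (3,) is treated as (1, 3) and repeated for each of the 2 rows.
b = np.array([10, 20, 30])
row_sum = a + b

# Shape (2, 1) is repeated for each of the 3 columns.
col = np.array([[100],
                [200]])
col_sum = a + col

print(row_sum)
print(col_sum)
```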

Logistic Regression in Python Scikit-Learn

Logistic regression is a popular algorithm for classification problems (despite its name indicating that it is a “regression” algorithm). It is one of the most important algorithms in the machine learning space. Linear Regression Background Let’s review linear regression. Given the training data, we compute a line that fits this training data so that … Read more

How to Concatenate Two NumPy Arrays?

Problem Formulation Given two NumPy arrays a and b. How to concatenate both? Method 1: np.concatenate() NumPy’s concatenate() function joins a sequence of arrays along an existing axis. The arrays to join are passed together as the first argument, as a sequence such as a tuple or list. If you use the axis argument, you can specify along which axis the arrays should be joined. For … Read more
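A short sketch of Method 1 with hypothetical 2D arrays. The arrays are passed together as one tuple, and the axis argument decides whether they are stacked as extra rows or extra columns.

```python
import numpy as np

# Hypothetical example arrays
a = np.array([[1, 2],
              [3, 4]])             # shape (2, 2)
b = np.array([[5, 6]])             # shape (1, 2)

# axis=0 (the default): append b as an additional row.
rows = np.concatenate((a, b), axis=0)

# axis=1: shapes must match on axis 0, so b is transposed to (2, 1) first.
cols = np.concatenate((a, b.T), axis=1)

print(rows)
print(cols)
```

All arrays must have the same shape except along the axis being joined, which is why the second call transposes `b`.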