Pandas pct_change(), quantile(), rank(), round(), prod(), product()

The Pandas DataFrame has several methods concerning Computations and Descriptive Stats. When applied to a DataFrame, these methods evaluate the elements and return the results.

Preparation

Before any data manipulation can occur, two (2) new libraries will require installation.

• The Pandas library provides the DataFrame data structure used throughout this article.
• The NumPy library supports multi-dimensional arrays and matrices in addition to a collection of mathematical functions.

To install these libraries, navigate to an IDE terminal and execute the commands below at the command prompt. For the terminal used in this example, the command prompt is a dollar sign (`$`). Your terminal prompt may be different.

`$ pip install pandas`

Hit the `<Enter>` key on the keyboard to start the installation process.

`$ pip install numpy`

Hit the `<Enter>` key on the keyboard to start the installation process.

If the installations were successful, a message displays in the terminal indicating the same.

Feel free to view the PyCharm installation guide for the required libraries.

Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

```
import pandas as pd
import numpy as np
```

DataFrame pct_change()

The `pct_change()` method calculates and returns the percentage change between the current and prior element(s) in a DataFrame. The return value is the same type as the caller.

The syntax for this method is as follows:

`DataFrame.pct_change(periods=1, fill_method='pad', limit=None, freq=None, **kwargs)`

This example calculates and returns the percentage change of three (3) fictitious stocks over three (3) months.

```
df = pd.DataFrame({'ASL': [18.93, 17.03, 14.87],
                   'DBL': [39.91, 41.46, 40.99],
                   'UXL': [44.01, 43.67, 41.98]},
                  index=['2021-10-01', '2021-11-01', '2021-12-01'])

result = df.pct_change(axis='rows', periods=1)
print(result)
```
• Line [1] creates a DataFrame from a dictionary of lists and saves it to `df`.
• Line [2] uses the `pct_change()` method with a selected axis and period to calculate the change. This output saves to the `result` variable.
• Line [3] outputs the result to the terminal.

Output

💡 Note: The first row contains `NaN` values as there is no previous row to compare against.
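
As a minimal sketch of what the method computes, the same percentage change can be reproduced by hand with `shift()`: each value is divided by the value one row earlier, and 1 is subtracted. The variable name `manual` is illustrative only and not part of the original example.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'ASL': [18.93, 17.03, 14.87],
                   'DBL': [39.91, 41.46, 40.99],
                   'UXL': [44.01, 43.67, 41.98]},
                  index=['2021-10-01', '2021-11-01', '2021-12-01'])

# Built-in calculation: change relative to the previous row.
result = df.pct_change(periods=1)

# Manual equivalent: current value divided by the prior value, minus 1.
manual = df / df.shift(1) - 1

# Both approaches agree; NaN values in the first row are treated as equal.
print(np.allclose(result, manual, equal_nan=True))
```

Setting `periods=2` would instead compare each row with the row two positions earlier, leaving the first two rows as `NaN`.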

DataFrame quantile()

The `quantile()` method returns the values from a DataFrame/Series at the specified quantile and axis.

The syntax for this method is as follows:

`DataFrame.quantile(q=0.5, axis=0, numeric_only=True, interpolation='linear')`

This example uses the same stock DataFrame as noted above to determine the quantile(s).

```
df = pd.DataFrame({'ASL': [18.93, 17.03, 14.87],
                   'DBL': [39.91, 41.46, 40.99],
                   'UXL': [44.01, 43.67, 41.98]})

result = df.quantile(0.15)
print(result)
```
• Line [1] creates a DataFrame from a dictionary of lists and saves it to `df`.
• Line [2] uses the `quantile()` method with the `q` (quantile) parameter set to 0.15 to calculate the 15th percentile of each column. This output saves to the `result` variable.
• Line [3] outputs the result to the terminal.

Output
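
The default `interpolation='linear'` setting can be sketched by hand for a single column. With three sorted values and `q=0.15`, the target position is `0.15 * (3 - 1) = 0.3`, which falls between the first and second values. The names `values`, `position`, `fraction`, and `manual` below are illustrative assumptions, not part of the Pandas API.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'ASL': [18.93, 17.03, 14.87],
                   'DBL': [39.91, 41.46, 40.99],
                   'UXL': [44.01, 43.67, 41.98]})

q = 0.15
values = np.sort(df['ASL'].to_numpy())    # [14.87, 17.03, 18.93]

# Fractional position of the quantile within the sorted values.
position = q * (len(values) - 1)           # 0.3
lower = int(np.floor(position))            # index 0
fraction = position - lower

# Linear interpolation between the two neighbouring values.
manual = values[lower] + fraction * (values[lower + 1] - values[lower])
print(np.isclose(manual, df['ASL'].quantile(q)))

# Other interpolation choices snap to an actual data point instead.
print(df['ASL'].quantile(q, interpolation='lower'))    # 14.87
print(df['ASL'].quantile(q, interpolation='higher'))   # 17.03
```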

DataFrame rank()

The `rank()` method returns a DataFrame/Series with the values ranked in order. The return value is the same type as the caller.

The syntax for this method is as follows:

`DataFrame.rank(axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)`

For this example, a CSV file is read in, ranked on Population, and sorted. To follow along, place the `countries.csv` file in the current working directory.

```
df = pd.read_csv("countries.csv")
df["Rank"] = df["Population"].rank()
df.sort_values("Population", inplace=True)
print(df)
```
• Line [1] reads in the `countries.csv` file and saves it to `df`.
• Line [2] appends a `Rank` column to the end of the DataFrame (`df`) containing the rank of each Population value.
• Line [3] sorts the DataFrame by Population in ascending order.
• Line [4] outputs the result to the terminal.

Output
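
The `method` parameter only matters when values are tied. Since the contents of `countries.csv` are not reproduced here, the sketch below uses a small, made-up Series (the values and labels are hypothetical) to compare a few tie-breaking options alongside `pct=True`.

```python
import pandas as pd

# Hypothetical values with a tie (two entries of 20).
s = pd.Series([10, 20, 20, 30], index=['a', 'b', 'c', 'd'])

ranks = pd.DataFrame({
    'value':   s,
    'average': s.rank(method='average'),  # ties share the mean of their positions
    'min':     s.rank(method='min'),      # ties take the lowest position
    'dense':   s.rank(method='dense'),    # like 'min', but no gap after a tie
    'pct':     s.rank(pct=True),          # average rank divided by the number of values
})
print(ranks)
```

With `method='average'` (the default), the two tied entries each receive rank 2.5, whereas `'dense'` gives both rank 2 and lets the next value take rank 3.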

DataFrame round()

The `round()` method rounds the DataFrame output to a specified number of decimal places.

The syntax for this method is as follows:

`DataFrame.round(decimals=0, *args, **kwargs)`

For this example, the Bank of Canada’s mortgage rates over three (3) months are rounded to three (3) decimal places and displayed.

Code Example 1

```
df = pd.DataFrame([(2.3455, 1.7487, 2.198)],
                  columns=['Month 1', 'Month 2', 'Month 3'])
result = df.round(3)
print(result)
```
• Line [1] creates a DataFrame complete with column names and saves it to `df`.
• Line [2] rounds the mortgage rates to three (3) decimal places. This output saves to the `result` variable.
• Line [3] outputs the result to the terminal.

Output

Another way to perform the same task is with a Lambda!

Code Example 2

```
df = pd.DataFrame([(2.3455, 1.7487, 2.198)],
                  columns=['Month 1', 'Month 2', 'Month 3'])
result = df.apply(lambda x: round(x, 3))
print(result)
```
• Line [1] creates a DataFrame complete with column names and saves it to `df`.
• Line [2] rounds the mortgage rates to three (3) decimal places using a Lambda. This output saves to the `result` variable.
• Line [3] outputs the result to the terminal.

💡 Note: The output is identical to that of Code Example 1.
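
The `decimals` parameter also accepts a dictionary, which allows a different precision per column. The following is a small sketch using the same mortgage-rate DataFrame; the chosen precisions are arbitrary and only for illustration.

```python
import pandas as pd

df = pd.DataFrame([(2.3455, 1.7487, 2.198)],
                  columns=['Month 1', 'Month 2', 'Month 3'])

# Round each listed column to its own precision. 'Month 3' is not listed
# in the dictionary and is therefore left unchanged.
result = df.round({'Month 1': 1, 'Month 2': 2})
print(result)
```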

DataFrame prod() and product()

The `prod()` and `product()` methods are identical. Both return the product of the values of a requested axis.

The syntax for these methods is as follows:

`DataFrame.prod(axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)`
`DataFrame.product(axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)`

For this example, a DataFrame of arbitrary numbers is created, and the product along the selected axis is returned.

```
df = pd.DataFrame({'A': [2, 4, 6],
                   'B': [7, 3, 5],
                   'C': [6, 3, 1]})

index_ = ['A', 'B', 'C']
df.index = index_

result = df.prod(axis=0)
print(result)
```
• Line [1] creates a DataFrame of arbitrary numbers and saves it to `df`.
• Lines [2-3] create and set the DataFrame index.
• Line [4] calculates the product along axis 0 (down each column). This output saves to the `result` variable.
• Line [5] outputs the result to the terminal.

Output

Formula example for column A: 2*4*6=48
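
To illustrate the `axis`, `skipna`, and `min_count` parameters, the sketch below computes the product in both directions and then on a Series with a missing value. The Series `s` and the variable names are assumptions added for illustration.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [2, 4, 6],
                   'B': [7, 3, 5],
                   'C': [6, 3, 1]},
                  index=['A', 'B', 'C'])

# Product down each column (axis=0) and across each row (axis=1).
by_column = df.prod(axis=0)   # A: 2*4*6, B: 7*3*5, C: 6*3*1
by_row = df.prod(axis=1)      # A: 2*7*6, B: 4*3*3, C: 6*5*1
print(by_column.tolist(), by_row.tolist())

# NaN is skipped by default; min_count sets how many valid values
# are required before a product is returned.
s = pd.Series([2.0, np.nan, 6.0])
print(s.prod())               # 12.0
print(s.prod(min_count=3))    # nan
```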

Further Learning Resources

This is Part 5 of the DataFrame method series.

• Part 1 focuses on the DataFrame methods `abs()`, `all()`, `any()`, `clip()`, `corr()`, and `corrwith()`.
• Part 2 focuses on the DataFrame methods `count()`, `cov()`, `cummax()`, `cummin()`, `cumprod()`, `cumsum()`.
• Part 3 focuses on the DataFrame methods `describe()`, `diff()`, `eval()`, `kurtosis()`.
• Part 4 focuses on the DataFrame methods `mad()`, `min()`, `max()`, `mean()`, `median()`, and `mode()`.
• Part 5 focuses on the DataFrame methods `pct_change()`, `quantile()`, `rank()`, `round()`, `prod()`, and `product()`.
• Part 6 focuses on the DataFrame methods `add_prefix()`, `add_suffix()`, and `align()`.
• Part 7 focuses on the DataFrame methods `at_time()`, `between_time()`, `drop()`, `drop_duplicates()` and `duplicated()`.
• Part 8 focuses on the DataFrame methods `equals()`, `filter()`, `first()`, `last()`, `head()`, and `tail()`
• Part 9 focuses on the DataFrame methods `equals()`, `filter()`, `first()`, `last()`, `head()`, and `tail()`
• Part 10 focuses on the DataFrame methods `reset_index()`, `sample()`, `set_axis()`, `set_index()`, `take()`, and `truncate()`
• Part 11 focuses on the DataFrame methods `backfill()`, `bfill()`, `fillna()`, `dropna()`, and `interpolate()`
• Part 12 focuses on the DataFrame methods `isna()`, `isnull()`, `notna()`, `notnull()`, `pad()` and `replace()`
• Part 13 focuses on the DataFrame methods `droplevel()`, `pivot()`, `pivot_table()`, `reorder_levels()`, `sort_values()` and `sort_index()`
• Part 14 focuses on the DataFrame methods `nlargest()`, `nsmallest()`, `swaplevel()`, `stack()`, `unstack()` and `swapaxes()`
• Part 15 focuses on the DataFrame methods `melt()`, `explode()`, `squeeze()`, `to_xarray()`, `T` and `transpose()`
• Part 16 focuses on the DataFrame methods `append()`, `assign()`, `compare()`, `join()`, `merge()` and `update()`
• Part 17 focuses on the DataFrame methods `asfreq()`, `asof()`, `shift()`, `slice_shift()`, `tshift()`, `first_valid_index()`, and `last_valid_index()`
• Part 18 focuses on the DataFrame methods `resample()`, `to_period()`, `to_timestamp()`, `tz_localize()`, and `tz_convert()`
• Part 19 focuses on the visualization aspect of DataFrames and Series via plotting, such as `plot()`, and `plot.area()`.
• Part 20 focuses on continuing the visualization aspect of DataFrames and Series via plotting such as hexbin, hist, pie, and scatter plots.
• Part 21 focuses on the serialization and conversion methods `from_dict()`, `to_dict()`, `from_records()`, `to_records()`, `to_json()`, and `to_pickle()`.
• Part 22 focuses on the serialization and conversion methods `to_clipboard()`, `to_html()`, `to_sql()`, `to_csv()`, and `to_excel()`.
• Part 23 focuses on the serialization and conversion methods `to_markdown()`, `to_stata()`, `to_hdf()`, `to_latex()`, `to_xml()`.
• Part 24 focuses on the serialization and conversion methods `to_parquet()`, `to_feather()`, `to_string()`, `Styler`.
• Part 25 focuses on the serialization and conversion methods `to_gbq()` and `to_coo()`.

Also, have a look at the Pandas DataFrame methods cheat sheet!