Pandas DataFrame plot.box() Method


Preparation

Before any data manipulation can occur, three (3) libraries must be installed.

  • The Pandas library enables access to/from a DataFrame.
  • The Matplotlib library displays a visual graph of a plotted dataset.
  • The SciPy library provides scientific-computing routines, such as statistics, for working with the data.

To install these libraries, navigate to an IDE terminal and execute the commands below at the command prompt. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

$ pip install pandas
$ pip install matplotlib
$ pip install scipy

Hit the <Enter> key on the keyboard after each command to start the installation process.

If an installation is successful, a message displays in the terminal indicating the same.


Feel free to view the PyCharm installation guide for the required libraries.


Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

import pandas as pd
import matplotlib.pyplot as plt
import scipy
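
As a quick, optional check that the three installations worked, the sketch below asks Python whether each library can be located. The helper name check_libraries is made up for this illustration and is not part of any of the libraries.

```python
from importlib.util import find_spec

def check_libraries(names):
    """Return a dict mapping each library name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}

# The three libraries this article asks you to install.
status = check_libraries(['pandas', 'matplotlib', 'scipy'])
for name, found in status.items():
    print(name, 'is installed' if found else 'is MISSING')
```

If a library is reported as missing, re-run its pip install command in the same environment your IDE uses.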

DataFrame Plot Box

The DataFrame.plot.box() method creates a Box-and-Whisker plot from the DataFrame column(s). In short, this type of plot summarizes the minimum, first quartile, median, third quartile, and maximum values of a dataset.
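
To make these five values concrete, here is a minimal sketch that computes them with the Pandas quantile() method. The sample values are made up for illustration. Keep in mind that Matplotlib, by default, ends the whiskers at the last data points within 1.5 times the interquartile range of the box, so the whiskers reach the minimum and maximum only when there are no outliers.

```python
import pandas as pd

# Hypothetical sample values, chosen so the result is easy to verify by hand.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Quantiles 0, 0.25, 0.5, 0.75 and 1 are the minimum, first quartile,
# median, third quartile and maximum.
summary = values.quantile([0, 0.25, 0.5, 0.75, 1])
summary.index = ['minimum', 'first quartile', 'median', 'third quartile', 'maximum']
print(summary)
```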


The syntax for this method is as follows:

DataFrame.plot.box(by=None, **kwargs)

Parameter   Description
by          A string or sequence that signifies the column(s) used to group the DataFrame.
**kwargs    The keyword arguments for the method.
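
As a hedged illustration of this syntax, the sketch below calls plot.box() with the by parameter. The column names and values are invented for this example, and it assumes Pandas 1.4 or later, where by groups the data. The column keyword, which selects the column to plot, is passed through **kwargs.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: two stores with four daily sales figures each.
df = pd.DataFrame({
    'Store': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Sales': [10, 12, 15, 11, 20, 22, 19, 25],
})

# by groups the rows by store; column (passed through **kwargs)
# picks the numeric column to draw. One box is drawn per store.
result = df.plot.box(column='Sales', by='Store')

# plt.show()  # uncomment to display the chart in a window
```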

For this example, Rivers Clothing requires a Box plot to document how its stock is performing on the Stock Exchange. The stock prices are recorded twice a day on three (3) days in January (1st, 15th, and 30th).

stock_dates  = ['Jan-01', 'Jan-01', 'Jan-15', 'Jan-15', 'Jan-30', 'Jan-30']
stock_prices = [3.34, 1.99, 2.25, 4.57, 5.74, 3.65]
ax = plt.gca()

df = pd.DataFrame({'Stock Date':  stock_dates, 'Stock Price': stock_prices})
boxplot = df.boxplot(column=['Stock Price'], by='Stock Date', grid=True, rot=30, fontsize=10, ax=ax)
plt.show()
  • Line [1] creates a list of dates and saves them to stock_dates.
  • Line [2] creates a list of stock prices and saves it to stock_prices.
  • Line [3] gets the current axes (gca()) and saves it to ax.
  • Line [4] creates a DataFrame from the variables saved above.
  • Line [5] does the following:
    • Creates the Box chart of the Stock Price column, grouped by Stock Date.
    • Displays the grid lines on the chart.
    • Rotates the date labels at the chart bottom by 30 degrees.
    • Sets the font size to 10.
    • Draws the chart on the ax created above.
  • Line [6] outputs the Box chart on-screen.

The buttons at the bottom left of the chart window can be used to further manipulate the chart (for example, to zoom, pan, or save it).
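
To see the numbers behind each box, one option is to group the same data with groupby() and summarize it with describe(). This is a sketch for cross-checking the chart and is not part of the example above.

```python
import pandas as pd

stock_dates  = ['Jan-01', 'Jan-01', 'Jan-15', 'Jan-15', 'Jan-30', 'Jan-30']
stock_prices = [3.34, 1.99, 2.25, 4.57, 5.74, 3.65]
df = pd.DataFrame({'Stock Date': stock_dates, 'Stock Price': stock_prices})

# One row per date; the min, 25%, 50% (median), 75% and max columns
# are the values that each box in the chart summarizes.
summary = df.groupby('Stock Date')['Stock Price'].describe()
print(summary[['min', '25%', '50%', '75%', 'max']])
```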

💡 Note: Another way to create this chart is to call the plot() method with the kind parameter set to 'box'.
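
A minimal sketch of that alternative, reusing the stock prices from the example above. Without grouping, plot(kind='box') draws a single box for the Stock Price column.

```python
import pandas as pd
import matplotlib.pyplot as plt

stock_prices = [3.34, 1.99, 2.25, 4.57, 5.74, 3.65]
df = pd.DataFrame({'Stock Price': stock_prices})

# kind='box' makes plot() draw a box plot, equivalent to df.plot.box().
ax = df.plot(kind='box', grid=True, fontsize=10)
ax.set_title('Stock Price')

# plt.show()  # uncomment to display the chart in a window
```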
