5 Best Ways to Create a Vertical Histogram in Python and Matplotlib

💡 Problem Formulation: In data visualization, a histogram is a graphical representation of the distribution of numerical data. This article shows how to create a vertical histogram using Python and Matplotlib: given a sequence of numbers, we want a chart whose vertical bars represent the frequency distribution of those numbers.

Method 1: Using the bar Function

Matplotlib’s bar function can draw a vertical histogram once you have counted how many elements fall into each interval (bin); you then plot those frequencies against the bin positions. This allows a high degree of customization but requires computing the histogram data yourself.

Here’s an example:

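import matplotlib.pyplot as plt
import numpy as np

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
bins = range(1, 6)  # bin edges 1, 2, 3, 4, 5, which define four bins
# Count how many values fall into each bin
hist, bin_edges = np.histogram(data, bins=bins)

# Draw one bar per bin, centered on the bin's left edge
plt.bar(bin_edges[:-1], hist, width=0.5)
plt.show()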

Output: A vertical histogram with four bars of heights 1, 2, 3, and 4 at x = 1, 2, 3, and 4, matching the frequency of each number.

The code above uses np.histogram to split the data into bins and count the values in each one. It then draws the histogram with plt.bar, positioning the bars at the left bin edges and setting their heights to the bin counts.
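Because the bars above are centered on the left bin edges with a fixed width, they do not cover the bins exactly. The following is a minimal sketch of one way to make each bar span its bin, using the align and width arguments of bar. The axis labels and the Agg backend line are assumptions added for illustration and are not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
counts, bin_edges = np.histogram(data, bins=np.arange(1, 6))

fig, ax = plt.subplots()
# align="edge" anchors each bar at its left bin edge;
# np.diff(bin_edges) makes every bar exactly as wide as its bin
bars = ax.bar(bin_edges[:-1], counts, width=np.diff(bin_edges),
              align="edge", edgecolor="black")
ax.set_xlabel("Value")       # illustrative label
ax.set_ylabel("Frequency")   # illustrative label
# To view the result, call fig.savefig(...) or plt.show() with an interactive backend
```

With this layout the bars touch each other, which is how plt.hist draws them by default.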

Method 2: Utilizing the hist Function

The hist function in Matplotlib simplifies histogram creation by computing the bin frequencies and plotting them in a single call. It’s the most straightforward method for creating histograms and still offers various customization options.

Here’s an example:

import matplotlib.pyplot as plt

data = [2, 3, 3, 5, 7, 7, 7, 9, 10]
plt.hist(data, bins=5, orientation='vertical')
plt.show()

Output: A neatly organized vertical histogram partitioned into 5 bins, showing the distribution of the numerical data.

This snippet uses plt.hist to calculate and plot the histogram automatically. The orientation parameter is set to ‘vertical’ so that the bars are drawn vertically; this is already the default, so the argument is optional and is stated here only for clarity.
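To see what hist computes internally, it can help to look at its return values. The sketch below captures the counts, bin edges, and bar patches that hist returns and compares them with np.histogram called with the same arguments. The variable names and the Agg backend line are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

data = [2, 3, 3, 5, 7, 7, 7, 9, 10]

fig, ax = plt.subplots()
# hist returns the per-bin counts, the bin edges, and the drawn bar patches
counts, edges, patches = ax.hist(data, bins=5, edgecolor="black")

# np.histogram with the same arguments yields the same counts and edges
np_counts, np_edges = np.histogram(data, bins=5)

print(counts)
print(edges)
```

For this data the five equal-width bins between 2 and 10 hold 3, 1, 0, 3, and 2 values respectively.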

Method 3: Stylized Histogram with Seaborn

Seaborn, a statistical data visualization library built on top of Matplotlib, offers histograms with more polished default styling through its histplot function. Older tutorials use distplot for this, but that function is deprecated, and its vertical=True option places the observed values on the y-axis, which produces horizontal bars rather than vertical ones.

Here’s an example:

import matplotlib.pyplot as plt
import seaborn as sns

data = [1, 1, 2, 3, 5, 8, 13, 21]
# Passing the data as x puts the values on the x-axis, so the bars are vertical
sns.histplot(x=data, bins=4)
plt.show()

Output: A refined vertical histogram with better default styling.

Seaborn’s histplot computes the histogram data automatically and draws vertical bars when the values are passed as x. No Kernel Density Estimate curve is drawn unless you request one with kde=True, so only the histogram is shown.

Method 4: Customizing Histograms with Pandas

Pandas, which is commonly used for data manipulation, can also plot vertical histograms directly from DataFrames or Series using its plot method with kind set to ‘hist’. This integration with Matplotlib provides a convenient way to plot graphs directly from data structures.

Here’s an example:

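import matplotlib.pyplot as plt
import pandas as pd

data = pd.Series([1, 2, 2, 3, 3, 4, 5])
# Extra keyword arguments such as rwidth are passed through to Matplotlib
data.plot(kind='hist', orientation='vertical', rwidth=0.8)
plt.show()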

Output: A vertical histogram that’s directly sourced from a Pandas Series object.

The Series object data is used to call the plot method, specifying the type of plot as a histogram and setting the orientation. The rwidth parameter sets the relative bar width with respect to bin size.
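The plot method returns the Matplotlib Axes it drew on, so the result can be customized further with the usual Matplotlib calls. The sketch below assumes a bin count of 5 (pandas uses 10 bins by default for histograms) and adds an illustrative title and axis label; none of these choices come from the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to use an interactive window
import matplotlib.pyplot as plt
import pandas as pd

data = pd.Series([1, 2, 2, 3, 3, 4, 5])

# Series.plot returns the Axes object, which can be adjusted afterwards
ax = data.plot(kind="hist", bins=5, rwidth=0.8, title="Value distribution")
ax.set_xlabel("Value")  # illustrative label

# Each bar is a rectangle patch on the Axes; its height is the bin count
heights = [patch.get_height() for patch in ax.patches]
print(heights)
```

Since rwidth only narrows the drawn bars, the heights still equal the number of values in each bin.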

Bonus One-Liner Method 5: Compact Histogram with Pyplot

For a quick and straightforward vertical histogram, you can use a one-liner with Matplotlib’s Pyplot interface. This is highly effective for rapid visualization without the fuss of multiple configuration steps.

Here’s an example:

import matplotlib.pyplot as plt

plt.hist([1, 2, 2, 3, 3, 4, 4, 4], bins=4)
plt.show()

Output: An instantly created vertical histogram with 4 bins, showcasing the frequency distribution of the provided data.

This one-liner takes advantage of Matplotlib’s Pyplot simplicity, where plt.hist is directly fed the data and the number of bins desired.
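The bins argument also accepts an explicit sequence of bin edges instead of a bin count. For integer data like this, one option is to center each bin on an integer so every distinct value gets its own bar. The edge values and tick positions below are assumptions chosen for this particular data set.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to use an interactive window
import matplotlib.pyplot as plt

data = [1, 2, 2, 3, 3, 4, 4, 4]
edges = [0.5, 1.5, 2.5, 3.5, 4.5]  # one bin centered on each integer from 1 to 4

fig, ax = plt.subplots()
# Passing a sequence to bins fixes the bin edges explicitly
counts, bin_edges, _ = ax.hist(data, bins=edges, edgecolor="black")
ax.set_xticks([1, 2, 3, 4])  # place a tick under the center of each bar

print(counts)
```

Here the bar heights are 1, 2, 2, and 3, the number of occurrences of each integer.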

Summary/Discussion

  • Method 1: Using the bar Function. Strengths: Highly customizable, complete control over the histogram display. Weaknesses: Requires manual computation of histogram data.
  • Method 2: Utilizing the hist Function. Strengths: Fast and easy with automatic binning and frequency calculation. Weaknesses: Less control compared to the bar method.
  • Method 3: Stylized Histogram with Seaborn. Strengths: Visually pleasing and easy to create. Weaknesses: An additional dependency if you’re not already using Seaborn for other visualizations.
  • Method 4: Customizing Histograms with Pandas. Strengths: Integrates plotting directly from data structures, making it convenient for data analysis workflows. Weaknesses: Not as flexible for complex histogram customization.
  • Method 5: Compact Histogram with Pyplot. Strengths: Quick and straightforward with minimal code. Weaknesses: Limited customization options.