5 Best Ways to Plot a Grouped Horizontal Bar Chart with all Columns in Python Pandas

💡 Problem Formulation: Visualizing complex datasets with several categories and subcategories can be challenging. A grouped horizontal bar chart is a common requirement for presenting comparative data across multiple columns. Here, we tackle the problem of plotting such a chart using Python’s Pandas library. We start with a DataFrame with multiple columns and seek a grouped horizontal bar chart that displays all column data side by side for comparison.
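As a minimal sketch of the target, assume the data is already in wide form, with one row per category and one numeric column per series. The column names `Q1`, `Q2` and `Q3` below are hypothetical and not part of the examples in this article. Under that assumption, Pandas draws every numeric column as its own bar within each index label:

```python
import pandas as pd

# Hypothetical wide-format data: one row per category, one column per series
df = pd.DataFrame(
    {'Q1': [10, 15, 12], 'Q2': [20, 25, 18], 'Q3': [14, 9, 22]},
    index=['A', 'B', 'C'],
)

# With no x/y arguments, each numeric column becomes one bar per index label
ax = df.plot(kind='barh')
ax.set_xlabel('Value')
```

Calling `plt.show()` from `matplotlib.pyplot` afterwards displays the figure. The methods below mostly deal with getting long-form data into this shape, or with libraries that group the bars for you.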

Method 1: Using Pandas plot with `barh` and `unstack`

Pandas is a powerful data manipulation library that also supports basic plotting. Calling the plot method with kind='barh' creates a horizontal bar chart. To get grouped bars, we first reshape the data with set_index and unstack so that each subcategory becomes its own column.

Here’s an example:

import pandas as pd
import matplotlib.pyplot as plt

# Sample data
df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B'],
    'Subcategory': ['One', 'Two', 'One', 'Two'],
    'Values': [10, 20, 15, 25]
})

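# Move Subcategory into the columns, then plot one bar per column.
# Selecting 'Values' first keeps the legend labels flat ('One', 'Two').
df.set_index(['Category', 'Subcategory'])['Values'].unstack().plot(kind='barh')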

plt.show()

The output is an unstacked grouped horizontal bar chart showcasing the values of each subcategory within the categories A and B.

In this code snippet, we first set Category and Subcategory as the index, then unstack the Subcategory level so that each subcategory becomes its own column. Pandas draws one bar per column for every index value, which produces the grouping. Finally, Matplotlib’s plt.show() displays the plot.
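To see what the reshaping step produces, the intermediate table can be inspected on its own. The sketch below reuses the sample data and selects the `Values` column before unstacking, which is a choice made here to keep the resulting column labels flat:

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B'],
    'Subcategory': ['One', 'Two', 'One', 'Two'],
    'Values': [10, 20, 15, 25]
})

# Index by both keys, then pivot the inner level (Subcategory) into columns
wide = df.set_index(['Category', 'Subcategory'])['Values'].unstack()

# Rows are the categories A and B; columns are the subcategories One and Two
print(wide)
```

Each column of `wide` becomes one bar series in the chart, and each row becomes one group on the y-axis.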

Method 2: Using Seaborn’s `barplot` with the `hue` parameter

Seaborn is a statistical visualization library built on top of Matplotlib. It simplifies creating complex plots like grouped bar charts with its barplot function. The `hue` parameter is particularly useful for adding a categorical dimension to the data, ideal for side-by-side grouping.

Here’s an example:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Sample data
df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B'],
    'Subcategory': ['One', 'Two', 'One', 'Two'],
    'Values': [10, 20, 15, 25]
})

# Plot using seaborn
sns.barplot(data=df, y='Category', x='Values', hue='Subcategory', orient='h')

plt.show()

The output is a grouped horizontal bar chart with distinct colors for each subcategory within the main categories.

This code demonstrates the simplicity of creating grouped bars in Seaborn. The DataFrame directly feeds into the `sns.barplot` method, which automatically handles the grouping and visual differentiation between categories with the help of `hue`.
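The `hue` approach expects long-form data, with one row per bar. If the data starts out wide instead, with one column per subcategory, it can be reshaped first. The following is a sketch under that assumption; the wide frame is made up for illustration and mirrors the sample values used above:

```python
import pandas as pd

# Hypothetical wide-format input: one column per subcategory
wide = pd.DataFrame({
    'Category': ['A', 'B'],
    'One': [10, 15],
    'Two': [20, 25],
})

# melt turns every non-id column into (Subcategory, Values) rows
long_df = wide.melt(
    id_vars='Category',
    var_name='Subcategory',
    value_name='Values',
)
```

The resulting `long_df` has the same three columns as the sample DataFrame and can be passed to `sns.barplot` in the same way.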

Method 3: Customizing with Matplotlib’s `barh`

For those seeking full control over their plots, Matplotlib’s barh function is the way to go. It allows for detailed customization and manual plotting of grouped horizontal bars with a little more coding effort.

Here’s an example:

import matplotlib.pyplot as plt
import numpy as np

# Sample data
categories = ['A', 'B']
subcategories = ['One', 'Two']
values = np.array([[10, 20], [15, 25]])

# Determine bar positions
category_pos = np.arange(len(categories))
subcat_width = 0.35

fig, ax = plt.subplots()
for i, subcat in enumerate(subcategories):
    ax.barh(category_pos - subcat_width/2. + i*(subcat_width), values[:, i], subcat_width, label=subcat)

ax.set_yticks(category_pos)
ax.set_yticklabels(categories)
ax.legend()

plt.show()

The output is a customized grouped horizontal bar chart with each subcategory displayed next to each other within the main categories.

This code manually computes the positions for each group of bars. The `barh` method is called repeatedly for each subcategory, with the positions and widths adjusted to place the bars side by side.
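The position arithmetic above is written for two subcategories. One way to generalize it to any number of bars per group is sketched below. The helper name `bar_offsets` and its `group_width` parameter are hypothetical, introduced only for this illustration:

```python
def bar_offsets(n_groups, n_bars, group_width=0.8):
    """Return (bar_height, positions), where positions[i][g] is the
    y-centre of bar series i within group g."""
    bar_height = group_width / n_bars
    positions = []
    for i in range(n_bars):
        # Shift series i so that the bars of a group are centred on its tick
        shift = (i - (n_bars - 1) / 2) * bar_height
        positions.append([g + shift for g in range(n_groups)])
    return bar_height, positions


# Two groups with two bars each and a total group width of 0.7
bar_height, positions = bar_offsets(n_groups=2, n_bars=2, group_width=0.7)
```

With these arguments the helper yields the same bar height (0.35) and the same centres as the manual expression in the example, so the loop body could become `ax.barh(positions[i], values[:, i], bar_height, label=subcat)`.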

Method 4: Advanced Plotting with Plotly

Plotly is an interactive plotting library that can generate complex and stylish charts. For a grouped horizontal bar chart, Plotly requires a bit more setup but provides interactivity and a modern look to the visualization.

Here’s an example:

import plotly.graph_objects as go

# Sample data
categories = ['A', 'B']
subcategories = ['One', 'Two']
values = [[10, 20], [15, 25]]

# Create figure
fig = go.Figure()
for i, subcat in enumerate(subcategories):
    fig.add_trace(go.Bar(
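        # Share the category labels across traces so bars group per category
        y=categories,
        # values is a plain list of rows, so pick column i from each row
        x=[row[i] for row in values],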
        name=subcat,
        orientation='h'
    ))

fig.update_layout(barmode='group')
fig.show()

The output is an interactive grouped horizontal bar chart that can be zoomed in and out and offers tooltips on hover.

Within this snippet, a new Plotly figure is instantiated. Multiple bar traces are added to the figure, each representing a subcategory. Each bar is positioned horizontally using the `orientation` parameter. The layout is updated to group these bars, replicating the grouped bar chart style.

Bonus One-Liner Method 5: Using pandas pivot and plot shorthand

For quick plotting without much customization, one can pivot the DataFrame and directly plot with the Pandas plotting shorthand.

Here’s an example:

import pandas as pd
import matplotlib.pyplot as plt

# Sample data
df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B'],
    'Subcategory': ['One', 'Two', 'One', 'Two'],
    'Values': [10, 20, 15, 25]
})

# Quick pivot and plot (recent pandas versions require keyword arguments here)
df.pivot(index="Category", columns="Subcategory", values="Values").plot(kind='barh')

plt.show()

The output is a simple grouped horizontal bar chart in the default Pandas style, with one colored bar per subcategory within each category.

This one-liner involves creating a pivot table from our DataFrame, where each cell represents the value for a combination of the main category and subcategory. The result is then immediately plotted as a horizontal bar chart.
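One caveat is that `pivot` expects each combination of index and column values to appear only once, and raises a `ValueError` otherwise. If the data may contain repeated pairs, `pivot_table` with an aggregation function is one alternative. The sketch below uses made-up data with a duplicated (A, One) pair, and the choice of `sum` is an assumption about what the duplicates should mean:

```python
import pandas as pd

# Hypothetical data in which the pair (A, One) appears twice
df = pd.DataFrame({
    'Category': ['A', 'A', 'A', 'B', 'B'],
    'Subcategory': ['One', 'One', 'Two', 'One', 'Two'],
    'Values': [4, 6, 20, 15, 25]
})

# pivot cannot reshape duplicated index/column pairs
try:
    df.pivot(index='Category', columns='Subcategory', values='Values')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates instead (here by summing them)
table = df.pivot_table(
    index='Category', columns='Subcategory', values='Values', aggfunc='sum'
)
```

The aggregated `table` can then be plotted with `.plot(kind='barh')` exactly like the pivoted frame in the example.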

Summary/Discussion

  • Method 1: Pandas Unstack and Barh Plot. Simple and straightforward with minimal dependencies. Limited customization options.
  • Method 2: Seaborn Barplot with Hue. Easy syntax and aesthetically pleasing. May not allow for as much fine-tuning as Matplotlib.
  • Method 3: Matplotlib Barh Customization. Highest control over chart elements. Requires more code and manual handling.
  • Method 4: Plotly Interactive Plot. Produces interactive and appealing visuals, perfect for web applications. More complex setup.
  • Method 5: Pandas Pivot Plot Shorthand. Excellent for quick and dirty plots. Not suitable for complex visual storytelling.