5 Best Ways to Visualize Multiple Bar Plots in Python Using Bokeh

💡 Problem Formulation: When working with data, visualizing different datasets side-by-side can greatly enhance the analysis process. In Python, the Bokeh library can be used to create interactive and visually appealing multi-bar plots. This article provides solutions to showcase comparative data using multiple bar plots with Bokeh, enabling a clear understanding of differences and trends across datasets.

Method 1: Using VBar Glyphs for Side-by-Side Bar Plots

Bokeh’s vbar glyph method draws a set of vertical bars from one column of data. Calling it once per dataset, and shifting each call along the categorical x-axis with the dodge transform, places the bars side by side within each category so the datasets can be compared directly.

Here’s an example:

from bokeh.plotting import figure, show, output_file
from bokeh.models import ColumnDataSource
from bokeh.transform import dodge

output_file("side_by_side.html")

source = ColumnDataSource(data=dict(fruits=['Apples', 'Oranges', 'Pears'],
                                    counts1=[10, 20, 30],
                                    counts2=[15, 25, 35]))

p = figure(x_range=source.data['fruits'], width=400, height=300)

p.vbar(x=dodge('fruits', -0.1, range=p.x_range), top='counts1', width=0.2, source=source)
p.vbar(x=dodge('fruits', 0.1, range=p.x_range), top='counts2', width=0.2, color="orange", source=source)

show(p)

The output will be an HTML file with a side-by-side bar plot visualizing two different datasets.

This code snippet initializes a Bokeh figure and creates two sets of vertical bars (vbars), each representing a different dataset. The bars are dodged on the x-axis to sit side by side. The use of ColumnDataSource simplifies the data handling, and the show function generates the bar plots in an HTML file.
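With more than two datasets, the dodge offsets have to be recomputed so that each group of bars stays centered on its category. The sketch below shows one way to derive them in plain Python. The helper `dodge_offsets` and its `group_width` parameter are hypothetical names introduced here for illustration; they are not part of Bokeh.

```python
def dodge_offsets(n_series, group_width=0.4):
    """Return (offsets, bar_width) for n_series bars centered on one category.

    Hypothetical helper, not a Bokeh function. group_width is the total
    width of one group in categorical units (one category spans 1.0).
    """
    if n_series < 1:
        raise ValueError("n_series must be at least 1")
    bar_width = group_width / n_series
    # Center the group on the category: offsets are symmetric around 0.
    offsets = [(i - (n_series - 1) / 2) * bar_width for i in range(n_series)]
    return offsets, bar_width


offsets, bar_width = dodge_offsets(2)
print(offsets, bar_width)
```

Each offset would be passed as the second argument to dodge, and bar_width as the width of the matching vbar call. For two series and the default group width this yields offsets of -0.1 and 0.1 with a bar width of 0.2, the same values used in the example above.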

Method 2: Stacked Bar Charts for Cumulative Data

Bokeh can also produce stacked bar charts, which are ideal for depicting cumulative data or part-to-whole relationships within multiple datasets. This method builds upon the individual bars by stacking them on top of each other rather than placing them side-by-side.

Here’s an example:

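from bokeh.plotting import figure, show, output_file
from bokeh.models import ColumnDataSource

output_file("stacked.html")

fruits = ['Apples', 'Oranges', 'Pears']

# Same data as in Method 1; vbar_stack looks up each stacker name as a column
source = ColumnDataSource(data=dict(fruits=fruits,
                                    counts1=[10, 20, 30],
                                    counts2=[15, 25, 35]))

p = figure(x_range=fruits, width=400, height=300)

# One layer per stacker; the colors are matched to the stackers in order
p.vbar_stack(['counts1', 'counts2'], x='fruits', width=0.9, color=["blue", "red"], source=source)

show(p)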

The output is a single bar plot where each bar contains sub-bars corresponding to different datasets, stacked on top of one another.

The above code uses vbar_stack to create stacked bars within a single plot. The source data are the same as in the previous method, but this time each fruit has the dataset counts layered vertically, showcasing how they contribute to a total value.
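To make the layering concrete, the sketch below computes, in plain Python, the bottom and top of each stacked segment for the same counts. It only illustrates the arithmetic that stacking implies; `stack_bounds` is a hypothetical helper and is not a description of how Bokeh implements vbar_stack internally.

```python
def stack_bounds(layers):
    """Return a list of (bottoms, tops) pairs, one per layer, in stacking order.

    Hypothetical helper for illustration only.
    """
    if not layers:
        return []
    running = [0] * len(layers[0])
    bounds = []
    for layer in layers:
        bottoms = list(running)
        # Each segment starts where the previous layer ended.
        running = [b + v for b, v in zip(running, layer)]
        bounds.append((bottoms, list(running)))
    return bounds


counts1 = [10, 20, 30]
counts2 = [15, 25, 35]
for bottoms, tops in stack_bounds([counts1, counts2]):
    print(bottoms, tops)
# [0, 0, 0] [10, 20, 30]
# [10, 20, 30] [25, 45, 65]
```

The tops of the last layer are the totals per fruit, which is why a stacked chart reads well for part-to-whole questions.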

Method 3: Using the gridplot Function to Arrange Multiple Separate Plots

The gridplot function in Bokeh is beneficial when it’s necessary to arrange several separate bar plots into a grid layout, allowing for easy comparison across multiple plots. Each plot operates independently within the grid, providing clear visualization of distinct data sets.

Here’s an example:

from bokeh.plotting import figure, show, output_file
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource

output_file("grid_layout.html")

source = ColumnDataSource(data=dict(fruits=['Apples', 'Bananas', 'Grapes'], 
                                    data2015=[19, 5, 12], 
                                    data2016=[12, 4, 22]))

p1 = figure(x_range=source.data['fruits'], width=250, height=250, title="2015")
p1.vbar(x='fruits', top='data2015', width=0.5, source=source)

p2 = figure(x_range=source.data['fruits'], width=250, height=250, title="2016")
p2.vbar(x='fruits', top='data2016', width=0.5, source=source)

grid = gridplot([[p1, p2]])

show(grid)

The output will be a grid layout containing multiple bar plots within a single HTML file.

This example demonstrates how to create separate bar plots for different years using Bokeh’s vbar method and then arrange them into a grid layout via the gridplot function. Each plot is a standalone figure with its own title and its own axis ranges, so each can be panned and zoomed independently of the other.
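Because each figure picks its own y-axis range by default, bars of the same height can look different across the two plots. One possible refinement, sketched below under the assumption that both years should be read on a common scale, is to reuse the first plot's y_range in the second. The function name `build_shared_grid` is hypothetical.

```python
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure


def build_shared_grid():
    """Build the 2015/2016 grid with a shared y-axis (hypothetical helper)."""
    fruits = ['Apples', 'Bananas', 'Grapes']
    source = ColumnDataSource(data=dict(fruits=fruits,
                                        data2015=[19, 5, 12],
                                        data2016=[12, 4, 22]))

    p1 = figure(x_range=fruits, width=250, height=250, title="2015")
    p1.vbar(x='fruits', top='data2015', width=0.5, source=source)

    # Passing p1.y_range links the two y-axes, so both plots use one scale.
    p2 = figure(x_range=fruits, y_range=p1.y_range,
                width=250, height=250, title="2016")
    p2.vbar(x='fruits', top='data2016', width=0.5, source=source)

    return p1, p2, gridplot([[p1, p2]])


p1, p2, grid = build_shared_grid()
```

Passing grid to show() renders the layout as before. With the range shared, panning or zooming the y-axis of one plot moves the other as well, which trades some independence for comparability.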

Method 4: Interactive Legends for Comparisons on the Same Plot

Interactive legends enable users to click on legend items to hide or show the corresponding bars on the plot, making it an engaging way to compare datasets within the same space. This feature of Bokeh enhances the interactive aspect of multi-bar plots.

Here’s an example:

from bokeh.plotting import figure, show, output_file
from bokeh.models import ColumnDataSource
from bokeh.transform import dodge

output_file("interactive_legend.html")

source = ColumnDataSource(data=dict(fruits=['Mangoes', 'Strawberries', 'Kiwi'],
                                    counts1=[24, 18, 9],
                                    counts2=[19, 12, 15]))

p = figure(x_range=source.data['fruits'], width=400, height=300)

b1 = p.vbar(x='fruits', top='counts1', width=0.2, source=source, legend_label="2018")
b2 = p.vbar(x=dodge('fruits', 0.25, range=p.x_range), top='counts2', width=0.2, source=source, color="green", legend_label="2019")

p.legend.click_policy="hide"

show(p)

The output will be an interactive bar plot that allows users to click on the legend to hide or show bar groups.

This code snippet creates an interactive plot with two datasets represented as bars. Clicking on a legend item toggles the visibility of the corresponding bars. The click_policy attribute of the legend is set to “hide” to enable this functionality.
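Hiding is not the only option. As a variation, and assuming a dimmed bar is preferable to a hidden one, the legend's click_policy can be set to "mute", with muted_alpha controlling how faint a muted series becomes. The sketch below reuses the data from the example and centers the two bars with symmetric dodge offsets; `build_muting_plot` is a hypothetical name.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import dodge


def build_muting_plot():
    """Grouped bars whose legend entries dim, rather than hide, a series."""
    fruits = ['Mangoes', 'Strawberries', 'Kiwi']
    source = ColumnDataSource(data=dict(fruits=fruits,
                                        counts1=[24, 18, 9],
                                        counts2=[19, 12, 15]))

    p = figure(x_range=fruits, width=400, height=300)

    # muted_alpha is the opacity applied while a series is muted.
    p.vbar(x=dodge('fruits', -0.125, range=p.x_range), top='counts1',
           width=0.2, source=source, legend_label="2018", muted_alpha=0.2)
    p.vbar(x=dodge('fruits', 0.125, range=p.x_range), top='counts2',
           width=0.2, source=source, color="green", legend_label="2019",
           muted_alpha=0.2)

    p.legend.click_policy = "mute"
    return p


p = build_muting_plot()
```

Muting keeps the dimmed bars on screen as a reference, which can help when the reader wants to focus on one year without losing sight of the other.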

Bonus Method 5: Using FactorRange to Simplify Grouped Bar Plots

Bokeh’s FactorRange can handle the plotting of grouped bar plots in a more simplified manner, especially when dealing with categorical data that have subcategories.

Here’s an example:

from bokeh.plotting import figure, show, output_file
from bokeh.models import FactorRange, ColumnDataSource

output_file("grouped_bars.html")

fruits = ['Apples', 'Oranges', 'Pears']
years = ['2015', '2016']
data = {'fruits': fruits,
        '2015': [2, 1, 4],
        '2016': [5, 3, 3]}

x = [(fruit, year) for fruit in fruits for year in years]
counts = sum(zip(data['2015'], data['2016']), ())

source = ColumnDataSource(data=dict(x=x, counts=counts))

p = figure(x_range=FactorRange(*x), width=400, height=300)

p.vbar(x='x', top='counts', width=0.9, source=source)

show(p)

The output will display a neatly organized grouped bar plot within a single HTML file.

The code uses FactorRange to define a nested x-axis from (fruit, year) tuples. Because the grouping lives in the axis itself, a single vbar call draws every bar and no dodge offsets are needed, which results in a clean and easily understandable grouped bar plot.
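The `sum(zip(...), ())` idiom interleaves the per-year lists so that the counts line up with the (fruit, year) factors. The sketch below spells out the same flattening with explicit loops, which may be easier to read and to extend to more years. `flatten_grouped` is a hypothetical helper, not part of Bokeh.

```python
def flatten_grouped(data, groups, subgroups):
    """Return (factors, values) ordered group by group, subgroup by subgroup.

    Hypothetical helper. data maps each subgroup name to a list of values
    aligned with groups.
    """
    factors = []
    values = []
    for i, group in enumerate(groups):
        for sub in subgroups:
            factors.append((group, sub))
            values.append(data[sub][i])
    return factors, values


fruits = ['Apples', 'Oranges', 'Pears']
years = ['2015', '2016']
data = {'2015': [2, 1, 4], '2016': [5, 3, 3]}

x, counts = flatten_grouped(data, fruits, years)
print(x[:2], counts)
# [('Apples', '2015'), ('Apples', '2016')] [2, 5, 1, 3, 4, 3]
```

The resulting x and counts can be placed in a ColumnDataSource exactly as in the example, with FactorRange(*x) as the x_range.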

Summary/Discussion

  • Method 1: Using VBar Glyphs for Side-by-Side Bar Plots. Strengths: Offers clear side-by-side comparison for multiple datasets. Weaknesses: As the number of datasets increases, the plot can become cluttered.
  • Method 2: Stacked Bar Charts for Cumulative Data. Strengths: Efficient when visualizing part-to-whole relationships. Weaknesses: Can obscure individual data variations when too many layers are present.
  • Method 3: Using the gridplot Function to Arrange Multiple Separate Plots. Strengths: Maintains each dataset’s distinct visual space. Weaknesses: Requires more space and may be less concise when displaying the data.
  • Method 4: Interactive Legends for Comparisons on the Same Plot. Strengths: Highly interactive and user-friendly. Weaknesses: May not be suitable for static reporting where interactivity isn’t feasible.
  • Bonus Method 5: Using FactorRange to Simplify Grouped Bar Plots. Strengths: Simplifies the code for grouped bar plots. Weaknesses: Less flexible when customizing the appearance of each individual bar.