Creating Interactive Color Scatter Plots with Bokeh in Python

💡 Problem Formulation: Data visualization is crucial in understanding complex datasets. This article addresses how to use Bokeh, a powerful Python visualization library, to create color scatter plots that not only present data points but also display additional information upon hovering over these points. Assume you have a DataFrame with columns ‘x’, ‘y’, ‘category’, and ‘value’. You want to plot ‘x’ and ‘y’ on a scatter plot with point colors determined by ‘category’, and when hovering over a point, the ‘category’ and ‘value’ should be shown.
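The examples below assume a DataFrame named df already exists. As a minimal sketch for trying them out, the following builds one with made-up values; only the column names come from the problem statement, while the numbers and the category labels "A", "B", and "C" are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names match the article's assumptions.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.3, 1.1, 4.8, 3.6, 5.2, 2.9],
    "category": ["A", "B", "C", "A", "B", "C"],
    "value": [10, 25, 17, 32, 8, 21],
})

print(df.head())
```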

Method 1: Basic Hovertool Scatter Plot with Bokeh

In this method, we will use Bokeh’s HoverTool to create a scatter plot where users can hover over data points to see more information. The color differentiation is achieved by mapping categories to colors with the factor_cmap function, which allows the points to be automatically colored based on their category.

Here’s an example:

from bokeh.plotting import figure, show, output_file
from bokeh.models import HoverTool, ColumnDataSource
from bokeh.palettes import Category10
from bokeh.transform import factor_cmap

# Assuming df is your DataFrame with the columns 'x', 'y', 'category', and 'value'
source = ColumnDataSource(df)

# Define color mapping: one palette color per category (assumes string labels, at most 10 categories).
# Category10 only provides palettes for 3 to 10 colors, so take the largest one and trim it.
factors = sorted(df['category'].unique().tolist())
palette = Category10[10][:len(factors)]

# Create the figure
p = figure(title="Color Scatter Plot with Hover Information")
p.scatter('x', 'y', source=source, legend_field='category', fill_alpha=0.6, size=10,
          color=factor_cmap('category', palette=palette, factors=factors))

# Add hover tool
hover = HoverTool()
hover.tooltips = [("Category", "@category"), ("Value", "@value")]
p.add_tools(hover)

# Output and show the plot
output_file("color_scatter_hover.html")
show(p)

The output of this code snippet is an HTML file “color_scatter_hover.html” that contains an interactive scatter plot. Each point represents a data instance, colored by its category, and displays the category and value upon hovering.

This code snippet first wraps the data in a ColumnDataSource, the core object Bokeh uses to pass data to glyphs. The factor_cmap transform handles the coloring based on the ‘category’ column by building a categorical color mapper behind the scenes, and HoverTool is configured to show tooltips with category and value. The plot is then rendered to an HTML file which displays the interactive scatter plot when viewed in a web browser.
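To make the category-to-color mapping more concrete, here is a small sketch that builds a CategoricalColorMapper directly and lists which color each category would receive. The labels "A", "B", and "C" are hypothetical; the point is only that factors and palette entries are paired by position.

```python
from bokeh.models import CategoricalColorMapper
from bokeh.palettes import Category10

factors = ["A", "B", "C"]        # hypothetical category labels
palette = list(Category10[3])    # three colors from the Category10 palette

mapper = CategoricalColorMapper(factors=factors, palette=palette)

# The i-th factor is drawn in the i-th palette color.
color_by_category = dict(zip(mapper.factors, mapper.palette))
print(color_by_category)
```

Printing the mapping like this can be a quick way to check that every category has its own color before rendering the plot.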

Method 2: Grouped Data Hover Information

This method focuses on when you have grouped data, and you want to represent different groups with different colors and hover information. We use the same tools as in Method 1, but we’ll add extra logic to segment data by group.

Here’s an example:

from bokeh.models import HoverTool, ColumnDataSource
from bokeh.palettes import Category20
from bokeh.plotting import figure, show, output_file
from bokeh.transform import factor_cmap

# Assuming 'grouped_data' is a DataFrame aggregated by 'category' with 'x', 'y', and 'value' columns,
# and that 'category' is a regular column (call reset_index() after groupby if it is the index)
source = ColumnDataSource(grouped_data)

# factor_cmap expects a sequence of colors, so use the 20-color palette (covers up to 20 groups)
factors = grouped_data['category'].unique().tolist()

p = figure(title="Grouped Data Scatter Plot")
p.scatter('x', 'y', source=source, legend_field='category', fill_alpha=0.6, size=10,
          color=factor_cmap('category', palette=Category20[20], factors=factors))

hover = HoverTool()
hover.tooltips = [("Category", "@category"), ("Aggregated Value", "@value")]
p.add_tools(hover)

output_file("grouped_data_scatter.html")
show(p)

The script creates a scatter plot titled “Grouped Data Scatter Plot”. The hover tool will display ‘category’ and the aggregated ‘value’.

This approach is particularly useful when dealing with summarized data. The ColumnDataSource is created from the grouped data, and the scatter plot points are colored by category. The tooltips are configured to show the ‘category’ and the ‘aggregated value’. The resulting HTML file is an interactive chart that presents grouped data effectively.
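The grouped DataFrame itself is not shown above. One possible way to build it with pandas is sketched below; the sample values and the choice of mean for ‘x’ and ‘y’ and sum for ‘value’ are assumptions made for illustration, so substitute whatever aggregation fits your data.

```python
import pandas as pd

# Hypothetical raw data with the columns assumed throughout the article.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.0, 1.0, 4.0, 3.0, 5.0, 2.0],
    "category": ["A", "B", "C", "A", "B", "C"],
    "value": [10, 25, 17, 32, 8, 21],
})

# as_index=False keeps 'category' as a regular column, which is convenient
# when the result is passed to ColumnDataSource and referenced as '@category'.
grouped_data = df.groupby("category", as_index=False).agg(
    x=("x", "mean"),
    y=("y", "mean"),
    value=("value", "sum"),
)

print(grouped_data)
```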

Method 3: Adding Custom Hover Information with HTML

Custom HTML formatting in tooltips can enhance the user experience by including additional HTML elements such as images. In this method, we showcase the usage of custom HTML content within Bokeh hovers for a more engaging tooltip.

Here’s an example:

from bokeh.models import HoverTool, ColumnDataSource
from bokeh.plotting import figure, output_file, show

source = ColumnDataSource({'x': list(range(10)), 'y': list(range(10)), 'desc': ["point {}".format(i) for i in range(10)]})

p = figure(tools="", title="Custom HTML Hover Information")
p.scatter('x', 'y', source=source, size=10, color="navy", alpha=0.5)

# Tooltips can be an HTML template; @desc is replaced with the hovered point's 'desc' value
hover = HoverTool()
hover.tooltips = """
<div>
    <strong>Point Description:</strong> @desc
</div>
"""
p.add_tools(hover)
output_file("custom_html_hover_information.html")
show(p)

The output is an interactive plot titled “Custom HTML Hover Information” with points that display a custom HTML formatted tooltip containing the point description.

By adjusting the HoverTool tooltips property to contain HTML, the information upon hovering is presented with custom formatting. In this example, a simple strong tag is used for emphasis. Users can incorporate more complex HTML elements based on their needs.
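As a sketch of a slightly richer template, the example below combines several fields with inline styling. The ‘value’ column, the sample data, and the styling choices are hypothetical additions for illustration; `@name` refers to a data column, `{0.0}` applies a number format, and `$x` and `$y` are Bokeh's special variables for the cursor position in data coordinates.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Hypothetical data: 'value' is an extra column added for this illustration.
source = ColumnDataSource({
    "x": [1, 2, 3],
    "y": [4, 5, 6],
    "desc": ["first point", "second point", "third point"],
    "value": [0.25, 1.5, 3.75],
})

# @column pulls from the data source; $x and $y report the cursor position.
TOOLTIPS = """
<div style="padding: 4px;">
    <div><strong>@desc</strong></div>
    <div>Value: @value{0.0}</div>
    <div style="color: #666;">($x, $y)</div>
</div>
"""

p = figure(tools="", title="Styled HTML Tooltip")
p.scatter("x", "y", source=source, size=12, color="navy", alpha=0.5)
p.add_tools(HoverTool(tooltips=TOOLTIPS))
```

Calling output_file and show(p) afterwards, as in the other examples, renders the plot.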

Bonus One-Liner Method 4: Quick Scatter Plot with Hover Information

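For rapid prototyping, you can skip the explicit HoverTool setup: passing tooltips directly to figure creates the hover tool for you, and the hover_color and hover_alpha glyph properties highlight the point under the cursor. Note that the hover highlight only takes effect when a hover tool is present on the plot.

Here’s an example:

from bokeh.plotting import figure, show, output_file

output_file("quick_scatter.html")

# Passing tooltips to figure adds a hover tool, which hover_color and hover_alpha also rely on
p = figure(title="Quick Scatter Plot", tooltips=[("Description", "@desc")])
p.scatter('x', 'y', source={'x': list(range(10)), 'y': list(range(10)), 'desc': [f'Point {i}' for i in range(10)]},
          size=15, color="green", alpha=0.5, hover_color="orange", hover_alpha=1.0)

show(p)

This results in a scatter plot in a file named “quick_scatter.html” where points turn orange and opaque upon hovering and show their description in a tooltip.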

In this compact approach, the plot, scatter points, and interactivity are set up in a few lines. This might be an ideal strategy when the desired output is simple, and time is of the essence. However, the customization options are limited compared to the previous methods.

Summary/Discussion

  • Method 1: Basic Hovertool Scatter Plot with Bokeh. It is simple yet effective for categorically colored data. The setup might require multiple lines of code for customization.
  • Method 2: Grouped Data Hover Information. Tailored to handling and visualizing aggregated data, showing the power of Bokeh with grouped datasets. Requires preparation of data beforehand.
  • Method 3: Adding Custom Hover Information with HTML. Offers the highest level of customization with tooltips, but demands knowledge of HTML and might not be as quick to implement.
  • Method 4: Quick Scatter Plot with Hover Information. Most straightforward implementation, ideal for quick results. Although convenient, it offers the least flexibility of the methods presented.