💡 Problem Formulation: In data analysis, it’s not uncommon to work with two time series that have different date or time spacings. For instance, one might have daily temperature readings while another contains monthly economic indices. The challenge lies in graphing these time series together on one plot for comparison while maintaining the integrity of their respective time scales. Our goal is to illustrate methods using Python’s Matplotlib library to achieve this effectively.
Method 1: Use Secondary Axis
The first method involves creating a secondary y-axis to accommodate the second time series. Both series share the same time axis, but each gets its own y-scale, which makes it easy to compare series that differ in both sampling frequency and magnitude. Matplotlib’s twinx() function is designed for this task.
Here’s an example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Create two time series with different spacing
ts_daily = pd.date_range('2020-01-01', periods=30, freq='D')
data_daily = np.random.randn(30)
# 'MS' is month-start; the 'M' alias is deprecated in recent pandas in favor of 'ME'
ts_monthly = pd.date_range('2020-01-01', periods=2, freq='MS')
data_monthly = np.random.randn(2)

fig, ax1 = plt.subplots()

# Plot first time series on the primary axis
ax1.plot(ts_daily, data_daily, 'g-')
ax1.set_xlabel('Date')
ax1.set_ylabel('Daily Data', color='g')

# Create a secondary axis and plot the second time series
ax2 = ax1.twinx()
ax2.plot(ts_monthly, data_monthly, 'b-')
ax2.set_ylabel('Monthly Data', color='b')

plt.show()
The output is a plot with a green line representing the daily data on the primary y-axis and a blue line representing the monthly data on the secondary y-axis.
This code snippet creates a plot with two y-axes, using Matplotlib’s subplots() to initialize the figure and twinx() to add a secondary y-axis. The two series share a single x-axis (time) while each is drawn against its own y-axis, allowing for a clear comparison despite their different spacings in time.
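With two y-axes, readers often need help telling which line belongs to which scale. The sketch below is one possible way to handle that: it colors each axis's tick labels to match its line and builds a single legend from both lines. The data, the random seed, and the variable names `line_daily` and `line_monthly` are illustrative assumptions, not part of the original example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data (assumed for this sketch)
rng = np.random.default_rng(0)
ts_daily = pd.date_range("2020-01-01", periods=60, freq="D")
data_daily = rng.standard_normal(60)
ts_monthly = pd.date_range("2020-01-01", periods=3, freq="MS")
data_monthly = rng.standard_normal(3)

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # shares the x-axis with ax1 and adds a second y-axis

# plot() returns a list of lines; unpack the single line from each call
(line_daily,) = ax1.plot(ts_daily, data_daily, "g-", label="Daily Data")
(line_monthly,) = ax2.plot(ts_monthly, data_monthly, "b-o", label="Monthly Data")

# Color each y-axis's tick labels to match its line
ax1.tick_params(axis="y", labelcolor="g")
ax2.tick_params(axis="y", labelcolor="b")

# Build one legend that covers the lines from both axes
ax1.legend(handles=[line_daily, line_monthly], loc="upper left")
```

Calling plt.show() afterwards displays the figure. Passing the handles explicitly is needed here because each axes object only knows about its own lines.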
Method 2: Normalize Time Series to a Common Scale
In this method, both time series are normalized to a common scale, such as by the number of days since a starting point, and then plotted on the same graph. This requires preprocessing your data but keeps your plot simple without additional axes.
Here’s an example:
import matplotlib.pyplot as plt
from matplotlib.dates import date2num

# Continuing with ts_daily, data_daily, ts_monthly and data_monthly as defined previously
# date2num() returns days since Matplotlib's epoch, so subtract the
# earliest value to obtain "days since start"
num_daily = date2num(ts_daily)
num_monthly = date2num(ts_monthly)
start = min(num_daily[0], num_monthly[0])
days_daily = num_daily - start
days_monthly = num_monthly - start

plt.plot(days_daily, data_daily, 'g-', label='Daily Data')
plt.plot(days_monthly, data_monthly, 'b--', label='Monthly Data')
plt.xlabel('Days since start')
plt.legend()
plt.show()
The output exhibits both time series represented on a common x-axis scale, with daily data in green solid lines and monthly data in blue dashed lines.
This example demonstrates how to rescale date-time objects to a common numeric scale using Matplotlib’s date2num() function. Because date2num() counts days from Matplotlib’s epoch rather than from the start of the data, the earliest value is subtracted from both series. Normalizing both time series to the “number of days since the start” enables plotting them together on the same axes without any secondary axes needed.
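For an axis labeled as days since the start, the offsets have to be measured from the first timestamp in the data. As a small worked example, the same offsets can be computed with pandas date arithmetic alone, without date2num(). This is a sketch under the assumption that both indexes are pandas DatetimeIndex objects; the name `start` is introduced here for illustration.

```python
import pandas as pd

ts_daily = pd.date_range("2020-01-01", periods=30, freq="D")
ts_monthly = pd.date_range("2020-01-01", periods=2, freq="MS")

# Use the earliest timestamp across both series as the common origin
start = min(ts_daily[0], ts_monthly[0])

# Subtracting a Timestamp gives a TimedeltaIndex; .days extracts whole days
days_daily = (ts_daily - start).days
days_monthly = (ts_monthly - start).days

print(days_daily[:3].tolist())  # [0, 1, 2]
print(days_monthly.tolist())    # [0, 31]
```

The second monthly point lands at day 31 because January has 31 days, so both series now sit on the same numeric scale.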
Method 3: Interpolate Time Series
This method involves interpolating one or both time series to create a common time interval before plotting. This can be advantageous when you want to visualize trends with missing data points or align disparate datasets precisely.
Here’s an example:
import matplotlib.pyplot as plt
import pandas as pd

# Assuming ts_daily and ts_monthly are as defined previously
# Resample and interpolate the monthly data to a daily frequency
df_monthly = pd.Series(data_monthly, index=ts_monthly)
df_daily_interpolated = df_monthly.resample('D').interpolate()

plt.plot(ts_daily, data_daily, 'g-', label='Daily Data')
plt.plot(df_daily_interpolated.index, df_daily_interpolated, 'b--', label='Interpolated Monthly Data')
plt.xlabel('Date')
plt.legend()
plt.show()
The output graph showcases the original daily data series in green solid lines against the interpolated monthly data series in blue dashed lines, both plotted over the same time interval.
The code snippet above demonstrates how to resample and interpolate a monthly time series using pandas’ resample() and interpolate() methods. As a result, the monthly data is represented on a daily scale, enabling a more seamless comparison.
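To make the effect of resampling and interpolation concrete, here is a small worked example with hypothetical monthly values chosen so the result is easy to verify by hand. It splits the step into two parts, upsampling with asfreq() and then filling with linear interpolation, which is one way (not the only way) to see what the chained call produces.

```python
import pandas as pd

# Hypothetical monthly values, picked so the daily slope is exactly 1
ts_monthly = pd.date_range("2020-01-01", periods=3, freq="MS")
monthly = pd.Series([0.0, 31.0, 60.0], index=ts_monthly)

# Upsample to daily frequency: the new rows are NaN until they are filled
daily_nan = monthly.resample("D").asfreq()

# Fill the gaps by drawing straight lines between the known points
daily = daily_nan.interpolate(method="linear")

print(len(daily))                           # 61
print(int(daily_nan.isna().sum()))          # 58
print(daily.loc[pd.Timestamp("2020-01-16")])  # 15.0
```

Only 3 of the 61 daily values are real observations; the other 58 are estimates, which is worth stating in the plot legend or caption.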
Method 4: Use Markers to Distinguish Series
In this method, we plot both time series on the same axis but use distinct markers or styles to differentiate them. This is particularly useful for quickly identifying which data points belong to which series when the focus is on the temporal relation rather than the exact timing.
Here’s an example:
import matplotlib.pyplot as plt

# Continuing with ts_daily and ts_monthly as defined previously
plt.plot(ts_daily, data_daily, 'go', label='Daily Data')  # 'go' specifies green circles
plt.plot(ts_monthly, data_monthly, 'bs', label='Monthly Data')  # 'bs' specifies blue squares
plt.xlabel('Date')
plt.legend()
plt.show()
The output presents two series: daily values marked with green circles and monthly values marked with blue squares, both plotted on one coherent time axis.
The example employs different markers for the two time series to allow easy differentiation. The format string passed to plot() specifies that daily data points are green circles ('go') and monthly data points are blue squares ('bs'), which makes for an intuitive comparison on a single graph.
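The strings 'go' and 'bs' are shorthand format strings that combine a color and a marker. The sketch below shows the same idea with explicit keyword arguments, which can be easier to read and allows extra settings such as marker size. The data, the seed, and the chosen `markersize` value are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data (assumed for this sketch)
rng = np.random.default_rng(0)
ts_daily = pd.date_range("2020-01-01", periods=30, freq="D")
data_daily = rng.standard_normal(30)
ts_monthly = pd.date_range("2020-01-01", periods=2, freq="MS")
data_monthly = rng.standard_normal(2)

fig, ax = plt.subplots()

# Shorthand: 'g' = green, 'o' = circle marker, no connecting line
(daily_line,) = ax.plot(ts_daily, data_daily, "go", label="Daily Data")

# Explicit keywords: blue squares, no connecting line, slightly larger markers
(monthly_line,) = ax.plot(
    ts_monthly,
    data_monthly,
    color="b",
    marker="s",
    linestyle="None",
    markersize=9,
    label="Monthly Data",
)

ax.set_xlabel("Date")
ax.legend()
```

Calling plt.show() afterwards displays the figure. Making the sparse monthly markers larger helps them stand out against the denser daily points.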
Bonus One-Liner Method 5: Overlay Plots with Different Linestyles
For a quick comparison, we can overlay the time series using different linestyles. This method is perfect for a brief visual analysis where the emphasis is on the overall trends rather than the precise data points.
Here’s an example:
plt.plot(ts_daily, data_daily, 'g-', label='Daily Data')  # Solid line for daily data
plt.plot(ts_monthly, data_monthly, 'b:', label='Monthly Data')  # Dotted line for monthly data
plt.xlabel('Date')
plt.legend()
plt.show()
The output will portray the daily time series with a solid green line and the monthly time series with a dotted blue line.
In this concise example, we used different linestyles to differentiate the series: a solid line for the daily data and a dotted line for the monthly data. With a single line of code for each series, we’re able to achieve an effective overlay for visual comparison.
Summary/Discussion
- Method 1: Use Secondary Axis. Provides a clear distinction between the scales of two time series. However, it can be complex to read and understand the correlation between series.
- Method 2: Normalize Time Series to a Common Scale. Simplifies the comparison by placing both series on one numeric x-axis with a single y-axis. However, calendar dates are replaced by day offsets, so the original date context is lost.
- Method 3: Interpolate Time Series. Aligns disparate datasets effectively. But it can potentially introduce artificial data points that may not reflect the true nature of the time series.
- Method 4: Use Markers to Distinguish Series. Easy to implement and visually intuitive. Might not be ideal for densely packed data points where markers can overlap and cause confusion.
- Method 5: Overlay Plots with Different Linestyles. Simple, quick, and effective for overarching trend analysis. But details can be lost when exact comparisons of data are required.