In this tutorial, we will learn about the Pandas function merge_ordered(). This function performs an ordered merge with optional fill/interpolation of missing values, which makes it especially useful for ordered data such as time series.
Syntax and Parameters
Here are the parameters from the official documentation:
| Parameter | Type | Description |
|---|---|---|
| left | DataFrame | The left DataFrame to merge. |
| right | DataFrame | The right DataFrame to merge. |
| on | label or list | Field names to join on. Must be contained in both DataFrames. |
| left_on | label or list, or array-like | Field names to join on in the left DataFrame. |
| right_on | label or list, or array-like | Field names to join on in the right DataFrame. |
| left_by | column name or list of column names | Group the left DataFrame by group columns and merge piece by piece with the right DataFrame. |
| right_by | column name or list of column names | Group the right DataFrame by group columns and merge piece by piece with the left DataFrame. |
| fill_method | {'ffill', None}, default None | Interpolation method for data. |
| suffixes | list-like, default ('_x', '_y') | A length-2 sequence where each element is optionally a string indicating the suffix to add to the overlapping column names in left and right, respectively. A value of None instead of a string indicates that the column name from left or right should be left as-is. At least one of the values must not be None. |
| how | {'left', 'right', 'outer', 'inner'}, default 'outer' | left: use keys from the left data frame only; right: use keys from the right data frame only; outer: use the union of keys from both data frames; inner: use the intersection of keys from both data frames. |
| Returns | Description |
|---|---|
| DataFrame | The merged DataFrame. The output type will be the same as 'left', if it is a subclass of DataFrame. |
Basic Example
To get started, we will create two data frames:
```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['15/01/2019', '16/01/2019', '17/01/2019',
             '18/01/2019', '19/01/2019', '20/01/2019'],
    'Price': [16.7, 18.4, 20.0, 19.3, 17.1, 21.2]
})
print(df1)
```
|   | Date | Price |
|---|---|---|
| 0 | 15/01/2019 | 16.7 |
| 1 | 16/01/2019 | 18.4 |
| 2 | 17/01/2019 | 20.0 |
| 3 | 18/01/2019 | 19.3 |
| 4 | 19/01/2019 | 17.1 |
| 5 | 20/01/2019 | 21.2 |
```python
df2 = pd.DataFrame({
    'Date': ['15/01/2019', '17/01/2019', '18/01/2019',
             '20/01/2019', '21/01/2019', '22/01/2019'],
    'Price': [14.6, 19.8, 21.9, 20.2, 17.4, 18.0]
})
print(df2)
```
|   | Date | Price |
|---|---|---|
| 0 | 15/01/2019 | 14.6 |
| 1 | 17/01/2019 | 19.8 |
| 2 | 18/01/2019 | 21.9 |
| 3 | 20/01/2019 | 20.2 |
| 4 | 21/01/2019 | 17.4 |
| 5 | 22/01/2019 | 18.0 |
Here, we import the Pandas library as the first step. Then, we create the two data frames “df1” and “df2”, each containing a “Date” column and a “Price” column.

Now that we have created these data frames, we can perform our first merge_ordered() operation:
```python
pd.merge_ordered(df1, df2, on='Date')
```
|   | Date | Price_x | Price_y |
|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | 14.6 |
| 1 | 16/01/2019 | 18.4 | NaN |
| 2 | 17/01/2019 | 20.0 | 19.8 |
| 3 | 18/01/2019 | 19.3 | 21.9 |
| 4 | 19/01/2019 | 17.1 | NaN |
| 5 | 20/01/2019 | 21.2 | 20.2 |
| 6 | 21/01/2019 | NaN | 17.4 |
| 7 | 22/01/2019 | NaN | 18.0 |
We apply the merge_ordered() function and pass in the two data frames as the first two arguments, because these are the data frames that we want to merge. The third argument is the “on” parameter, which expects the column (or a list of columns) that we want to perform the merge on. We choose the “Date” column here.
The resulting data frame is longer than either of the two initial data frames. That’s because, by default, the merge_ordered() function performs a so-called “outer” join, which uses the union of keys from both data frames. Since there are eight unique dates across both frames, the resulting data frame has eight rows in total.
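We can double-check this row count ourselves by computing the union of the date keys with plain Python set operations; this is a small sanity check added here for illustration, not part of the original example:

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['15/01/2019', '16/01/2019', '17/01/2019',
             '18/01/2019', '19/01/2019', '20/01/2019'],
    'Price': [16.7, 18.4, 20.0, 19.3, 17.1, 21.2]
})
df2 = pd.DataFrame({
    'Date': ['15/01/2019', '17/01/2019', '18/01/2019',
             '20/01/2019', '21/01/2019', '22/01/2019'],
    'Price': [14.6, 19.8, 21.9, 20.2, 17.4, 18.0]
})

# The outer join keeps one row per unique date across both frames,
# so the merged length equals the size of the union of the date sets.
union_of_dates = set(df1['Date']) | set(df2['Date'])
merged = pd.merge_ordered(df1, df2, on='Date')

print(len(union_of_dates))  # 8
print(len(merged))          # 8
```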
We also get two price columns: “Price_x” and “Price_y”. For each date, we get a price from the left data frame (“Price_x”) and from the right data frame (“Price_y”). A “NaN” value means that, for this specific date, we only have one price value. For example, for "16/01/2019", we do not get a “Price_y” value because this date is only found in the first data frame.
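To see which dates appear in only one of the two frames, we can compare the key sets directly (again a small illustrative check, not part of the original tutorial):

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['15/01/2019', '16/01/2019', '17/01/2019',
             '18/01/2019', '19/01/2019', '20/01/2019'],
    'Price': [16.7, 18.4, 20.0, 19.3, 17.1, 21.2]
})
df2 = pd.DataFrame({
    'Date': ['15/01/2019', '17/01/2019', '18/01/2019',
             '20/01/2019', '21/01/2019', '22/01/2019'],
    'Price': [14.6, 19.8, 21.9, 20.2, 17.4, 18.0]
})

# Dates present only in df1 produce NaN in "Price_y" after the merge,
# and dates present only in df2 produce NaN in "Price_x".
only_in_df1 = sorted(set(df1['Date']) - set(df2['Date']))
only_in_df2 = sorted(set(df2['Date']) - set(df1['Date']))

print(only_in_df1)  # ['16/01/2019', '19/01/2019']
print(only_in_df2)  # ['21/01/2019', '22/01/2019']
```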
The “fill_method” parameter
As we saw in the example above, there were some missing values labeled with “NaN”:
|   | Date | Price_x | Price_y |
|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | 14.6 |
| 1 | 16/01/2019 | 18.4 | NaN |
| 2 | 17/01/2019 | 20.0 | 19.8 |
| 3 | 18/01/2019 | 19.3 | 21.9 |
| 4 | 19/01/2019 | 17.1 | NaN |
| 5 | 20/01/2019 | 21.2 | 20.2 |
| 6 | 21/01/2019 | NaN | 17.4 |
| 7 | 22/01/2019 | NaN | 18.0 |
We can get rid of these “NaN” values by replacing each of them with the previous value in the same column (a forward fill). We achieve that by setting the “fill_method” parameter to “ffill”:
```python
pd.merge_ordered(df1, df2, on='Date', fill_method='ffill')
```
|   | Date | Price_x | Price_y |
|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | 14.6 |
| 1 | 16/01/2019 | 18.4 | 14.6 |
| 2 | 17/01/2019 | 20.0 | 19.8 |
| 3 | 18/01/2019 | 19.3 | 21.9 |
| 4 | 19/01/2019 | 17.1 | 21.9 |
| 5 | 20/01/2019 | 21.2 | 20.2 |
| 6 | 21/01/2019 | 21.2 | 17.4 |
| 7 | 22/01/2019 | 21.2 | 18.0 |
Now, we do not have any missing values anymore. For example, the “NaN” value in the “Price_y” column for the date "16/01/2019" was replaced with the previous value from that column (“14.6”).

If several consecutive values are missing, all of them get replaced by the last available value. For example, the last two values of the “Price_x” column were missing. They were both replaced by the value of the third-to-last row, which was "21.2".
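For this flat (ungrouped) merge, fill_method='ffill' behaves like performing the plain outer merge first and forward-filling the columns afterwards. The comparison below is a sketch for intuition only; when left_by or right_by is used, the filling happens per group instead:

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['15/01/2019', '16/01/2019', '17/01/2019',
             '18/01/2019', '19/01/2019', '20/01/2019'],
    'Price': [16.7, 18.4, 20.0, 19.3, 17.1, 21.2]
})
df2 = pd.DataFrame({
    'Date': ['15/01/2019', '17/01/2019', '18/01/2019',
             '20/01/2019', '21/01/2019', '22/01/2019'],
    'Price': [14.6, 19.8, 21.9, 20.2, 17.4, 18.0]
})

plain = pd.merge_ordered(df1, df2, on='Date')
filled = pd.merge_ordered(df1, df2, on='Date', fill_method='ffill')

# Forward-filling the plain outer merge column by column reproduces
# the fill_method='ffill' result for this ungrouped example.
manual = plain.ffill()
print(filled.equals(manual))
```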
The “suffixes” parameter
In the previous example, the two price columns were named “Price_x” and “Price_y” by default. However, we can change these labels by applying the “suffixes” parameter:
```python
pd.merge_ordered(df1, df2, on='Date', fill_method='ffill',
                 suffixes=['_leftDF', '_rightDF'])
```
|   | Date | Price_leftDF | Price_rightDF |
|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | 14.6 |
| 1 | 16/01/2019 | 18.4 | 14.6 |
| 2 | 17/01/2019 | 20.0 | 19.8 |
| 3 | 18/01/2019 | 19.3 | 21.9 |
| 4 | 19/01/2019 | 17.1 | 21.9 |
| 5 | 20/01/2019 | 21.2 | 20.2 |
| 6 | 21/01/2019 | 21.2 | 17.4 |
| 7 | 22/01/2019 | 21.2 | 18.0 |
We performed the same merge_ordered() operation as before, but this time we added the “suffixes” parameter and assigned it a list containing the strings “_leftDF” and “_rightDF”. The two price columns in the resulting data frame are now called “Price_leftDF” and “Price_rightDF”.

As the name of the parameter suggests, we only change the suffixes here, not the whole label. The column labels still start with “Price” because that was the initial column label, and we only append the suffixes to it.
The different kinds of joins
As mentioned in the introduction, by default the merge_ordered() function performs an outer join, which takes the union of keys from both data frames. But we can change that by using the “how” parameter.
Another type of join is the “inner” join, which uses the intersection of keys from both data frames:

```python
pd.merge_ordered(df1, df2, on='Date', fill_method='ffill', how='inner')
```
|   | Date | Price_x | Price_y |
|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | 14.6 |
| 1 | 17/01/2019 | 20.0 | 19.8 |
| 2 | 18/01/2019 | 19.3 | 21.9 |
| 3 | 20/01/2019 | 21.2 | 20.2 |
That means we only get the dates that are found in both data frames.

The remaining two options of the “how” parameter are the “left” join and the “right” join. The left join uses only the keys from the left data frame:

```python
pd.merge_ordered(df1, df2, on='Date', fill_method='ffill', how='left')
```
|   | Date | Price_x | Price_y |
|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | 14.6 |
| 1 | 16/01/2019 | 18.4 | 14.6 |
| 2 | 17/01/2019 | 20.0 | 19.8 |
| 3 | 18/01/2019 | 19.3 | 21.9 |
| 4 | 19/01/2019 | 17.1 | 21.9 |
| 5 | 20/01/2019 | 21.2 | 20.2 |
The right join, in turn, only uses the keys from the right data frame:

```python
pd.merge_ordered(df1, df2, on='Date', fill_method='ffill', how='right')
```
|   | Date | Price_x | Price_y |
|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | 14.6 |
| 1 | 17/01/2019 | 20.0 | 19.8 |
| 2 | 18/01/2019 | 19.3 | 21.9 |
| 3 | 20/01/2019 | 21.2 | 20.2 |
| 4 | 21/01/2019 | 21.2 | 17.4 |
| 5 | 22/01/2019 | 21.2 | 18.0 |
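To summarize how the four join types differ, we can compare the number of rows each one produces (a small illustrative loop, not part of the original tutorial):

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['15/01/2019', '16/01/2019', '17/01/2019',
             '18/01/2019', '19/01/2019', '20/01/2019'],
    'Price': [16.7, 18.4, 20.0, 19.3, 17.1, 21.2]
})
df2 = pd.DataFrame({
    'Date': ['15/01/2019', '17/01/2019', '18/01/2019',
             '20/01/2019', '21/01/2019', '22/01/2019'],
    'Price': [14.6, 19.8, 21.9, 20.2, 17.4, 18.0]
})

# outer = union of keys, inner = intersection,
# left/right = keys of the respective frame only.
row_counts = {}
for how in ['outer', 'inner', 'left', 'right']:
    result = pd.merge_ordered(df1, df2, on='Date', how=how)
    row_counts[how] = len(result)

print(row_counts)  # {'outer': 8, 'inner': 4, 'left': 6, 'right': 6}
```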
Grouping by group columns
For this section, we will modify “df1” a little bit:

```python
df1['Category'] = ['A', 'A', 'A', 'B', 'B', 'B']
print(df1)
```
|   | Date | Price | Category |
|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | A |
| 1 | 16/01/2019 | 18.4 | A |
| 2 | 17/01/2019 | 20.0 | A |
| 3 | 18/01/2019 | 19.3 | B |
| 4 | 19/01/2019 | 17.1 | B |
| 5 | 20/01/2019 | 21.2 | B |
We added a column called “Category” and assigned the category “A” or “B” to each row. “df2” remains unchanged:
|   | Date | Price |
|---|---|---|
| 0 | 15/01/2019 | 14.6 |
| 1 | 17/01/2019 | 19.8 |
| 2 | 18/01/2019 | 21.9 |
| 3 | 20/01/2019 | 20.2 |
| 4 | 21/01/2019 | 17.4 |
| 5 | 22/01/2019 | 18.0 |
Now, we apply the “left_by” parameter and assign it the column label “Category”:

```python
pd.merge_ordered(df1, df2, on='Date', left_by="Category")
```
|    | Date | Price_x | Category | Price_y |
|---|---|---|---|---|
| 0 | 15/01/2019 | 16.7 | A | 14.6 |
| 1 | 16/01/2019 | 18.4 | A | NaN |
| 2 | 17/01/2019 | 20.0 | A | 19.8 |
| 3 | 18/01/2019 | NaN | A | 21.9 |
| 4 | 20/01/2019 | NaN | A | 20.2 |
| 5 | 21/01/2019 | NaN | A | 17.4 |
| 6 | 22/01/2019 | NaN | A | 18.0 |
| 7 | 15/01/2019 | NaN | B | 14.6 |
| 8 | 17/01/2019 | NaN | B | 19.8 |
| 9 | 18/01/2019 | 19.3 | B | 21.9 |
| 10 | 19/01/2019 | 17.1 | B | NaN |
| 11 | 20/01/2019 | 21.2 | B | 20.2 |
| 12 | 21/01/2019 | NaN | B | 17.4 |
| 13 | 22/01/2019 | NaN | B | 18.0 |
This way, we group the left data frame by the “Category” column and merge it piece by piece with the right data frame.

When we look at the resulting data frame, we can observe that, for example, the “Price_x” entry for the date "18/01/2019" in row 3 is “NaN”, although there is an entry for that date in “df1”. However, in “df1”, that date is assigned to category “B”. So, in the merged data frame, the “Price_x” value for the date "18/01/2019" is found in row 9, which belongs to category “B”.
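We can verify this directly by selecting the rows for "18/01/2019" from the merged result (a small check added here for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['15/01/2019', '16/01/2019', '17/01/2019',
             '18/01/2019', '19/01/2019', '20/01/2019'],
    'Price': [16.7, 18.4, 20.0, 19.3, 17.1, 21.2],
    'Category': ['A', 'A', 'A', 'B', 'B', 'B']
})
df2 = pd.DataFrame({
    'Date': ['15/01/2019', '17/01/2019', '18/01/2019',
             '20/01/2019', '21/01/2019', '22/01/2019'],
    'Price': [14.6, 19.8, 21.9, 20.2, 17.4, 18.0]
})

result = pd.merge_ordered(df1, df2, on='Date', left_by="Category")

# "18/01/2019" exists in df1 only under category "B": the category-A row
# therefore has NaN in "Price_x", while the category-B row carries 19.3.
subset = result[result['Date'] == '18/01/2019']
print(subset[['Date', 'Category', 'Price_x']])
```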
If we had a group column in the right data frame, we could do the same with the “right_by” parameter.
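As a sketch of that case, we can attach a hypothetical “Category” column to “df2” (the values below are made up purely for illustration) and group the right frame instead:

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['15/01/2019', '16/01/2019', '17/01/2019',
             '18/01/2019', '19/01/2019', '20/01/2019'],
    'Price': [16.7, 18.4, 20.0, 19.3, 17.1, 21.2]
})
df2 = pd.DataFrame({
    'Date': ['15/01/2019', '17/01/2019', '18/01/2019',
             '20/01/2019', '21/01/2019', '22/01/2019'],
    'Price': [14.6, 19.8, 21.9, 20.2, 17.4, 18.0],
    # Hypothetical grouping, added only for this illustration.
    'Category': ['A', 'A', 'B', 'B', 'B', 'B']
})

# Group the right data frame by "Category" and merge each piece with df1.
result = pd.merge_ordered(df1, df2, on='Date', right_by="Category")
print(result)
```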
Summary
All in all, we learned how to use the Pandas function merge_ordered(). We saw how to apply its various parameters, how to use the different types of joins, and how to group by group columns.
For more tutorials about Pandas, Python libraries, Python in general, or other computer science-related topics, check out the Finxter Blog page.
Happy Coding!