In this tutorial we will unearth the solutions to three commonly asked questions that users come across while dealing with huge sets of data.
Problem Formulation
Given: Consider the following csv file (Note: You need to use it as a Pandas DataFrame).

import pandas as pd df = pd.read_csv('countries.csv') print(df)
Country Capital Population Area 0 Germany Berlin 84,267,549 348,560 1 France Paris 65,534,239 547,557 2 Spain Madrid 46,787,468 498,800 3 Italy Rome 60,301,346 294,140 4 India Delhi 1,404,495,187 2,973,190 5 USA Washington 334,506,463 9,147,420 6 China Beijing 1,449,357,022 9,388,211 7 Poland Warsaw 37,771,789 306,230 8 Russia Moscow 146,047,418 16,376,870 9 England London 68,529,747 241,930
Here’s the list of the questions that we will be focusing upon in this article:
- How to get the last N rows of a Pandas DataFrame?
- How to get last N rows from last N columns of a Pandas DataFrame?
- How to read last N rows of a large csv file in Pandas?
Recommended Read: How to Select Rows From a DataFrame Based on Column Values?
Without further delay, let us dive into the solutions to the first question and learn how to get the last N rows of a Pandas DataFrame.
Method 1: Using iloc
Approach: Use the iloc
property as pandas.DataFrame.iloc[-n:]
.
The iloc
property is used to get or set the values of specified indices. Select the last n rows using the square bracket notation syntax [-n:] with the iloc
property. Here, -n represents the index of the last n rows of the given pandas DataFrame.
Code:
import pandas as pd df = pd.read_csv('countries.csv') rows = df.iloc[-5:] print(rows)
Output:
Country Capital Population Area 5 USA Washington 334,506,463 9,147,420 6 China Beijing 1,449,357,022 9,388,211 7 Poland Warsaw 37,771,789 306,230 8 Russia Moscow 146,047,418 16,376,870 9 England London 68,529,747 241,930
Method 2: Using tail()
Approach: Use the pandas.DataFrame.tail(n)
to select the last n rows of the given DataFrame.
The tail(n)
method returns n number of methods from the bottom end of the DataFrame. Here, n represents an integer that denotes the number of rows you want to fetch from the bottom end of the DataFrame.
Code:
import pandas as pd df = pd.read_csv('countries.csv') rows = df.tail(5) print(rows)
Output:
Country Capital Population Area 5 USA Washington 334,506,463 9,147,420 6 China Beijing 1,449,357,022 9,388,211 7 Poland Warsaw 37,771,789 306,230 8 Russia Moscow 146,047,418 16,376,870 9 England London 68,529,747 241,930
Well, that brings us to the next question in line – “How to get the last N rows from last N columns of a Pandas DataFrame?”
Method 1: Integer Based Indexing
Approach: Call pandas.DataFrame.iloc[-n:, -m:]
to display last n rows from the last m columns of the given DataFrame.
Code: In the following code snippet we will fetch the last 5 rows from the last 2 columns, i.e., Population and Area.
import pandas as pd df = pd.read_csv('countries.csv') rows = df.iloc[-5:, -2:] print(rows)
Output:
Population Area 5 334,506,463 9,147,420 6 1,449,357,022 9,388,211 7 37,771,789 306,230 8 146,047,418 16,376,870 9 68,529,747 241,930
Method 2: Name Based Indexing
In case, you happen to know the names of the specific columns and you want to get the last N records from the DataFrame from those columns then you can follow a two step process.
- Call the
Pandas.DataFrame.loc(:, 'start_column_name':'end_column_name')
selector. It allows you to use slicing on column names instead of integer identifiers which can be more comfortable. .loc
is for label based indexing. Hence, the negative indices are not found and reindexed toNaN
. Thus, to deal with this you have to use thetail()
method to extract the last N records from the selected columns.
Code: The following code snippet shows how you can use the column names and fetch the corresponding values from the last 5 rows of the given Dataframe.
import pandas as pd df = pd.read_csv('countries.csv') rows = df.loc[:, 'Population':'Area'] print(rows.tail(5))
Output:
Population Area 5 334,506,463 9,147,420 6 1,449,357,022 9,388,211 7 37,771,789 306,230 8 146,047,418 16,376,870 9 68,529,747 241,930
Last but not least, let us solve the third and final problem of today’s tutorial – “How to read last N rows of a large csv file in Pandas?”
Unfortunately, read_csv()
does not facilitate us with any parameter that allows you to directly read the last N lines from a file. This can be a troublesome issue to handle when you are dealing with large datasets.
Thus, a workaround to this problem is to first find out the total number of lines/records in the file. Then use the skiprows
parameter to directly jump to the row/line from which you want to select the records.
Code: In the following code snippet we will fetch the first 5 rows from the csv file into our DataFrame.
import pandas as pd def num_of_lines(fname): with open(fname) as f: for i, _ in enumerate(f): pass return i + 1 num_lines = num_of_lines("countries.csv") n = 5 df = pd.read_csv("countries.csv", skiprows=range(1, num_lines - n)) print(df)
Output:
Country Capital Population Area 0 USA Washington 334,506,463 9,147,420 1 China Beijing 1,449,357,022 9,388,211 2 Poland Warsaw 37,771,789 306,230 3 Russia Moscow 146,047,418 16,376,870 4 England London 68,529,747 241,930
Conclusion
Phew! We have successfully solved all the problems that were presented to us in this tutorial. I hope this tutorial helped you to sharpen your coding skills. Please stay tuned and subscribe for more interesting coding problems.
Recommended Reads:
- Pandas DataFrame head() and tail() Method
- Delete Column from Pandas DataFrame
- Change Column Type in Pandas
Learn Pandas the Fun Way by Solving Code Puzzles
If you want to boost your Pandas skills, consider checking out my puzzle-based learning book Coffee Break Pandas (Amazon Link).
It contains 74 hand-crafted Pandas puzzles including explanations. By solving each puzzle, you’ll get a score representing your skill level in Pandas. Can you become a Pandas Grandmaster?
Coffee Break Pandas offers a fun-based approach to data science mastery—and a truly gamified learning experience.