Challenge: Given a Pandas DataFrame, how do you find the longest string in a given DataFrame column?
    import pandas as pd

    df = pd.DataFrame(['a', 'aaa', 'aaaaa'], columns=['A'])
    print(df)
    #        A
    # 0      a
    # 1    aaa
    # 2  aaaaa   <-- This is what we want!
We’re going to discuss different variants of this problem. Let’s get started with the easiest one!
Method 1: Find Length of Longest String in DataFrame Column
To find the length of the longest string in a DataFrame column, use the expression df.COL.str.len().max(), replacing COL with your custom column name.
    import pandas as pd

    df = pd.DataFrame(['a', 'aaa', 'aaaaa'], columns=['A'])
    print(df.A.str.len().max())
    # 5
This is how the expression df.COL.str.len().max() works step by step:
- df.COL accesses the column COL of your DataFrame df.
- df.COL.str provides you with different string methods to apply to this column.
- df.COL.str.len() converts the column strings to integer length values where each string is converted to its length.
- df.COL.str.len().max() gets the maximum column value, i.e., the length of the longest string.
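The steps above can be inspected one at a time. The following is a minimal sketch using the same example DataFrame; the variable names lengths and longest are only illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame(['a', 'aaa', 'aaaaa'], columns=['A'])

# Step 1-3: access the column, use the .str accessor, compute each length
lengths = df.A.str.len()
print(lengths.tolist())
# [1, 3, 5]

# Step 4: take the maximum of those lengths
longest = lengths.max()
print(longest)
# 5
```

Keeping the intermediate Series in a variable is handy if you need both the maximum length and the per-row lengths later on.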
Method 2: Find Index of Longest String in DataFrame Column
To find the index of the longest string in a DataFrame column, use the expression df.COL.str.len().idxmax(), replacing COL with your custom column name.
    import pandas as pd

    df = pd.DataFrame(['a', 'aaa', 'aaaaa'], columns=['A'])
    print(df.A.str.len().idxmax())
    # 2
This is how the expression df.COL.str.len().idxmax() works step by step:
- df.COL accesses the column COL of your DataFrame df.
- df.COL.str provides you with different string methods to apply to this column.
- df.COL.str.len() converts the column strings to integer length values where each string is converted to its length.
- df.COL.str.len().idxmax() gets the index of the maximum column value, i.e., the index of the longest string in the column.
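Two details of idxmax() are easy to miss: it returns the index label rather than the row position, and if several strings share the maximum length, it returns the label of the first one. Here is a small sketch with a made-up DataFrame (the labels 'x', 'y', 'z' and the strings are assumptions chosen for illustration):

```python
import pandas as pd

# Hypothetical data: custom index labels and a tie for the longest string
df = pd.DataFrame({'A': ['bb', 'cccc', 'dddd']}, index=['x', 'y', 'z'])

idx = df.A.str.len().idxmax()
print(idx)
# y
```

Both 'cccc' and 'dddd' have length 4, so the label of the first of them, 'y', is returned.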
Method 3: Get Longest String in DataFrame Column
To get the longest string in a DataFrame column, first get the index of that string in the column using df.COL.str.len().idxmax(), replacing COL with your custom column name. Then use normal indexing such as df['COL'][idx] to access the value at index idx in column 'COL'.
    import pandas as pd

    df = pd.DataFrame(['a', 'aaa', 'aaaaa'], columns=['A'])

    # 1. Get index of longest string in column
    idx = df.A.str.len().idxmax()
    # Index: 2

    # 2. Get longest string using df['A'][idx]
    print('Longest string in column:', df['A'][idx])
    # Longest string in column: aaaaa
This is how the expressions df.COL.str.len().idxmax() and df['COL'][idx] work step by step:
- df.COL accesses the column COL of your DataFrame df.
- df.COL.str provides you with different string methods to apply to this column.
- df.COL.str.len() converts the column strings to integer length values where each string is converted to its length.
- df.COL.str.len().idxmax() gets the index of the maximum column value, i.e., the index of the longest string in the column.
- df['A'][idx] gets the value of column 'A' at index idx.
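As a variation on the lookup above, the value can also be fetched with df.loc[idx, 'A'] instead of the chained df['A'][idx], and a boolean mask returns every string that shares the maximum length. This is only a sketch under assumed data; the tie between 'aaaaa' and 'bbbbb' and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical data with two equally long strings
df = pd.DataFrame(['a', 'aaaaa', 'bbbbb'], columns=['A'])

lengths = df['A'].str.len()

# First longest string, looked up by index label and column name
first_longest = df.loc[lengths.idxmax(), 'A']
print(first_longest)
# aaaaa

# All strings whose length equals the maximum length
all_longest = df.loc[lengths == lengths.max(), 'A'].tolist()
print(all_longest)
# ['aaaaa', 'bbbbb']
```

The mask-based version is useful when ties matter, since idxmax() alone reports only the first match.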