💡 Problem Formulation: A common task is extracting a single row of a DataFrame as a Python list. Assume you have a DataFrame df and want to retrieve the nth row as a list, preserving the order of the columns.
Method 1: Using iloc and the values attribute
The simplest method is to use the iloc indexer to select the row by position, and then the values attribute to get that row as a NumPy array. You can then convert this array to a list with tolist().
A minimal example:
import pandas as pd
# Sample DataFrame
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
# Convert the first row to a list
row_as_list = df.iloc[0].values.tolist()
print(row_as_list) # Output: [1, 3, 5]
Here, df.iloc[0] selects the first row of the DataFrame, values converts it into a NumPy array, and finally, tolist() is called on the array to get a list.
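As a small sketch of the same idea, the snippet below wraps Method 1 in a helper and shows one side effect worth knowing: the row is returned as a single Series, so numeric columns of different types are upcast to a common dtype before the list is built. The helper name row_to_list and the mixed-type DataFrame are illustrative assumptions, not part of the original example.

```python
import pandas as pd


def row_to_list(df, n):
    """Return the n-th row (by position) of df as a plain Python list."""
    # Hypothetical helper for illustration; iloc is positional, so n=0 is the first row.
    return df.iloc[n].values.tolist()


# Same sample DataFrame as above: every column holds integers
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
first_row = row_to_list(df, 0)
last_row = row_to_list(df, -1)  # negative positions count from the end
print(first_row)  # [1, 3, 5]
print(last_row)   # [2, 4, 6]

# Assumed variation: one integer column and one float column
mixed = pd.DataFrame({'A': [1, 2], 'B': [3.5, 4.5]})
mixed_row = row_to_list(mixed, 0)
print(mixed_row)  # [1.0, 3.5] (the integer was upcast to float)
```

If the upcast matters for your data, reading each column separately avoids it.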
Method 2: Using a List Comprehension
Another approach is to employ a list comprehension that iterates over the columns and extracts the value at a particular row index.
Code Example:
# Continuation of Method 1's sample DataFrame
# Convert the first row to a list using a list comprehension
row_as_list = [df[col].iloc[0] for col in df.columns]
print(row_as_list) # Output: [1, 3, 5]
Here, we loop through each column in df.columns and select the first row value using df[col].iloc[0]. This creates a list with the elements of the first row. The elements are NumPy scalars rather than built-in Python integers, so NumPy 2.0 and later print the list as [np.int64(1), np.int64(3), np.int64(5)].
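Because each value is read from its own column, this approach keeps every column's dtype instead of upcasting the whole row. The sketch below illustrates that with an assumed mixed integer and float DataFrame, then converts the NumPy scalars to built-in Python numbers with .item(). It assumes all columns hold NumPy-backed numeric values.

```python
import pandas as pd

# Assumed sample data: one integer column and one float column
df = pd.DataFrame({'A': [1, 2], 'B': [3.5, 4.5]})

# Each value is read from its own column, so the column dtype is preserved
row_scalars = [df[col].iloc[0] for col in df.columns]
scalar_types = [type(v).__name__ for v in row_scalars]
print(scalar_types)  # ['int64', 'float64']

# .item() turns a NumPy scalar into the matching built-in Python type.
# This assumes every column holds NumPy-backed numeric values.
row_as_list = [v.item() for v in row_scalars]
print(row_as_list)  # [1, 3.5]
```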
Method 3: Using the to_numpy() method
The pandas documentation recommends the to_numpy() method over the values attribute. It also converts the selected row to a NumPy array, which can then be converted to a list as demonstrated previously.
Here’s an example:
# Continuation of Method 1's sample DataFrame
# Convert the first row to a list using the to_numpy and tolist methods
row_as_list = df.iloc[0].to_numpy().tolist()
print(row_as_list) # Output: [1, 3, 5]
The to_numpy() method directly converts the DataFrame row to a NumPy array. The tolist() method is then called to convert that array into a list.
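One practical difference from values is that to_numpy() accepts arguments. The sketch below checks that both give equal arrays for the sample row, then uses the dtype parameter to cast during conversion. The choice of 'float64' is only an example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
row = df.iloc[0]

# For this integer row, values and to_numpy() give equal arrays
same = np.array_equal(row.values, row.to_numpy())
print(same)  # True

# to_numpy() can also cast while converting
row_as_floats = row.to_numpy(dtype='float64').tolist()
print(row_as_floats)  # [1.0, 3.0, 5.0]
```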
Method 4: Using the at[] accessor
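If you build the list element by element, you can use the at[] accessor in the list comprehension. This is similar to Method 2, but at[] is designed for fast access to a single element. Note that at[] looks rows up by label, not by position, so df.at[0, col] returns the first row only when that row is labelled 0, as it is with the default index.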
Here’s an example:
# Continuation of Method 1's sample DataFrame
# Convert the first row to a list using the at accessor and a list comprehension
row_as_list = [df.at[0, col] for col in df.columns]
print(row_as_list) # Output: [1, 3, 5]
The at[] accessor efficiently retrieves a single value from a specified row/column label pair. We iterate over the columns and use df.at[0, col] to construct the list with the first row elements. The elements are NumPy scalars, so NumPy 2.0 and later print them as np.int64(1) rather than 1.
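Since at[] works with labels, it helps to see what happens when the row labels are not 0, 1, 2 and so on. The sketch below uses an assumed custom index and compares at[] with iat[], its positional counterpart.

```python
import pandas as pd

# Assumed variation of the sample data: row labels are 10 and 20, not 0 and 1
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]}, index=[10, 20])

has_label_zero = 0 in df.index
print(has_label_zero)  # False: there is no row labelled 0

# at[] needs the row label
by_label = [df.at[10, col] for col in df.columns]

# iat[] takes integer positions for both row and column
by_position = [df.iat[0, j] for j in range(df.shape[1])]

print(by_label == by_position)  # True
```

When "the nth row" means a position rather than a label, iat[] is the safer choice of the two.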
Summary/Discussion
Each method serves the same purpose but is suited to different scenarios and preferences.
The iloc with values approach is straightforward, and to_numpy() does the same job while being the method the pandas documentation recommends.
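The list comprehension approaches, on the other hand, read each column separately, which gives more explicit control. Using at[] keeps each single-element lookup cheap, but whether that beats the iloc approaches depends on the DataFrame, so it is worth measuring.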
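To compare the methods on your own data, a rough timing sketch like the one below can help. It uses the sample DataFrame and Python's timeit module. The repetition count of 1000 is an arbitrary assumption, and the timings depend on your machine, your pandas version, and the shape of the DataFrame, so no expected numbers are shown.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

# One callable per method from this article
methods = {
    'iloc + values': lambda: df.iloc[0].values.tolist(),
    'list comprehension': lambda: [df[col].iloc[0] for col in df.columns],
    'iloc + to_numpy': lambda: df.iloc[0].to_numpy().tolist(),
    'at[] comprehension': lambda: [df.at[0, col] for col in df.columns],
}

# Check that every method returns the same row before timing anything
results = {name: func() for name, func in methods.items()}

for name, func in methods.items():
    seconds = timeit.timeit(func, number=1000)
    print(f'{name}: {seconds:.4f} s for 1000 calls')
```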
