To filter a list in Python, you can use the built-in `filter()` function.
- The first argument is the filtering condition, defined as a function. This function is often created on the fly with a lambda expression.
- The second argument is the iterable to be filtered. The function is called on each element of the iterable to decide whether that element passes the filter.

The `filter()` function returns an iterator over the elements that pass the filtering condition.
```python
lst = [1, 2, 3, 4, 5]

# Filter all elements <3
my_list = filter(lambda x: x<3, lst)
print(list(my_list))
# [1, 2]
```
Syntax
The `filter()` function has the following syntax:
```python
filter(function, iterable)
```
Argument | Description |
---|---|
function | Often a lambda function. It assigns a Boolean value to each element in the iterable to check whether the element will pass the filter or not. |
iterable | Iterable from which to draw the elements to be filtered. |
Return Value | Iterator of filtered elements that pass the test |
You can use a `lambda` expression to create the filtering function right where you pass it as an argument. The syntax is `lambda x: expression`: `x` is the input argument, and the value of `expression` is returned as the result (the expression may or may not use `x` to compute the return value).
For more information, see my detailed blog article about the lambda function.
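As a rough illustration of that syntax, the sketch below compares a lambda with an equivalent named function. The name `is_small` is hypothetical and introduced only for this comparison.

```python
# Hypothetical named predicate, equivalent to: lambda x: x < 3
def is_small(x):
    return x < 3

lst = [1, 2, 3, 4, 5]

# Both calls apply the same condition to every element of lst.
with_lambda = list(filter(lambda x: x < 3, lst))
with_def = list(filter(is_small, lst))

print(with_lambda)  # [1, 2]
print(with_def)     # [1, 2]
```

The lambda form saves a separate definition when the condition is used only once.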
```python
lst = [8, 2, 6, 4, 3, 1]

# Filter all elements <8
small = filter(lambda x: x<8, lst)
print(list(small))

# Filter all even elements
even = filter(lambda x: x%2==0, lst)
print(list(even))

# Filter all odd elements
odd = filter(lambda x: x%2, lst)
print(list(odd))
```
The output is:
```python
# Elements <8
[2, 6, 4, 3, 1]

# Even Elements
[8, 2, 6, 4]

# Odd Elements
[3, 1]
```
The `filter()` function returns a filter object, which is an iterator. To convert it to a list, you use the `list(...)` constructor.
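Because the filter object is an iterator, it is evaluated lazily and can be consumed only once. The following minimal sketch (variable names are illustrative) shows what happens when it is converted twice.

```python
lst = [8, 2, 6, 4, 3, 1]

even = filter(lambda x: x % 2 == 0, lst)

# The first conversion consumes the iterator...
first_pass = list(even)
# ...so a second conversion finds nothing left.
second_pass = list(even)

print(first_pass)   # [8, 2, 6, 4]
print(second_pass)  # []
```

If the filtered elements are needed more than once, it is usually simplest to store the list rather than the filter object.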
For a detailed guide that includes a performance evaluation, see the related article "How to Filter a List in Python?" on the Finxter blog.
[Intermediate] Example Filter Out Even Values with Lambda
The `filter(function, iterable)` function takes a filter function as its first argument. The filter function takes one list element as input and returns the Boolean value `True` if the condition is met or `False` otherwise. It decides whether an element is included in the filtered list.
To define this function, you can use the `lambda` keyword. A lambda function is an anonymous function: think of it as a throw-away function that is needed only as an argument and nowhere else in the code.
The following code uses a lambda function to filter a list and return only its odd values:
```python
# Create the list
lst = [1, 2, 3, 4]

# Get all odd values
print(list(filter(lambda x: x%2, lst)))
# [1, 3]
```
The lambda function `lambda x: x%2` takes one argument `x`, the element to be checked against the filter, and returns the result of the expression `x%2`. This modulo expression returns 1 if the integer is odd and 0 if it is even. Because `filter()` treats 1 as true and 0 as false, all odd elements pass the test.
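`filter()` keeps an element whenever the function's return value is truthy, so the function does not have to return a literal `True` or `False`. A small sketch, assuming a list of non-negative integers like the one in the example above:

```python
lst = [1, 2, 3, 4]

# x % 2 evaluates to 1 (truthy) for odd and 0 (falsy) for even integers.
remainders = [x % 2 for x in lst]
print(remainders)  # [1, 0, 1, 0]

# Relying on truthiness and comparing explicitly give the same result.
odd_truthy = list(filter(lambda x: x % 2, lst))
odd_explicit = list(filter(lambda x: x % 2 == 1, lst))
print(odd_truthy)    # [1, 3]
print(odd_explicit)  # [1, 3]
```

The explicit comparison is a little longer but states the intent more directly.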
[Advanced] Example Lambda Filtering
This example is drawn from my book Python One-Liners (see below).
Real-world data is noisy. But as a data scientist, you get paid to get rid of the noise, make the data accessible, and create meaning. Thus, filtering data is vital for real-world data science tasks.
In this article, you’ll learn how to create a minimal filter function in a single line of code. I first give you the code and explain the basics afterward.
```python
# Option 1
my_list = [x for x in my_list if x.attribute == value]

# Option 2
my_list = filter(lambda x: x.attribute == value, my_list)
```
A popular Stack Overflow answer discusses which of the solutions is better. In my opinion, the first option is better because list comprehension is very efficient, there are no function calls, and it has fewer characters.
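To make the two options concrete, here is a sketch with a hypothetical `Book` record whose `genre` field stands in for `x.attribute`. Neither the record type nor the sample data comes from the original example.

```python
from collections import namedtuple

# Hypothetical record type; 'genre' stands in for x.attribute.
Book = namedtuple("Book", ["title", "genre"])

my_list = [
    Book("Title A", "fantasy"),
    Book("Title B", "science"),
    Book("Title C", "fantasy"),
]
value = "fantasy"

# Option 1: the list comprehension produces a list directly.
option_1 = [x for x in my_list if x.genre == value]

# Option 2: filter() returns an iterator, so wrap it in list().
option_2 = list(filter(lambda x: x.genre == value, my_list))

print(option_1 == option_2)  # True
```

Both options select the same elements. The difference is that Option 2 needs the extra `list(...)` call to produce a list.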
So how to create a function in one line? The lambda function is your friend! Lambda functions are anonymous functions that can be defined in a single line of code. If you want to learn more about lambda functions, check out this 3-min article.
lambda <arguments> : <expression>
You define a comma-separated list of arguments that serve as an input. The lambda function then evaluates the expression and returns the result of the expression.
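As a short sketch of this syntax, the lambdas below take zero, one, and two arguments. The names are chosen only for illustration.

```python
# No arguments: the expression is evaluated on each call.
constant = lambda: 42

# One argument.
square = lambda x: x * x

# Two comma-separated arguments.
above = lambda rating, threshold: rating > threshold

print(constant())       # 42
print(square(3))        # 9
print(above(4.6, 3.9))  # True
```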
Without further discussion of the basics, let’s explore how to solve the following data science problem by creating a filter function using the lambda function definition.
Consider the following problem: “Create a filter function that takes a list of books x and a minimal rating y and returns a list of potential bestsellers whose rating y’ is higher than the minimal rating, that is, y’ > y.”
```python
## Dependencies
import numpy as np

## Data (row = [title, rating])
books = np.array([['Coffee Break NumPy', 4.6],
                  ['Lord of the Rings', 5.0],
                  ['Harry Potter', 4.3],
                  ['Winnie Pooh', 3.9],
                  ['The Clown of God', 2.2],
                  ['Coffee Break Python', 4.7]])

## One-liner
predict_bestseller = lambda x, y : x[x[:,1].astype(float) > y]

## Results
print(predict_bestseller(books, 3.9))
```
Take a guess, what’s the output of this code snippet?
The data consists of a two-dimensional NumPy array where each row holds the name of the book title and the average user rating (a floating point number between 0.0 and 5.0). There are six different books in the rated data set.
The goal is to create a filter function which takes as input such a book rating data set x and a threshold rating y, and returns a sequence of books so that the books have a higher rating than the threshold y.
The one-liner achieves this objective by defining an anonymous lambda function that simply returns the result of the following expression:
x[x[:,1].astype(float) > y]
The array “x” is assumed to have a shape like our book rating array “books”.
First, we carve out the second column, which holds the book ratings, and convert it to a float array using the astype(float) method on the NumPy array “x”. This is necessary because a NumPy array holds a single data type: when the array is created from mixed strings and floats, every entry, including the ratings, is stored as a string.
Second, we create a Boolean array which holds the value “True” if the book at the respective row index has a rating larger than “y”. Note that the float “y” is implicitly broadcast to a new NumPy array so that both operands of the comparison operator “>” have the same shape.
Third, we use the Boolean array as an indexing array on the original book rating array to carve out all the books that have above-threshold ratings.
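The three steps can be traced one at a time. The sketch below uses a shortened version of the `books` array. The intermediate names `ratings` and `mask` are introduced here for illustration and do not appear in the one-liner.

```python
import numpy as np

# Mixed input is coerced to a single string dtype by np.array.
books = np.array([['Coffee Break NumPy', 4.6],
                  ['Winnie Pooh', 3.9],
                  ['Coffee Break Python', 4.7]])
y = 3.9

# Step 1: take the rating column and convert it from strings to floats.
ratings = books[:, 1].astype(float)

# Step 2: compare against the threshold to get a Boolean mask.
mask = ratings > y

# Step 3: use the mask to select the matching rows.
result = books[mask]

print(mask.tolist())          # [True, False, True]
print(result[:, 0].tolist())  # ['Coffee Break NumPy', 'Coffee Break Python']
```

A book rated exactly at the threshold (3.9 here) is excluded because the comparison is strict.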
The result of this one-liner is the following array:
```python
## Results
print(predict_bestseller(books, 3.9))
"""
[['Coffee Break NumPy' '4.6']
 ['Lord of the Rings' '5.0']
 ['Harry Potter' '4.3']
 ['Coffee Break Python' '4.7']]
"""
```
Python One-Liners Book: Master the Single Line First!
Python programmers will improve their computer science skills with these useful one-liners.
Python One-Liners will teach you how to read and write “one-liners”: concise statements of useful functionality packed into a single line of code. You’ll learn how to systematically unpack and understand any line of Python code, and write eloquent, powerfully compressed Python like an expert.
The book’s five chapters cover (1) tips and tricks, (2) regular expressions, (3) machine learning, (4) core data science topics, and (5) useful algorithms.
Detailed explanations of one-liners introduce key computer science concepts and boost your coding and analytical skills. You’ll learn about advanced Python features such as list comprehension, slicing, lambda functions, regular expressions, map and reduce functions, and slice assignments.
You’ll also learn how to:
- Leverage data structures to solve real-world problems, like using Boolean indexing to find cities with above-average pollution
- Use NumPy basics such as array, shape, axis, type, broadcasting, advanced indexing, slicing, sorting, searching, aggregating, and statistics
- Calculate basic statistics of multidimensional data arrays and the K-Means algorithm for unsupervised learning
- Create more advanced regular expressions using grouping and named groups, negative lookaheads, escaped characters, whitespaces, character sets (and negative character sets), and greedy/nongreedy operators
- Understand a wide range of computer science topics, including anagrams, palindromes, supersets, permutations, factorials, prime numbers, Fibonacci numbers, obfuscation, searching, and algorithmic sorting
By the end of the book, you’ll know how to write Python at its most refined, and create concise, beautiful pieces of “Python art” in merely a single line.