When mentally approaching this problem, you may be tempted to utilize a “for loop”. I admit that’s how my mind was working: iterate through the list one element at a time and check for empty strings. If it’s empty, remove it. Repeat.
Please join me in today’s topic on how to remove empty strings from a list of strings. We’ll see what happens when we modify a list in a “for loop”. Next, we’ll discuss a “while loop” solution. And lastly, we’ll go over some clever one-liners thanks to Python’s built-in features.
Method 1: For Loop
What happens if we use a for loop?
As mentioned earlier, my first instinct is to iterate through the list and check whether the string at the current index is empty. The next step is to simply remove the empty string. Two options we have in Python are the remove() method, where you specify the value, and the pop() method, where you specify the index.
When deciding on which loop to use, my instinct went straight to the “for loop”. This is because we want to repeat the empty string check for each element in the entire length of the list, which can easily be defined as follows:
>>> words = ["The", "", "quick", "", "brown", "", "fox", ""]
>>>
>>> for i in range(len(words)):
...     if words[i] == "":
...         words.pop(i)
...
However, running the above code produces the following output:
''
''
''
''
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
IndexError: list index out of range
>>>
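So what is actually happening here? It turns out it is not a good idea to remove elements from a list in a “for loop” because the indices will change! The range(len(words)) call is evaluated once, while the list still has 8 elements, but every removal shortens the list and shifts the remaining elements to lower indices.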
Here’s an illustration:
Index 0 | Index 1 | Index 2 | Index 3 | Index 4 | Index 5 | Index 6 | Index 7 |
“The” | “” | “quick” | “” | “brown” | “” | “fox” | “” |
By inspecting the above list we can see that we need to remove indices 1, 3, 5, and 7. We will simulate our “for loop”.
First iteration: i is 0, words[0] is "The". It does not equal "". List is unchanged.
Second iteration: i is 1, words[1] is "". It equals "", so we remove it. Here’s the modified list:
Index 0 | Index 1 | Index 2 | Index 3 | Index 4 | Index 5 | Index 6 |
“The” | “quick” | “” | “brown” | “” | “fox” | “” |
Third iteration: i is 2, words[2] is "". It equals "", so we remove it. Here’s the modified list:
Index 0 | Index 1 | Index 2 | Index 3 | Index 4 | Index 5 |
“The” | “quick” | “brown” | “” | “fox” | “” |
Fourth iteration: i is 3, words[3] is "", so we remove it. Here’s the modified list:
Index 0 | Index 1 | Index 2 | Index 3 | Index 4 |
“The” | “quick” | “brown” | “fox” | “” |
Fifth iteration: i is 4, words[4] is "", so we remove it. Here’s the modified list:
Index 0 | Index 1 | Index 2 | Index 3 |
“The” | “quick” | “brown” | “fox” |
We can already see that we’ve removed all the empty strings. However, we still haven’t finished iterating: our “for loop” was defined to run over the length of the words list, which was originally 8!
Sixth iteration: i is 5, words[5] is out of range, and we get the IndexError.
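If you would rather keep the index-based loop, one common workaround is to walk the indices from the end of the list towards the start, so that a pop() only shifts elements that have already been checked. Here is a minimal sketch; the function name remove_empty_backwards is my own and purely illustrative.

```python
# Hypothetical helper, not part of the original walkthrough.
def remove_empty_backwards(words):
    """Remove empty strings in place, visiting indices from last to first."""
    # range(len(words) - 1, -1, -1) yields len-1, len-2, ..., 0.
    # Popping index i only moves elements at indices greater than i,
    # and those have already been visited.
    for i in range(len(words) - 1, -1, -1):
        if words[i] == "":
            words.pop(i)
    return words


words = ["The", "", "quick", "", "brown", "", "fox", ""]
remove_empty_backwards(words)
print(words)  # ['The', 'quick', 'brown', 'fox']
```

Because the indices still to be visited are always smaller than the one just removed, the loop never reaches past the end of the shrinking list.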
Here’s another variation of the “for loop” where we instead use the remove method to remove the first occurrence in the list.
>>> words = ["The", "", "", "quick", "", "", "brown", "", "fox", ""]
>>> for i in words:
...     if i == "":
...         words.remove(i)
...
>>> print(words)
['The', 'quick', 'brown', '', 'fox', '']
>>>
As seen above, the code executes without an IndexError. After completing the “for loop” and printing the results, we can see the words list still contains two empty strings.
Let’s step through each iteration. Square brackets mark the element at the current iterator position.
[“The”] | “” | “” | “quick” | “” | “” | “brown” | “” | “fox” | “” |
1st iteration: i is "The", which does not equal "". List is unchanged, iterator advances.
2nd iteration: i is "".
“The” | [“”] | “” | “quick” | “” | “” | “brown” | “” | “fox” | “” |
It equals "", so we call the remove method. Note the next empty string is now at the current iterator position.
“The” | [“”] | “quick” | “” | “” | “brown” | “” | “fox” | “” |
However, the iterator must advance to the next element, so that empty string is skipped.
“The” | “” | [“quick”] | “” | “” | “brown” | “” | “fox” | “” |
3rd iteration: i is "quick", which does not equal "". List is unchanged, iterator advances.
“The” | “” | “quick” | [“”] | “” | “brown” | “” | “fox” | “” |
4th iteration: i is "". It equals "", so we call the remove method. Note the empty string at index 1 is the one being removed. This shifts the next empty string to the current iterator position.
“The” | “quick” | “” | [“”] | “brown” | “” | “fox” | “” |
The iterator advances.
“The” | “quick” | “” | “” | [“brown”] | “” | “fox” | “” |
5th iteration: i is "brown", which does not equal "". List is unchanged, iterator advances.
“The” | “quick” | “” | “” | “brown” | [“”] | “fox” | “” |
6th iteration: i is "", so we call the remove method. Note the empty string at index 2 is the one being removed, which causes the element at the current iterator position to be "fox".
“The” | “quick” | “” | “brown” | “” | [“fox”] | “” |
The iterator advances.
“The” | “quick” | “” | “brown” | “” | “fox” | [“”] |
Since the iterator is now at the last element of the list, this will be the last comparison.
It equals "", so we call the remove method. Note the empty string at index 2 is removed.
The final list:
“The” | “quick” | “brown” | “” | “fox” | “” |
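To double-check the walkthrough above, here is a small sketch that records which elements the loop actually visits. The visited list and the use of enumerate() are my own additions for illustration; the loop body is otherwise the same as before.

```python
words = ["The", "", "", "quick", "", "", "brown", "", "fox", ""]
visited = []  # hypothetical bookkeeping: (position, value) pairs the loop saw

for position, word in enumerate(words):
    visited.append((position, word))
    if word == "":
        # remove() deletes the FIRST "" in the list, which may sit before
        # the current position and shift every later element one slot left.
        words.remove(word)

for position, word in visited:
    print(position, repr(word))

print(words)  # ['The', 'quick', 'brown', '', 'fox', '']
```

The loop makes only seven comparisons for a list that started with ten elements, which is why two empty strings survive.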
One workaround that still uses a “for loop” is to copy the non-empty strings into a new list. Here’s an example:
>>> words = ["The", "", "", "quick", "", "", "brown", "", "fox", ""]
>>> new_words = []
>>> for i in words:
...     if i != "":
...         new_words.append(i)
...
>>> print(new_words)
['The', 'quick', 'brown', 'fox']
>>>
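A related sketch, assuming you want to keep modifying the original list rather than building a new one: loop over a shallow copy (words[:]) and remove from the original. The copy never changes while the loop runs, so no element is skipped.

```python
words = ["The", "", "", "quick", "", "", "brown", "", "fox", ""]

# words[:] creates a shallow copy, so the loop iterates over a list
# that stays the same length while the original shrinks.
for word in words[:]:
    if word == "":
        words.remove(word)  # removes the first "" still left in the original

print(words)  # ['The', 'quick', 'brown', 'fox']
```

Keep in mind that every remove() call scans the list from the start, so for long lists the new-list approach above is likely the cheaper choice.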
Before we discuss the one-line solutions, here’s a clever way to solve it using 2 lines with a “while loop”.
>>> words = ["The", "", "", "quick", "", "", "brown", "", "fox", ""]
>>> while "" in words:
...     words.remove("")
...
>>> print(words)
['The', 'quick', 'brown', 'fox']
>>>
As written above, the Python keyword “in” is used for the condition: as long as there’s an empty string in the words list, we call the remove method on the list. As specified earlier, the remove method removes the first occurrence in the list.
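The loop above searches the list twice per pass: once for the “in” check and once inside remove(). As a hedged alternative, here is a sketch that relies on the fact that list.remove() raises a ValueError when the value is not found, and uses that as the stopping condition.

```python
words = ["The", "", "", "quick", "", "", "brown", "", "fox", ""]

while True:
    try:
        words.remove("")  # removes the first remaining empty string
    except ValueError:
        # remove() raises ValueError once no "" is left in the list.
        break

print(words)  # ['The', 'quick', 'brown', 'fox']
```

Whether this reads better than the two-line version is a matter of taste; it trades brevity for one list scan per pass instead of two.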
Some Elegant Alternatives
Have a peek at these alternate solutions and see if you can find ways to fit them into your code. Oh, and if you consider yourself an up-and-coming Pythonista and have been striving to base your coding life on the Zen of Python, then these solutions will suit you. As you will soon see, these methods align well with the Python philosophy. If you’re not yet familiar with The Zen of Python by Tim Peters, then I invite you to run the following in the Python interpreter:
>>> import this
This is the output:
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>>
Method 2: The filter() function
Python’s built-in filter function uses the following format: filter(function, iterable).
For the second parameter, which needs to be iterable, we will pass in our words list. We can use a lambda function for the first parameter. One possible lambda definition is to specify strings that are not empty. (I’ll mention a couple of alternatives later.)
lambda x: x != ""
Note: according to the Python docs the filter function “constructs an iterator”. Let’s print the result to see what that means.
>>> words = ["The", "", "quick", "", "brown", "", "fox", ""]
>>> print(filter(lambda x: x != "", words))
<filter object at 0x7fd5b6a970d0>
The above shows that the contents of the filtered list are not actually printed, and we are left with a filter object. In order to actually see the results, we need to convert it to a list object.
>>> words = ["The", "", "quick", "", "brown", "", "fox", ""]
>>> print(list(filter(lambda x: x != "", words)))
['The', 'quick', 'brown', 'fox']
>>>
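One practical consequence of filter returning an iterator is that the result can only be consumed once. The sketch below illustrates this; the variable names non_empty, first_pass, and second_pass are mine.

```python
words = ["The", "", "quick", "", "brown", "", "fox", ""]

non_empty = filter(lambda x: x != "", words)  # nothing is evaluated yet

first_pass = list(non_empty)   # consumes the iterator
second_pass = list(non_empty)  # the iterator is already exhausted

print(first_pass)   # ['The', 'quick', 'brown', 'fox']
print(second_pass)  # []
```

If you need the filtered values more than once, convert to a list right away and reuse the list.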
And if the above lambda expression wasn’t something you instinctively thought of or not as elegant as you’d like, then perhaps these other solutions are more up your alley.
How about defining lambda to check for strings that have a length?
lambda x: len(x)
>>> words = ["The", "", "quick", "", "brown", "", "fox", ""]
>>> print(list(filter(lambda x: len(x), words)))
['The', 'quick', 'brown', 'fox']
>>>
As long as a string has a nonzero length, len(x) returns a truthy value and the string remains in the list. An empty string has a length of 0, which is falsy, so it gets filtered out.
Perhaps this last one is the most elegant, but I’ll leave it with you to decide. Notice we replace the function with the Python keyword None.
>>> words = ["The", "", "quick", "", "brown", "", "fox", ""]
>>> print(list(filter(None, words)))
['The', 'quick', 'brown', 'fox']
>>>
Referring to the Python docs: “If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.” Since an empty string is considered to be false in Python, it will be filtered out.
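One thing to watch for: filter(None, ...) drops every falsy element, not only empty strings. The sketch below uses a made-up mixed list (not from the examples above) to compare it with the explicit lambda.

```python
# Hypothetical input mixing strings with other falsy values.
mixed = ["The", "", 0, None, "quick", " ", [], "fox"]

# filter(None, ...) keeps only truthy elements, so 0, None and [] go too.
only_truthy = list(filter(None, mixed))

# The explicit comparison removes empty strings and nothing else.
only_non_empty = list(filter(lambda x: x != "", mixed))

print(only_truthy)     # ['The', 'quick', ' ', 'fox']
print(only_non_empty)  # ['The', 0, None, 'quick', ' ', [], 'fox']
```

For a list that contains only strings the two behave the same. Note also that a string holding a single space is not empty, so both versions keep it.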
Method 3: List Comprehension
Another Python one-liner I invite you to explore is list comprehension. From the Python docs: “A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses”.
Let’s apply that to our list of strings and inspect the list comprehension I defined below.
[i for i in words if i != ""]
The variable i iterates through the words list. As long as it isn’t an empty string, it is added to the new list. We simply assign the list comprehension to a variable, here called new_words.
Here’s the full code snippet.
>>> words = ["The", "", "quick", "", "brown", "", "fox", ""]
>>> new_words = [i for i in words if i != ""]
>>> print(new_words)
['The', 'quick', 'brown', 'fox']
>>>
An alternative for the if clause above is to check that i has a nonzero length.
>>> words = ["The", "", "quick", "", "brown", "", "fox", ""]
>>> new_words = [i for i in words if len(i)]
>>> print(new_words)
['The', 'quick', 'brown', 'fox']
>>>
And that’s how we remove empty strings with list comprehension.
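The list comprehension builds a new list and leaves words untouched. If other parts of your code hold a reference to the same list and should see the change, one option is slice assignment, which replaces the contents of the existing list object. This is a minimal sketch; the name alias is only there to demonstrate the shared reference.

```python
words = ["The", "", "quick", "", "brown", "", "fox", ""]
alias = words  # a second name bound to the same list object

# The right-hand side is built completely first, then its items
# replace the contents of the existing list in place.
words[:] = [w for w in words if w != ""]

print(words)           # ['The', 'quick', 'brown', 'fox']
print(alias is words)  # True
print(alias)           # ['The', 'quick', 'brown', 'fox']
```

With a plain assignment (words = [...]) the name alias would still point at the old list, empty strings included.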
Summary
I certainly hope you enjoyed reading about some Python one-line solutions to removing empty strings from a list of strings. We explored the filter function — keep in mind it will return a filter object, so when you are working with a list be sure to convert the filtered result back into a list. And the other approach we looked at was with Python’s list comprehension solution. Equally clean and clever. I will leave it with you to decide on which method you prefer to use in your next coding project — maybe use both!
Additionally, I hope you are now fully aware of what happens when using a “for loop” to remove elements from a list. As explained above, you may get lucky and receive an IndexError. But be careful with other situations where you do not receive an error and your code still executes. In our example, the “for loop” completed and left two empty strings in the list!
Lastly, I would like to encourage you to read over The Zen of Python if you haven’t done so already. May it serve as an additional inspiration to code the Python way. And before you know it, you’ll soon find yourself creating beautiful code.