Summary: There are 3 ways of splitting the given string by tab:
    import re

    num = "123\t45\t\t6\t789"

    # Method 1
    print(num.split())
    # OUTPUT: ['123', '45', '6', '789']

    # Method 2
    print(re.split(r'\t+', num))
    # OUTPUT: ['123', '45', '6', '789']

    # Method 3
    print(re.compile("[^\t]+").findall(num))
    # OUTPUT: ['123', '45', '6', '789']
📜Problem: Given a string, how will you split it by tab?
Let us visualize the problem with the help of an example:
    # input
    text = "abc\t\txy\tcda\t\tmnop"

    # Expected output
    ['abc', 'xy', 'cda', 'mnop']
Now that we have an overview of the problem, let us dive into the solutions.
Method 1: Using split()
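Approach: When you split a string without passing any delimiter, any run of whitespace is treated as a delimiter by default. You can use this to your advantage and simply split the given string without passing any delimiter to the split() function. Keep in mind that this also splits on spaces and newlines, so it suits strings whose fields contain no other whitespace.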
    text = "abc\t\txy\tcda\t\tmnop"
    print(text.split())
    # ['abc', 'xy', 'cda', 'mnop']
🌎Related Read: Python String split()
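Because split() without an argument treats any whitespace as a delimiter, it also breaks apart fields that contain spaces. The following is a minimal sketch of that difference; the `record` string is a hypothetical sample, not taken from the example above.

```python
# Hypothetical sample: tab-separated fields, one of which contains a space.
record = "New York\t\tParis\tRio"

# No argument: splits on ANY whitespace run, so "New York" is broken apart.
by_whitespace = record.split()
print(by_whitespace)  # ['New', 'York', 'Paris', 'Rio']

# Explicit tab delimiter: keeps the space, but consecutive tabs yield ''.
by_tab = record.split('\t')
print(by_tab)  # ['New York', '', 'Paris', 'Rio']

# Dropping the empty strings gives a tab-only split.
fields = [part for part in by_tab if part]
print(fields)  # ['New York', 'Paris', 'Rio']
```

If the fields can never contain spaces or newlines, the plain split() call remains the shortest option.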
Method 2: Using re.split()
The re.split(pattern, string, maxsplit=0, flags=0) function returns a list of strings by matching all occurrences of the pattern in the string and splitting the string at those matches.
🌎Read More: Python Regex Split
Approach: Use Python's re module and call its split() function, which takes two required arguments. The first argument is the pattern that you want to match while splitting. In this case, it is a simple tab, so use the expression \t+, which matches one or more consecutive tabs. The second argument is the given string itself. That's it!
    import re

    text = "abc\t\txy\tcda\t\tmnop"
    print(re.split(r'\t+', text))
    # ['abc', 'xy', 'cda', 'mnop']
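Two details of re.split() are easy to miss: the optional maxsplit argument, and the empty strings that appear when the string starts or ends with the delimiter. The sketch below illustrates both; the `padded` string is a hypothetical input introduced only for this illustration.

```python
import re

text = "abc\t\txy\tcda\t\tmnop"

# maxsplit=1 stops after the first split; the rest of the string is kept whole.
first_and_rest = re.split(r'\t+', text, maxsplit=1)
print(first_and_rest)  # ['abc', 'xy\tcda\t\tmnop']

# A leading or trailing tab still produces an empty string at that end.
padded = "\tabc\t\txy\t"
with_empties = re.split(r'\t+', padded)
print(with_empties)  # ['', 'abc', 'xy', '']

# Stripping tabs from both ends first avoids those empty strings.
cleaned = re.split(r'\t+', padded.strip('\t'))
print(cleaned)  # ['abc', 'xy']
```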
Method 3: Using re.compile()
re.compile(pattern) returns a regular expression object built from the pattern. This object provides the basic regex methods, such as pattern.findall(string). If you match the same pattern multiple times, the explicit two-step approach of (1) compiling and (2) searching is more efficient than calling, say, re.search(pattern, string) each time, because it avoids redundant compilations of the same pattern. Here, the pattern [^\t]+ matches every run of one or more characters that are not tabs, so findall() returns exactly the fields between the tabs.
🌎Read More: Python Regex Compile
    import re

    text = "abc\t\txy\tcda\t\tmnop"
    print(re.compile("[^\t]+").findall(text))
    # ['abc', 'xy', 'cda', 'mnop']
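Compiling pays off mainly when the same pattern is applied to many strings. As a rough sketch of that usage, the example below compiles the pattern once and reuses it; the `rows` list is a hypothetical stand-in for lines read from a tab-separated file.

```python
import re

# Compile once: one or more characters that are not a tab.
non_tab = re.compile(r"[^\t]+")

# Hypothetical rows, e.g. lines read from a tab-separated file.
rows = ["abc\t\txy\tcda", "\tRed\tBlack\t", "one\ttwo\t\tthree"]

# Reuse the same compiled pattern object for every row.
parsed = [non_tab.findall(row) for row in rows]
for fields in parsed:
    print(fields)
# ['abc', 'xy', 'cda']
# ['Red', 'Black']
# ['one', 'two', 'three']
```

Since the pattern requires at least one non-tab character per match, findall() never returns empty strings, even for rows that start or end with a tab.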
Given a string that contains tabs at the start, in the middle, and at the end, how will you split it using a tab as the delimiter? Note that the resulting list must not contain empty strings.
Challenge: Consider the code given below. The output contains empty strings. Can you eliminate the empty strings from the list?
    # Given
    colours = '\tRed\tBlack\tYellow\tBlue\t'
    print(colours.split('\t'))

    # Output
    ['', 'Red', 'Black', 'Yellow', 'Blue', '']

    # Expected Output
    ['Red', 'Black', 'Yellow', 'Blue']
The filter() function can be used to remove the empty strings from the list. Pass None as the first argument and the list of split strings as the second argument. With None in place of a function, filter() keeps only the truthy elements, so the empty strings are dropped. Because filter() returns a filter object (an iterator), wrap the call in list() to convert the result into a list that can be printed in a human-readable form.
    colours = '\tRed\tBlack\tYellow\tBlue\t'
    res = list(filter(None, colours.split('\t')))
    print(res)
    # ['Red', 'Black', 'Yellow', 'Blue']
Note: Python's built-in filter() function keeps only the elements that pass a filtering condition. It takes two arguments: function and iterable. The function assigns a Boolean value to each element in the iterable to decide whether the element passes the filter or not. filter() returns an iterator over the elements that pass the filtering condition.
🌎Read More: Python filter()
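To make the behaviour of filter(None, ...) more concrete, here is a small sketch that compares it with two equivalent spellings. The lambda and the list comprehension are alternatives added for illustration and are not part of the solution above.

```python
colours = '\tRed\tBlack\tYellow\tBlue\t'
parts = colours.split('\t')

# filter(None, ...) keeps only truthy elements; '' is falsy and is dropped.
with_none = list(filter(None, parts))

# The same condition spelled out with an explicit function.
with_lambda = list(filter(lambda s: s != '', parts))

# An equivalent list comprehension.
with_comprehension = [s for s in parts if s]

print(with_none)  # ['Red', 'Black', 'Yellow', 'Blue']
print(with_none == with_lambda == with_comprehension)  # True
```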
Hurrah! We have successfully solved the given problem using as many as three different ways. I hope you enjoyed this article and it helps you in your Python coding journey. Please subscribe and stay tuned for more interesting articles!