Python | Split String by Comma and Whitespace

Summary: To split a string by comma and whitespace, you can use a list comprehension that splits the given string by comma and then removes the surrounding whitespace from each substring using the strip() method.

Minimal Example:

text = "Tony,   Tom,   Tim"
print([x.strip() for x in text.split(',')])

# ['Tony', 'Tom', 'Tim']

Note that the problem comes in more than one scenario and can be solved in several other ways. Read on to explore these scenarios and their solutions.

Problem Formulation

Problem: Given a string (or sentence) containing commas and spaces between the words or substrings, how will you split it wherever a comma or a space appears?

Let’s understand the question with the help of an example.

Example 1

# Input:
text = "Deep Learning, Machine Learning, Artificial Intelligence"
# Output:
['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']

In the above example, the string has been split into a list of substrings such that each word in the string has been considered as an individual item in the list. Note that the words have been separated by a space or a comma.

Example 2

# Input:
text = "Mouse,   Cat,   Dog"
# Output:
['Mouse', 'Cat', 'Dog']

In the above example, the words are separated by a comma followed by multiple spaces, and each word becomes one item in the list.

Therefore, we have two scenarios: (i) spaces and commas appear separately in the given string, and both act as separators; (ii) a comma and spaces appear together between the substrings in the given string.


Without further ado, let us dive into the mission-critical question and solve the given problem.

Method 1: Using a List Comprehension

A quick one-line solution to the given problem can be formulated with the help of a list comprehension.

Solution to Example 1:

text = "Deep Learning, Machine Learning, Artificial Intelligence"
result = [ele for x in text.split(',') for ele in x.split()]
print(result)

# ['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']

Explanation: The idea is to first split the given string with the split() method, using a comma as the separator. Each item of the resulting list is then split on whitespace by calling split() without arguments, which yields the individual words. To understand how the list comprehension works, follow the equivalent multiline solution given below:

text = "Deep Learning, Machine Learning, Artificial Intelligence"
res = []
for ele in text.split(','):
    for i in ele.split():
        res.append(i)
print(res)
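
To make the two-stage split more concrete, the following sketch prints the intermediate list produced by the comma split before the whitespace split is applied. The variable names parts and words are illustrative and do not appear in the solution above.

```python
text = "Deep Learning, Machine Learning, Artificial Intelligence"

# Stage 1: split on commas. The space after each comma stays attached
# to the start of the following item.
parts = text.split(',')
print(parts)
# ['Deep Learning', ' Machine Learning', ' Artificial Intelligence']

# Stage 2: split() without arguments splits on runs of whitespace and
# ignores leading and trailing whitespace, so no empty strings appear.
words = [word for part in parts for word in part.split()]
print(words)
# ['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']
```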

Solution to Example 2: The solution to the second scenario is much easier, as you just have to split the string using a comma and then remove the whitespace around the returned substrings with the help of the strip method.

text = "Mouse,   Cat,   Dog"
result = [x.strip() for x in text.split(',')]
print(result)

# ['Mouse', 'Cat', 'Dog']

Method 2: Using split() and replace()

Solution to Example 1:

The idea here is to replace every space in the given string with a comma with the help of the replace method. After that, you simply split the string with the split() method, using a comma as the separator. Because each comma followed by a space turns into two consecutive commas, the split also produces empty strings, which the while loop then removes.

text = "Deep Learning, Machine Learning, Artificial Intelligence"
text = text.replace(' ', ',').split(',')
while '' in text:
    text.remove('')
print(text)

# ['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']
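
As a small sketch of why the cleanup step is needed, the snippet below shows the intermediate string and list. It also filters the empty strings with a list comprehension, which is one possible alternative to the while loop rather than the approach used above; the names replaced, pieces and words are illustrative.

```python
text = "Deep Learning, Machine Learning, Artificial Intelligence"

# Every space becomes a comma, so each ", " turns into ",,".
replaced = text.replace(' ', ',')
print(replaced)
# Deep,Learning,,Machine,Learning,,Artificial,Intelligence

# Splitting on ',' yields an empty string between each pair of
# consecutive commas.
pieces = replaced.split(',')
print(pieces)
# ['Deep', 'Learning', '', 'Machine', 'Learning', '', 'Artificial', 'Intelligence']

# Keep only the non-empty items (an empty string is falsy).
words = [piece for piece in pieces if piece]
print(words)
# ['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']
```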

Solution to Example 2:

In the second scenario, you can replace the spaces with an empty string. This eliminates all the spaces in the string, leaving only the commas. You can then split it with the split method, using a comma as the separator. Note that this works only when the items themselves contain no spaces.

text = "Mouse,   Cat,   Dog"
text = text.replace(' ', '').split(',')
print(text)

# ['Mouse', 'Cat', 'Dog']
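
One caveat is worth a quick sketch: removing every space also removes spaces inside the items. The input below is a made-up example (it is not one of the article's two examples) that compares this approach with the strip-based one from Method 1.

```python
# Hypothetical input whose items contain spaces themselves.
text = "New York,   Los Angeles,   San Francisco"

# replace(' ', '') deletes ALL spaces, including the ones inside items.
no_spaces = text.replace(' ', '').split(',')
print(no_spaces)
# ['NewYork', 'LosAngeles', 'SanFrancisco']

# strip() only removes whitespace at the ends of each item.
stripped = [item.strip() for item in text.split(',')]
print(stripped)
# ['New York', 'Los Angeles', 'San Francisco']
```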

Method 3: Using regex

Python's regular expressions module, re, gives you the power to solve tricky string problems concisely. Its sub() function replaces every match of a pattern in a string with a replacement and is called as re.sub(pattern, replacement, string).

Approach: Use the sub() function of the re module to replace all the whitespace characters in the given string with a comma, and then split the resulting string with the regular split method, using a comma as the separator.

Solution to Example 1:

Note that after you split the string and store the result in the list res, the list contains empty strings alongside the split substrings, because each comma followed by a space turns into two consecutive commas. To eliminate them from the resultant list, you can use a loop that removes the empty strings one by one.

import re
text = "Deep Learning, Machine Learning, Artificial Intelligence"
res = re.sub(r'\s', ',', text).split(',')
while '' in res:
    res.remove('')
print(res)

# ['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']

Solution to Example 2:

import re
text = "Mouse,   Cat,   Dog"
res = re.sub(r'\s', '', text).split(',')
print(res)

# ['Mouse', 'Cat', 'Dog']
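
As an alternative sketch (not the approach used above), the re module also provides re.split(), which can split on a pattern directly and so avoids the replace-then-split step. The two patterns below are assumptions chosen for the two example inputs: [,\s]+ treats any run of commas and whitespace as one separator, and \s*,\s* treats a comma with optional surrounding whitespace as the separator.

```python
import re

# Scenario 1: commas and spaces both act as separators.
text1 = "Deep Learning, Machine Learning, Artificial Intelligence"
words = re.split(r'[,\s]+', text1)
print(words)
# ['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']

# Scenario 2: only the comma separates items; surrounding whitespace
# is consumed as part of the separator.
text2 = "Mouse,   Cat,   Dog"
items = re.split(r'\s*,\s*', text2)
print(items)
# ['Mouse', 'Cat', 'Dog']
```

If the input can start or end with a separator, re.split() returns empty strings at the ends of the list, so you may want to call strip() on the input first or filter the result.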

Method 4: Using map, strip and split

This approach is pretty similar to the idea followed in Method 1. The only difference is that you use map() to apply the strip method to each comma-separated substring, which removes the surrounding spaces. The list() constructor then transforms the map object into a list. For Example 1, each stripped item is additionally split on whitespace to obtain the individual words.

Solution to Example 1:

text = "Deep Learning, Machine Learning, Artificial Intelligence"
li = list(map(str.strip, text.split(',')))
res = []
for i in li:
    for j in i.split():
        res.append(j)
print(res)

# ['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']
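
If you prefer to stay with map() for the whole of Example 1, one possible variation is to map str.split over the comma-separated parts and flatten the result with itertools.chain.from_iterable. This is a sketch of an alternative, not the solution shown above; since split() without arguments already ignores leading and trailing whitespace, the separate strip step is not needed here.

```python
from itertools import chain

text = "Deep Learning, Machine Learning, Artificial Intelligence"

# map(str.split, ...) turns each comma-separated part into a list of
# words; chain.from_iterable flattens those lists into one sequence.
words = list(chain.from_iterable(map(str.split, text.split(','))))
print(words)
# ['Deep', 'Learning', 'Machine', 'Learning', 'Artificial', 'Intelligence']
```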

Solution to Example 2:

text = "Mouse,   Cat,   Dog"
res = list(map(str.strip, text.split(',')))
print(res)

# ['Mouse', 'Cat', 'Dog']

Conclusion

I hope you enjoyed the numerous scenarios and challenges used in this tutorial to help you learn the different ways of splitting a string by a comma and whitespace. Please subscribe and stay tuned for more interesting tutorials and solutions.

🌎Related Reads:
👉 How to Create a List from a Comma-Separated String
👉 How to Remove a Comma from a String? 5 Best Ways

Happy coding! 🙂