How To Split A String And Keep The Separators?

Summary: To split a string and keep the delimiters/separators you can use one of the following methods:

  • Use the re module and re.split() with the \W special sequence.
  • Use the re module and re.split() with a negated character set such as [^a-zA-Z0-9].
  • Use the re module and re.split() with the either-or metacharacter |.
  • Use a list comprehension and re-insert the separator.
  • Split at line breaks with splitlines().

Problem Formulation

Problem: Given a string in Python, how do you split it and also keep the separators/delimiters?

A sequence of one or more characters used to separate two or more parts of a given string or a data stream is known as a delimiter or a separator.

Example: Consider the string shown below. You need to split it so that the separators/delimiters are stored in a list along with the word characters.

text = 'finxter,practice@Python*1%every day'
somemethod(text)

Desired Output:

['finxter', ',', 'practice', '@', 'Python', '*', '1', '%', 'every', ' ', 'day']
fig: The Blue Boxes represent the word characters/strings while the Yellow Boxes represent the delimiters/separators.

Now that we have an overview of our problem, let us dive into the solutions without any delay!

Using Regular Expressions (RegEx)

A concise way to split the string and keep the separators is to use regular expressions along with the re.split() function.

  • re.split() is a function in Python's built-in re module that splits a string at every match of a regular expression.

Let us have a look at the different regular expressions that can be used to solve our problem:

Method 1: Using ‘(\W)’

One way to split the given string and keep the delimiters is to import the re module and call its split() function with the \W special sequence wrapped in parentheses.

import re

text = 'finxter,practice@Python*1%every day'
print(re.split(r'(\W)', text))

Output

['finxter', ',', 'practice', '@', 'Python', '*', '1', '%', 'every', ' ', 'day']

Let us examine and discuss the expression used here:

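  • () is a capturing group. When the pattern passed to re.split() contains a capturing group, the text matched by the group (here, the separator) is also included in the resulting list.
  • \W is a special sequence that matches any single character that is not a word character, i.e. anything other than a letter, a digit, or the underscore. Here it is used to find the delimiters while splitting the string.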
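To make the role of the parentheses concrete, here is a small sketch that runs the same split with and without the capturing group. It also shows two side effects worth knowing: joining the pieces gives back the original string, and two adjacent separators produce an empty string between them. The variable names are only illustrative.

```python
import re

text = 'finxter,practice@Python*1%every day'

# Without a capturing group, the separators are discarded.
without_group = re.split(r'\W', text)
# With a capturing group, each separator becomes its own list item.
with_group = re.split(r'(\W)', text)

print(without_group)
print(with_group)

# Because nothing is thrown away, joining the pieces restores the input.
print(''.join(with_group) == text)

# Two separators in a row leave an empty string between them.
adjacent = re.split(r'(\W)', 'a, b')
print(adjacent)

# Optional clean-up: drop the empty strings.
cleaned = [part for part in adjacent if part]
print(cleaned)
```

If the empty strings are unwanted, filtering them out as in the last step is usually enough.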

Method 2: Using [^] Set

Another way of splitting the string using regex is to use the split() function along with the negated character set [^a-zA-Z0-9] wrapped in a capturing group.

Let us have a look at the following example to see how this works:

import re

text = 'finxter,practice@Python*1%every day'
print(re.split(r'([^a-zA-Z0-9])', text))

Output

['finxter', ',', 'practice', '@', 'Python', '*', '1', '%', 'every', ' ', 'day']

Let us examine the expression used here:

  • () is a capturing group that keeps the separators in the result along with the word characters.
  • [] defines a set of characters, any one of which may match.
  • [^a-zA-Z0-9] matches any single character EXCEPT letters (both uppercase and lowercase) and digits, i.e. it finds a delimiter/separator. In this case, the set is used to find each delimiter and split the string into word characters accordingly.
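The two patterns give the same result on the example string, but they are not interchangeable. The sketch below uses a made-up sample string to show the difference: \W treats the underscore and non-ASCII letters as word characters, while [^a-zA-Z0-9] treats both as separators.

```python
import re

# Hypothetical sample: contains an underscore and a non-ASCII letter
# ('\u00ef' is the letter i with a diaeresis).
sample = 'snake_case,na\u00efve'

# \W: underscore and accented letters count as word characters.
by_non_word = re.split(r'(\W)', sample)
# [^a-zA-Z0-9]: everything outside ASCII letters and digits is a separator.
by_negated_set = re.split(r'([^a-zA-Z0-9])', sample)

print(by_non_word)
print(by_negated_set)
```

Which behaviour is preferable depends on whether identifiers such as snake_case should stay in one piece.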

Method 3: Using Either Or (|) Metacharacter To Specify The Delimiters

Another approach to solving our problem is to split the string using the split() function along with the either-or metacharacter | to provide/specify multiple delimiters within the string according to which we want to split the string. A metacharacter is used to convey a special meaning to a regular expression.

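In our case, the delimiters are joined with the | character inside a capturing group: (,|@|%| |\*). The * has to be escaped as \* because it is a regex metacharacter itself. Note that | only means either-or outside square brackets; inside a character set such as [,|@] it is a literal pipe character.

Let us have a look at the following program to see how the either-or metacharacter works:

import re

text = 'finxter,practice@Python*1%every day'
print(re.split(r'(,|@|%| |\*)', text))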

Output

['finxter', ',', 'practice', '@', 'Python', '*', '1', '%', 'every', ' ', 'day']
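When the delimiters are only known at run time, the either-or pattern can be assembled from a list rather than typed by hand. The following is a minimal sketch, assuming a hypothetical list named delimiters; re.escape() takes care of characters such as * that have a special meaning in a regex.

```python
import re

text = 'finxter,practice@Python*1%every day'

# Hypothetical list of delimiters; adjust it to your data.
delimiters = [',', '@', '*', '%', ' ']

# Escape each delimiter, join them with |, and wrap the result in a
# capturing group so that re.split() keeps the separators.
pattern = '(' + '|'.join(re.escape(d) for d in delimiters) + ')'

parts = re.split(pattern, text)
print(parts)
```

If some delimiters are longer than one character and share a prefix, listing the longer ones first avoids the shorter alternative matching too early.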

Now let us try a few methods which do not use regular expressions.

Note: Two other methods deserve a special mention in our list of solutions. They are not exact solutions to our problem statement, but they can be handy in other scenarios, depending on the requirement.

Let us discuss these methods!

Disclaimer: The following methods assume that the string has a single type of separator between the words.

Method 4: Using a List Comprehension And Appending The Separator

Consider a string that has a single type of separator, for example:

ip = '192.168.10.32'

To split this string we can use a list comprehension as shown in the snippet below:

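ip = '192.168.10.32'
# Emit every part followed by the separator.
res = [u for x in ip.split('.') for u in (x, '.')]
# Drop the extra separator that was added after the last part.
res.pop()
print(res)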

Output

['192', '.', '168', '.', '10', '.', '32']
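The same idea can be wrapped in a small reusable function. The sketch below is one possible alternative, not part of the method above: split_keep is a hypothetical helper name, and it uses str.partition(), which returns the text before the separator, the separator itself, and the rest of the string.

```python
def split_keep(text, sep):
    """Split text at every occurrence of sep and keep sep in the result."""
    parts = []
    head, found, rest = text.partition(sep)
    # partition() returns an empty string for `found` once sep no longer occurs.
    while found:
        parts.extend([head, found])
        head, found, rest = rest.partition(sep)
    # Whatever remains after the last separator is the final part.
    parts.append(head)
    return parts


print(split_keep('192.168.10.32', '.'))
print(split_keep('a--b', '--'))
```

This version needs no pop() call and also works with separators longer than one character.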

Method 5: Split Using Line Break: splitlines()

If the separator is a line break, we can use the splitlines() string method, which splits the given string at line boundaries. By default the line breaks are dropped; passing True as the first argument (keepends) keeps them at the end of each line.

Let us have a look at the following example to see how the splitlines() function works:

text = """1. This is the first line.
2. This is the second line.
3. This is the third line."""
# If the first argument (keepends) is True, each line keeps its line break at the end.
print(text.splitlines(True))

Output

['1. This is the first line.\n', '2. This is the second line.\n', '3. This is the third line.']
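For comparison, here is a short sketch of how splitlines() behaves with and without keepends on a made-up string that mixes \n and \r\n line endings, next to a plain split('\n').

```python
# Hypothetical sample with mixed line endings.
text = 'first line\nsecond line\r\nthird line'

# Default: the line breaks are removed.
without_breaks = text.splitlines()
# keepends=True: each line keeps its own line break.
with_breaks = text.splitlines(keepends=True)
# split('\n') only knows about '\n', so the '\r' stays attached.
plain_split = text.split('\n')

print(without_breaks)
print(with_breaks)
print(plain_split)
```

Unlike the regex methods, the line break stays attached to the end of its line instead of becoming a separate list item.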

Conclusion

In this article, we discussed various methods to split a string and keep the separators/delimiters along with the word characters. I highly recommend reading our blog tutorial on regular expressions if you want to master Python regular expressions.

I hope you enjoyed this article and it helps you in your Python coding journey. Please subscribe and stay tuned for more interesting articles!

Recommended Read: Python | Split String and Keep Whitespace

