How To Split A String And Keep The Separators?

Summary: To split a string and keep the delimiters/separators you can use one of the following methods:

  • Use the re module and re.split() with the \W special sequence in a capturing group.
  • Use the re module and re.split() with a negated character set [^a-zA-Z0-9] in a capturing group.
  • Use the re module and re.split() with the either-or metacharacter |.
  • Use a list comprehension and append the separator.
  • Split at line breaks with splitlines().

Let’s dive into the problem in a step-by-step manner!

Problem: Given a string in Python, how do you split it and also keep the separators/delimiters?

A sequence of one or more characters used to separate two or more parts of a given string or a data stream is known as a delimiter or a separator.

Example: Suppose you are given the string below and need to split it so that the separators/delimiters are stored in a list along with the words.

text = 'finxter,practice@Python*1%every day'
somemethod(text)

Desired Output:

['finxter', ',', 'practice', '@', 'Python', '*', '1', '%', 'every', ' ', 'day']
Figure: The blue boxes represent the words/strings, while the yellow boxes represent the delimiters/separators.

Now that we have an overview of our problem, let us dive into the solutions without any delay!

Using Regular Expressions (RegEx)

A concise way to split the string and extract the words along with the separators is to use regular expressions with the re.split() function.

  • re.split(pattern, string) is a function in Python's re module that splits a string at every match of a regular expression. If the pattern contains a capturing group, the matched text is also returned in the result list. You can learn more about the split() function by following this article.

Let us have a look at the different regular expressions that can be used to solve our problem:

Method 1: Using ‘(\W)’

One way to split the given string and keep the delimiters is to import the re module and call re.split() with the \W special sequence wrapped in parentheses.

import re

text = 'finxter,practice@Python*1%every day'
print(re.split(r'(\W)', text))

Output

['finxter', ',', 'practice', '@', 'Python', '*', '1', '%', 'every', ' ', 'day']

Let us examine and discuss the expression used here:

  • () is a capturing group. When the pattern passed to re.split() contains one, the matched separators/delimiters are kept in the result list along with the words.
  • \W is a special sequence that matches any single non-word character, i.e. any character that is not a letter, a digit or an underscore. Here it is used to find the delimiters while splitting the string.
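
To make the role of the parentheses concrete, here is a minimal sketch that runs the same split with and without the capturing group. The variable names are chosen for this illustration only. It also shows one edge case worth knowing: two adjacent separators produce an empty string between them.

```python
import re

text = 'finxter,practice@Python*1%every day'

# Without a capturing group the separators are discarded.
without_group = re.split(r'\W', text)

# With a capturing group the separators are kept as their own list items.
with_group = re.split(r'(\W)', text)

# Two adjacent separators yield an empty string between them.
adjacent = re.split(r'(\W)', 'a,,b')

print(without_group)
print(with_group)
print(adjacent)
```

If the empty strings are unwanted, they can be filtered out afterwards, for example with a list comprehension.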

Method 2: Using [^] Set

Another way of splitting the string using regex is to call re.split() with the negated character set [^a-zA-Z0-9] wrapped in a capturing group.

Let us have a look at the following example to see how this works:

import re

text = 'finxter,practice@Python*1%every day'
print(re.split('([^a-zA-Z0-9])', text))

Output

['finxter', ',', 'practice', '@', 'Python', '*', '1', '%', 'every', ' ', 'day']

Let us examine the expression used here:

  • () is a capturing group that keeps the separators in the result along with the words.
  • [] defines a character set that matches any single character listed in it.
  • [^a-zA-Z0-9] matches any single character EXCEPT letters (both upper and lower case) and digits, i.e. it is used to find a delimiter/separator. Unlike \W, it also treats the underscore as a delimiter.
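
The two patterns are not interchangeable for every input. The following sketch, using a made-up sample string, shows the difference for an underscore: \W treats it as a word character, while [^a-zA-Z0-9] treats it as a delimiter.

```python
import re

# Hypothetical sample containing an underscore and a comma.
sample = 'snake_case,value'

# \W does not match '_', so 'snake_case' stays in one piece.
by_nonword = re.split(r'(\W)', sample)

# The negated set matches '_', so the string is also split there.
by_negated_set = re.split(r'([^a-zA-Z0-9])', sample)

print(by_nonword)
print(by_negated_set)
```

Which behavior is preferable depends on whether underscores (and, for the negated set, non-ASCII letters) should count as part of a word in your data.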

Method 3: Using Either Or (|) Metacharacter To Specify The Delimiters

Another approach is to split the string with re.split() and the either-or metacharacter | to list the delimiters at which we want to split. A metacharacter is a character that has a special meaning in a regular expression.

In our case the delimiters joined with | are ,|@|\*|%| (the last alternative is a space). The * is escaped because it is itself a metacharacter. Note that | means either-or only outside square brackets; inside a character set such as [,|@|%| |*] it is a literal character, so that pattern would also split at every | in the text.

Let us have a look at the following program to see how the either-or metacharacter works:

import re

text = 'finxter,practice@Python*1%every day'
print(re.split(r'(,|@|\*|%| )', text))

Output

['finxter', ',', 'practice', '@', 'Python', '*', '1', '%', 'every', ' ', 'day']
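
When the delimiters come from a list rather than being typed by hand, the alternation can be assembled programmatically. The sketch below is one possible way to do it; the delimiters list is an assumption made for this example. It relies on re.escape() so that characters with a special meaning in a regex, such as *, are matched literally.

```python
import re

text = 'finxter,practice@Python*1%every day'

# Hypothetical list of delimiters chosen for this example.
delimiters = [',', '@', '*', '%', ' ']

# Escape each delimiter, join them with '|' and wrap the result
# in a capturing group so that re.split() keeps the separators.
pattern = '(' + '|'.join(re.escape(d) for d in delimiters) + ')'
parts = re.split(pattern, text)

print(parts)
```

The same approach also works for delimiters that are longer than one character, which a character set cannot express.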

Now let us try a few methods which do not use regular expressions.

Note

Two other methods deserve a special mention. Although they are not exact solutions to our problem statement, they can be handy in other scenarios.

Let us discuss these methods:

Disclaimer: The following examples assume a single type of separator between the words.

Method 4: Using a List Comprehension And Appending The Separator

Consider a string with a single type of separator, for example:

ip = '192.168.10.32'

To split this string we can use a list comprehension as a one-line solution. Note that it emits the separator after every piece, including the last one, so the result ends with an extra '.':

ip = '192.168.10.32'
print([u for x in ip.split('.') for u in (x, '.')])

Output

['192', '.', '168', '.', '10', '.', '32', '.']
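
If the separator should appear only between the pieces, so that joining the list reproduces the original string, a small helper can do the interleaving. This is a sketch, and split_keep is a hypothetical name introduced here, not a built-in function.

```python
def split_keep(s, sep):
    """Split s on sep and keep each separator as its own list item."""
    result = []
    for i, piece in enumerate(s.split(sep)):
        if i > 0:
            # The separator goes between pieces, never after the last one.
            result.append(sep)
        result.append(piece)
    return result


ip = '192.168.10.32'
parts = split_keep(ip, '.')

print(parts)
print(''.join(parts) == ip)
```

Because the separator is a plain string here, no regex escaping is needed, even for characters such as '.' that are special in a regular expression.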

Method 5: Split Using Line Break: splitlines()

If the separator is a line break, we can use the splitlines() string method, which splits the given string at line boundaries.

Let us have a look at the following example to see how the splitlines() function works:

text = """1. This is the first line.
2. This is the second line.
3. This is the third line."""
# If the first argument (keepends) is True, each line in the result keeps its trailing line break.
print(text.splitlines(True))

Output

['1. This is the first line.\n', '2. This is the second line.\n', '3. This is the third line.']
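
As a further illustration, the sketch below uses a made-up string with mixed line endings. It shows that splitlines() recognizes both \r\n and \n, and that with keepends=True the pieces can be joined back into the original string.

```python
# Hypothetical sample with a Windows-style and a Unix-style line ending.
text = "first\r\nsecond\nthird"

# keepends=True keeps each line break attached to its line.
kept = text.splitlines(keepends=True)

# By default the line breaks are dropped.
dropped = text.splitlines()

print(kept)
print(dropped)
print(''.join(kept) == text)
```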

Conclusion

In this article, we discussed several ways to split a string and keep the separators/delimiters along with the words. I highly recommend reading our Blog Tutorial if you want to master Python regular expressions.

I hope you enjoyed this article and it helps you in your Python coding journey. Please subscribe and stay tuned for more interesting articles!
