Python | Split String Between Characters

⭐Summary: The easiest way to split a string between two characters or substrings is to slice the string and extract the part in between. Another effective way is to use the re.search method from Python's re (regular expression) module.

Minimal Example

# Given
text = "abcdXYZlmn"
left = "abcd"
right = "lmn"

# Method 1.1
import re
result = re.search('%s(.*)%s' % (left, right), text).group(1)
print(result)

# Method 1.2
print(re.findall(re.escape(left) + "(.*)" + re.escape(right), text)[0])

# Method 2: Slice and Extract
# Method 2.1
print(text[len(left):-len(right)])

# Method 2.2
start_index = text.find(left) + len(left)
end_index = text.rfind(right)
print(text[start_index:end_index])

# Method 3
print((text.split(left))[1].split(right)[0])

# Method 4
rest = text.partition(left)[2]
result = rest.partition(right)[0]
print(result)

# Method 5
text = text.replace(left, "*")
text = text.replace(right, "*")
res = text.split("*")
print(res[1])

# Output: XYZ

Problem Formulation

📜Problem: Given a string, how will you split the string between characters?

Example: Let’s visualize the problem with the help of an example.

You are given a string “Learn Python 3.9 from scratch” and two substrings “Learn” and “from scratch”. You need to split the string between the two substrings and extract the substring “Python 3.9” from the given string.

# input
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"

# Expected Output 
Python 3.9

Method 1: Using re.search

Approach: Use the re.search() method from the re module to split the string between characters. You need to extract the substring “Python 3.9” using the search method. To do this, use the pattern %s(.*)%s, which captures all the characters that lie between the given left and right substrings. The method returns a match object.

You can then call the group() method on the match object. Calling group(0) returns the entire match, while group(1) returns the text captured by the first pair of parentheses, which is the required substring between the left and right substrings. Note that the captured text includes the spaces around “Python 3.9”; call strip() on the result if you do not want them.

Code:

import re
# Given
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"

result = re.search('%s(.*)%s' % (left, right), text).group(1)
print(result)

# Python 3.9
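To make the role of group() concrete, here is a small sketch with the same inputs. It prints each group with repr() so the surrounding spaces are visible; the final strip() call is an extra step added here for illustration and is not part of the original solution.

```python
import re

text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"

match = re.search('%s(.*)%s' % (left, right), text)

# group(0) is the whole match, group(1) is the first captured group.
print(repr(match.group(0)))  # 'Learn Python 3.9 from scratch'
print(repr(match.group(1)))  # ' Python 3.9 '
print(match.groups())        # (' Python 3.9 ',)

# Optional: drop the spaces next to the delimiters.
inner = match.group(1).strip()
print(inner)  # Python 3.9
```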

Note: The re.search(pattern, string) method matches the first occurrence of the pattern in the string and returns a match object, or None if the pattern does not match. The method takes up to three arguments. The first argument is the regular expression pattern that you want to match. The second argument is the string that you want to search, and the last argument is the optional flags.
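Two details are easy to miss with re.search(): it returns None when nothing matches, and delimiters that contain regex metacharacters such as "(" or "." need escaping. The helper below is a hypothetical wrapper (the name between is made up for this sketch) that handles both and passes the optional flags through.

```python
import re

def between(text, left, right, flags=0):
    """Return the text between left and right, or None if there is no match."""
    # re.escape() makes characters such as "(" or "." in the delimiters literal.
    pattern = re.escape(left) + "(.*)" + re.escape(right)
    match = re.search(pattern, text, flags)
    return match.group(1) if match else None

print(between("price: (42) USD", "(", ")"))                          # 42
print(between("LEARN Python FROM", "learn", "from", re.IGNORECASE))  # prints ' Python ' with its spaces
print(between("no delimiters here", "<", ">"))                       # None
```

Without the None check, calling .group(1) on a failed search raises an AttributeError.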

🌏Related Read: Python Regex Search

Alternate Formulation: Use re.findall

Approach: Another way of using the regular expressions module to solve the given problem is to use the re.findall() method as shown below.

Code:

import re
# Given
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"
print(re.findall(re.escape(left) + "(.*)" + re.escape(right), text)[0])

# Python 3.9

Note: The re.findall(pattern, string) method scans the string from left to right, searching for all non-overlapping matches of the pattern. It returns a list of strings in the order in which they are found. If the pattern contains exactly one capturing group, as it does here, the list holds the text captured by that group.

🌏Related Read: Python re.findall() – Everything You Need to Know
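When the delimiters occur more than once, the difference between the greedy (.*) and the non-greedy (.*?) quantifier matters. The following sketch uses a made-up input string to show both; the non-greedy variant is not used in the article's solution and is shown only for comparison.

```python
import re

text = "<b>one</b> and <b>two</b>"
left = "<b>"
right = "</b>"

# Greedy: .* runs to the LAST occurrence of the right delimiter.
greedy = re.findall(re.escape(left) + "(.*)" + re.escape(right), text)

# Non-greedy: .*? stops at the FIRST right delimiter after each left one.
lazy = re.findall(re.escape(left) + "(.*?)" + re.escape(right), text)

print(greedy)  # ['one</b> and <b>two']
print(lazy)    # ['one', 'two']
```

If nothing matches, re.findall() returns an empty list, so indexing the result with [0] raises an IndexError.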

Method 2: Slice and Extract

String slicing means carving out a substring from a given string. Use the slicing notation s[start:stop:step] to access every step-th element, starting from index start (included) and ending at index stop (excluded).

Approach: Use the find() method to get the index at which the left substring starts in the given text. Add the length of the left substring to this index and store the result in a variable named start_index. Then use the rfind() method to get the highest index at which the right substring starts and store it in another variable, say end_index. Finally, slice the given string from start_index to end_index to carve out the substring between the left and right substrings.
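As a quick illustration of the slicing notation, here are a few slices of the example string. The index values are chosen by hand for this particular string.

```python
s = "Learn Python 3.9 from scratch"

first_word = s[:5]     # start defaults to 0
second_word = s[6:12]  # index 6 is included, index 12 is excluded
last_word = s[-7:]     # negative start counts from the right
every_other = s[::2]   # step of 2 keeps every second character

print(first_word)   # Learn
print(second_word)  # Python
print(last_word)    # scratch
print(every_other)
```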

Simply put, you are finding the indices between which the required substring is present and then using string slicing to extract the required substring.

Code:

# Given
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"

start_index = text.find(left) + len(left)
end_index = text.rfind(right)
print(text[start_index:end_index])

# Python 3.9 

Note: The find() method returns the index of the first occurrence of the specified substring. The rfind() method returns the highest index in the string where a substring gets found. It returns -1 if not found.
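Because find() and rfind() return -1 instead of raising an error, a missing delimiter silently produces a wrong slice. The function below is a hypothetical defensive variant (the name slice_between is an assumption of this sketch) that checks both indices before slicing.

```python
def slice_between(text, left, right):
    """Return the text between the first left and the last right, or None."""
    left_pos = text.find(left)
    if left_pos == -1:
        return None  # left delimiter is missing
    start_index = left_pos + len(left)
    end_index = text.rfind(right)
    # The right delimiter must exist and must not start before left ends.
    if end_index == -1 or end_index < start_index:
        return None
    return text[start_index:end_index]

# The first call prints ' Python 3.9 ' including the surrounding spaces.
print(slice_between("Learn Python 3.9 from scratch", "Learn", "from scratch"))
print(slice_between("Learn Python 3.9", "Learn", "from scratch"))         # None
print(slice_between("Python 3.9 from scratch", "Learn", "from scratch"))  # None
```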

🌏Related Read:
(i) String Slicing in Python
(ii) Python String find()
(iii) Python String rfind()

Discussion: Another way to formulate the solution in one line is to use negative indices in the slice. This works only when the string starts with the left substring and ends with the right substring, as it does here.

You can use negative indices as start or stop arguments of the string-slicing operation. In this case, Python starts counting from the right. For instance, the negative index -1 points to the last character in the string, the index -2 points to the second last, and so on.

Code:

# Given
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"

print(text[len(left):-len(right)])
# Python 3.9 
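The one-line slice only looks at the lengths of the delimiters, not at where they occur. A minimal sketch of a guarded version follows; the function name strip_ends is hypothetical, and it assumes that right is not empty.

```python
def strip_ends(text, left, right):
    """Return text without its left prefix and right suffix, or None."""
    # Assumes right is non-empty: -len("") is 0 and would give an empty slice.
    if right and text.startswith(left) and text.endswith(right):
        return text[len(left):-len(right)]
    return None

text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"

print(repr(strip_ends(text, left, right)))  # ' Python 3.9 '

# With extra characters around the delimiters, the plain slice is wrong.
padded = ">> " + text + " <<"
print(repr(padded[len(left):-len(right)]))    # 'arn Python 3.9 fro'
print(strip_ends(padded, left, right))        # None
```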

Method 3: Using split()

Approach: First, split the text using the left substring as the separator. The resulting list contains an empty string and the part of the text after the left substring. Take the second element using its index and split it again, this time using the right substring as the separator. The resulting list contains the required substring followed by an empty string. To extract the required substring, take the first element using its index.

Code:

# Given
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"
print((text.split(left))[1].split(right)[0])

# Python 3.9 

Note: The split() method splits the string at a given separator (sep) and returns a list of the resulting substrings.
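Printing the intermediate lists makes the two split steps easier to follow. The second half of this sketch uses a made-up string in which the left substring occurs twice, to show what the optional maxsplit argument of split() changes.

```python
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"

after_left = text.split(left)
print(after_left)    # ['', ' Python 3.9 from scratch']

before_right = after_left[1].split(right)
print(before_right)  # [' Python 3.9 ', '']

# If left occurs more than once, element [1] is cut at the second occurrence.
repeated = "Learn Python, then Learn more from scratch"
unlimited = repeated.split(left)[1].split(right)[0]
limited = repeated.split(left, 1)[1].split(right, 1)[0]
print(repr(unlimited))  # ' Python, then '
print(repr(limited))    # ' Python, then Learn more '
```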

🌏Related Read: Python String split()

Method 4: Using partition()

The partition() method searches for the first occurrence of a separator substring and returns a tuple with three strings: (1) everything before the separator, (2) the separator itself, and (3) everything after it. If the separator is not found, the tuple contains the original string followed by two empty strings.

Approach: Call the partition() method with the left substring as the separator. Since the required substring lies after the separator, take the third element of the returned tuple (everything after the separator) and store it in a variable. Next, call partition() on that variable with the right substring as the separator. Now the required substring lies before the separator, so take the first element of the returned tuple (everything before the separator).

Code:

# Given
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"
rest = text.partition(left)[2]
result = rest.partition(right)[0]
print(result)

# Python 3.9 
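To see what partition() returns at each step, and how a missing separator shows up, consider the sketch below. The helper partition_between is a hypothetical name introduced here; it assumes both separators are non-empty, since partition() raises a ValueError for an empty separator.

```python
text = "Learn Python 3.9 from scratch"

print(text.partition("Learn"))         # ('', 'Learn', ' Python 3.9 from scratch')
print(text.partition("from scratch"))  # ('Learn Python 3.9 ', 'from scratch', '')
print(text.partition("missing"))       # ('Learn Python 3.9 from scratch', '', '')

def partition_between(text, left, right):
    """Return the text between left and right, or None if either is absent."""
    _, left_sep, rest = text.partition(left)
    inner, right_sep, _ = rest.partition(right)
    # The middle element is an empty string when the separator was not found.
    if not left_sep or not right_sep:
        return None
    return inner

print(repr(partition_between(text, "Learn", "from scratch")))  # ' Python 3.9 '
print(partition_between(text, "Learn", "missing"))             # None
```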

🌏Related Read: Python String partition()

Method 5: Using replace()

The replace method returns a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.

Approach: Use the replace() method to replace the left and right substrings in the text with a marker character such as “*”. This only works if the marker does not occur anywhere else in the text. Then call split() on the modified text with “*” as the delimiter. To extract the required substring, take the second element of the resulting list using its index.

Code:

# Given
text = "Learn Python 3.9 from scratch"
left = "Learn"
right = "from scratch"
text = text.replace(left, "*")
text = text.replace(right, "*")
res = text.split("*")
print(res[1])

# Python 3.9 
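The replace-and-split trick depends on the marker being absent from the text. The sketch below uses a made-up input that already contains "*" to show the failure, and then repeats the steps with a NUL character as the marker. The choice of "\x00" is an assumption of this example, not something the method requires.

```python
text = "Learn 2 * 3 from scratch"
left = "Learn"
right = "from scratch"

# "*" already occurs in the text, so the split produces too many pieces.
marked = text.replace(left, "*").replace(right, "*")
pieces = marked.split("*")
print(pieces)  # ['', ' 2 ', ' 3 ', '']

# Pick a marker that is verified to be absent, and replace only one occurrence each.
sentinel = "\x00"
assert sentinel not in text
safe = text.replace(left, sentinel, 1).replace(right, sentinel, 1)
result = safe.split(sentinel)[1]
print(repr(result))  # ' 2 * 3 '
```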

🌏Related Read: Python String replace()

Conclusion

Hurrah! We have successfully solved the given problem using as many as five different ways. I hope you enjoyed this article and it helps you in your Python coding journey. Please subscribe and stay tuned for more interesting articles!

