How To Cut A String In Python?

Problem: Given a string, how do you split or cut it and extract only the characters you need?

In this article, we will discuss some interesting scenarios in which we split or cut a string and extract the portion we need. Let us dive into each scenario and see how to cut the string according to its requirement.

✨ Scenario 1

Problem Formulation

Given the following string:

s = 'http://www.example.com/?s=something&two=20'

Requirement:

You have to cut the string such that everything from the & onwards in the given string s is dropped, i.e., the output string should be as follows:

s = 'http://www.example.com/?s=something'

◈ Method 1: Using split() Method

split() is a built-in string method in Python that cuts a given string into a list of substrings based on a given separator. You can specify any separator you need; by default, the string is split on whitespace.

Syntax:

string.split(separator, maxsplit)

  • separator is an optional parameter that specifies the delimiter. By default, any run of whitespace acts as the separator.
  • maxsplit is an optional parameter that specifies the maximum number of splits to perform. Its default value is -1, which means “all occurrences”.
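To make the two parameters concrete, here is a minimal sketch. The extra three=30 parameter in the URL is an assumption added only to show the effect of maxsplit; it is not part of the original example.

```python
# Illustrative URL; the 'three=30' part is made up for this sketch
s = 'http://www.example.com/?s=something&two=20&three=30'

# Default maxsplit (-1): split at every '&'
all_parts = s.split('&')
print(all_parts)
# ['http://www.example.com/?s=something', 'two=20', 'three=30']

# maxsplit=1: split only at the first '&'
first_split = s.split('&', 1)
print(first_split)
# ['http://www.example.com/?s=something', 'two=20&three=30']

# No separator given: split on runs of whitespace
words = 'cut   a  string'.split()
print(words)
# ['cut', 'a', 'string']
```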

The Solution: You can call the split() method with & as the separator and then take the first element of the list that split() returns. Let us have a look at how this can be implemented in the following piece of code:

s = 'http://www.example.com/?s=something&two=20'
print(s.split('&')[0])

Output:

http://www.example.com/?s=something

◈ Method 2: Using rfind() Method And Slicing The String

We need to extract the portion of the string before the & character. Therefore, a simple workaround is to find the index of the & character with the rfind() method and then slice the string up to that index.

Note: The rfind() method returns the index of the last occurrence of the specified value, or -1 if the value is not found.

The Solution

s = 'http://www.example.com/?s=something&two=20'
print(s[:s.rfind('&')])

Output:

http://www.example.com/?s=something
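One caveat with this approach: if the string contains no & at all, rfind() returns -1, and slicing with -1 silently drops the last character. If the string contains several & characters, the cut happens at the last one. The sketch below illustrates both cases; the helper name cut_before_last and the extra three=30 parameter are hypothetical and only used for illustration.

```python
def cut_before_last(text, sep):
    """Return text up to the last occurrence of sep, or text unchanged."""
    pos = text.rfind(sep)
    return text if pos == -1 else text[:pos]

no_amp = 'http://www.example.com/?s=something'

# rfind() returns -1 here, so the slice becomes no_amp[:-1]
naive = no_amp[:no_amp.rfind('&')]
print(naive)
# http://www.example.com/?s=somethin

# The guarded version leaves the string untouched
safe = cut_before_last(no_amp, '&')
print(safe)
# http://www.example.com/?s=something

# With several '&' characters, the cut happens at the last one
multi = 'http://www.example.com/?s=something&two=20&three=30'
last_cut = cut_before_last(multi, '&')
print(last_cut)
# http://www.example.com/?s=something&two=20
```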

◈ Method 3: Using index() Method

Another simple approach to cut the given string is to slice it using the index() method. The index(value) method returns the index of the first occurrence of the value argument and raises a ValueError if the value is not found. Let us have a look at how to use index(value) to split our string.

s = 'http://www.example.com/?s=something&two=20'
print(s[:s.index('&')])

Output:

http://www.example.com/?s=something
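Because index() raises a ValueError when the delimiter is missing, you may want to handle that case explicitly. The following is a minimal sketch of two options, not the only way to do it; the helper name cut_before_first is hypothetical. The second option uses the built-in str.partition() method, which returns the whole string as its first element when the separator is absent.

```python
def cut_before_first(text, sep):
    """Return text up to the first occurrence of sep, or text unchanged."""
    try:
        return text[:text.index(sep)]
    except ValueError:
        # sep does not occur in text
        return text

with_amp = 'http://www.example.com/?s=something&two=20'
without_amp = 'http://www.example.com/?s=something'

a = cut_before_first(with_amp, '&')
b = cut_before_first(without_amp, '&')
print(a)
# http://www.example.com/?s=something
print(b)
# http://www.example.com/?s=something

# Alternative: partition() never raises and splits at the first '&'
c = with_amp.partition('&')[0]
d = without_amp.partition('&')[0]
print(c)
# http://www.example.com/?s=something
print(d)
# http://www.example.com/?s=something
```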

In this scenario, the task of cutting the string was quite simple since there was a single delimiter and all we had to do was separate the string based on the delimiter &. What if you want to cut the string at more than one kind of character or sequence? That brings us to the next scenario!

✨ Scenario 2

Problem Formulation

Given a string consisting of numbers, letters, and special characters, how do you split it wherever a special character or a number occurs?

Example

string = "Finxter$#! Academy Python111Freelancing"

Desired Output

['Finxter', 'Academy', 'Python', 'Freelancing']

◈ Method 1: Using re.split

The re.split(pattern, string) function matches all occurrences of the pattern in the string and divides the string along the matches, resulting in a list of the strings between the matches. For example, re.split('a', 'bbabbbab') results in the list of strings ['bb', 'bbb', 'b'].

The Solution

import re

s = "Finxter$#! Academy Python111Freelancing"
res = re.split(r'\d+|\W+', s)  # raw string so \d and \W reach the regex engine unchanged
print(res)

Output:

['Finxter', 'Academy', 'Python', 'Freelancing']

Note:

  • The \d special sequence matches any decimal digit, such as 0 to 9.
  • \W is a special sequence that matches any non-word character, i.e., any character that is not a letter, a digit, or an underscore. Here it is used to find the delimiters while splitting the string.
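One edge case worth knowing: when the string starts or ends with a delimiter, re.split() returns empty strings at those positions. The following is a minimal sketch of that behavior and one way to filter the result; the input string is made up for illustration.

```python
import re

# Hypothetical input that begins and ends with delimiters
s = "!Finxter42Academy99"

parts = re.split(r'\d+|\W+', s)
print(parts)
# ['', 'Finxter', 'Academy', '']

# Drop the empty strings left over at the edges
words = [p for p in parts if p]
print(words)
# ['Finxter', 'Academy']
```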

In case you want to store the separators as well, please have a look at this tutorial, which answers your question in detail.
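As a quick sketch of one way to keep the separators: if the pattern is wrapped in a capturing group, re.split() also returns the text matched by each separator. This is standard behavior of re.split(), shown here on the same example string.

```python
import re

s = "Finxter$#! Academy Python111Freelancing"

# The parentheses form a capturing group, so re.split() also
# returns the text matched by each separator
res = re.split(r'(\d+|\W+)', s)
print(res)
# ['Finxter', '$#! ', 'Academy', ' ', 'Python', '111', 'Freelancing']
```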

◈ Method 2: Using itertools.groupby()

  • The itertools.groupby(iterable, key=None) function returns an iterator of (key, group-iterator) tuples, one for each run of consecutive items that share the same key value. We use the str.isalpha() function as the key function.
  • The str.isalpha() function returns True if the string is non-empty and consists only of alphabetic characters.

The Solution

from itertools import groupby

s = "Finxter$#! Academy Python111Freelancing"
# Group consecutive characters by whether they are alphabetic
# and keep only the alphabetic runs
res = [''.join(g) for is_alpha, g in groupby(s, str.isalpha) if is_alpha]
print(res)

Output:

['Finxter', 'Academy', 'Python', 'Freelancing']
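To see why the non-alphabetic groups have to be filtered out, it can help to print every group together with its key. This is just an inspection sketch on the same example string:

```python
from itertools import groupby

s = "Finxter$#! Academy Python111Freelancing"

# Each tuple holds the key (result of str.isalpha for that run)
# and the run of consecutive characters sharing that key
groups = [(key, ''.join(g)) for key, g in groupby(s, str.isalpha)]
for key, text in groups:
    print(key, repr(text))
# True 'Finxter'
# False '$#! '
# True 'Academy'
# False ' '
# True 'Python'
# False '111'
# True 'Freelancing'
```

Only the groups whose key is True contain the words we are after.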

✨ Scenario 3

If you are specifically dealing with URLs, you should use the built-in libraries that are designed for URLs, such as urllib.parse.

Example: You want to remove two=20 from the query string given below:

s = 'http://www.example.com/?s=something&two=20'

Desired Output:

http://www.example.com/?s=something

Solution

  • Step 1: Parse the entire URL.
  • Step 2: Extract the query string.
  • Step 3: Convert it to a Python dictionary.
  • Step 4: Remove the key ‘two’ from the dictionary.
  • Step 5: Put it back into the query string.
  • Step 6: Stitch the URL back together.

Let us have a look at the following program, which demonstrates the process explained in the steps above. (Please follow the comments in the code!)

import urllib.parse

# Step 1: Parse the entire URL
parse_result = urllib.parse.urlsplit("http://www.example.com/?s=something&two=20")
# Step 2: Extract the query string
query_s = parse_result.query
# Step 3: Convert it to a Python dictionary (each value is a list)
query_d = urllib.parse.parse_qs(query_s)
# Step 4: Remove the 'two' key from the dictionary
del query_d['two']
# Step 5: Put it back into the query string (doseq=True expands the list values)
new_query_s = urllib.parse.urlencode(query_d, doseq=True)
# Step 6: Stitch the URL back together
result = urllib.parse.urlunsplit((
    parse_result.scheme, parse_result.netloc,
    parse_result.path, new_query_s, parse_result.fragment))
print(result)

Output:

http://www.example.com/?s=something

The advantage of this procedure is that you have more control over the URL. For example, the two argument is removed no matter where it appears in the query string, so the code still works if it comes first ("two=20&s=something").
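To check that claim, the six steps can be wrapped in a small function and run against both orderings of the query string. This is a sketch under stated assumptions: the function name remove_query_param is hypothetical, and it uses pop() with a default instead of del so that a missing key does not raise a KeyError.

```python
import urllib.parse

def remove_query_param(url, name):
    """Return url with the query parameter `name` removed (hypothetical helper)."""
    parts = urllib.parse.urlsplit(url)
    query_d = urllib.parse.parse_qs(parts.query)
    # pop() with a default does nothing if the key is absent
    query_d.pop(name, None)
    new_query = urllib.parse.urlencode(query_d, doseq=True)
    return urllib.parse.urlunsplit(
        (parts.scheme, parts.netloc, parts.path, new_query, parts.fragment))

last = remove_query_param("http://www.example.com/?s=something&two=20", "two")
first = remove_query_param("http://www.example.com/?two=20&s=something", "two")
absent = remove_query_param("http://www.example.com/?s=something", "two")

print(last)
# http://www.example.com/?s=something
print(first)
# http://www.example.com/?s=something
print(absent)
# http://www.example.com/?s=something
```

Note that parse_qs() drops parameters with blank values by default, so a URL containing such parameters would not round-trip unchanged with this sketch.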

Conclusion

In this article, you have learned some important concepts regarding splitting a string in Python. Select the procedure that suits your requirements and implement it as demonstrated in the scenarios above. This brings us to the end of this article; please stay tuned and subscribe for more solutions and interesting discussions.
