# Python Split String by Uppercase (Capital) Letters

Summary: You can use different functions of the regular expressions library to split a given string by uppercase letters. Another approach is to use a list comprehension.

Minimal Example

```
import re

# Given String
text = "AbcLmnZxy"

# Method 1
print(re.findall('[A-Z][^A-Z]*', text))
# OUTPUT: ['Abc', 'Lmn', 'Zxy']

# Method 2
print(re.split('(?<=.)(?=[A-Z])', text))
# OUTPUT: ['Abc', 'Lmn', 'Zxy']

# Method 3
print(re.sub(r"([A-Z])", r" \1", text).split())
# OUTPUT: ['Abc', 'Lmn', 'Zxy']

# Method 4
pos = [i for i, e in enumerate(text+'A') if e.isupper()]
print([text[pos[j]:pos[j+1]] for j in range(len(pos)-1)])
# OUTPUT: ['Abc', 'Lmn', 'Zxy']

# Method 5
print("".join([(" "+i if i.isupper() else i) for i in text]).strip().split())
# OUTPUT: ['Abc', 'Lmn', 'Zxy']
```

## Problem Formulation

📜Problem: Given a string containing uppercase and lowercase letters, how will you split the string on every occurrence of an uppercase letter?

Example

```
# Input
text = "UpperCaseSplitString"
# Output
['Upper', 'Case', 'Split', 'String']
```

In the above example, the string is split every time an uppercase character occurs, and the resulting substrings are stored in a list.

Let’s dive into the solutions to the given problem.

## Method 1: Using re.findall

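Approach: Use `re.findall('[A-Z][^A-Z]*', text)` to split the string whenever an uppercase letter appears. The expression `[A-Z][^A-Z]*` matches one uppercase letter followed by zero or more characters that are not uppercase letters, so each match runs up to, but does not include, the next uppercase letter. All matches are returned in a list. Note that `[A-Z]` covers only ASCII capitals, and any characters before the first uppercase letter are not matched.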

Code:

```
import re

text = "UpperCaseSplitString"
res = re.findall('[A-Z][^A-Z]*', text)
print(res)

# OUTPUT: ['Upper', 'Case', 'Split', 'String']
```

Note: The `re.findall(pattern, string)` method scans `string` from left to right, searching for all non-overlapping matches of the `pattern`. It returns a list of strings in the matching order when scanning the string from left to right.
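As a small illustration of the pattern's behavior, the sketch below shows what happens when the input does not start with an uppercase letter. The helper `split_keep_prefix` and its extended pattern are hypothetical additions for this example, not part of the original solution.

```python
import re

# Hypothetical helper for illustration; not part of the original solution.
def split_keep_prefix(text):
    # First alternative: an uppercase letter plus the non-uppercase run after it.
    # Second alternative: a run of non-uppercase characters, which can only
    # match at the start of the string.
    return re.findall('[A-Z][^A-Z]*|[^A-Z]+', text)

# The original pattern silently drops the leading lowercase run.
dropped = re.findall('[A-Z][^A-Z]*', 'helloWorldFoo')
print(dropped)  # ['World', 'Foo']

# The extended pattern keeps it.
kept = split_keep_prefix('helloWorldFoo')
print(kept)  # ['hello', 'World', 'Foo']

# For input that starts with a capital, both patterns agree.
same = split_keep_prefix('AbcLmnZxy')
print(same)  # ['Abc', 'Lmn', 'Zxy']
```

Whether the leading lowercase run should be kept depends on your use case. The problem as stated assumes the string begins with an uppercase letter.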

## Method 2: Using re.split

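Approach: Once again, you can use the `re` module, this time calling `re.split` with the expression `'(?<=.)(?=[A-Z])'`. The lookahead `(?=[A-Z])` matches the empty position just before an uppercase letter, and the lookbehind `(?<=.)` requires at least one character before that position, so the string is not split at its very start. Splitting on such empty matches requires Python 3.7 or later.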

Note: The `re.split(pattern, string)` method matches all occurrences of the `pattern` in the `string` and divides the string along the matches resulting in a list of strings between the matches. For example, `re.split('a', 'bbabbbab')` results in the list of strings `['bb', 'bbb', 'b']`.

Code:

```
import re

text = "UpperCaseSplitString"
res = re.split('(?<=.)(?=[A-Z])', text)
print(res)

# OUTPUT: ['Upper', 'Case', 'Split', 'String']
```
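To see what each part of the pattern contributes, here is a minimal sketch that compares the lookahead alone with the full expression. It assumes Python 3.7 or later, where `re.split` can split on empty matches. The sample strings are chosen only for illustration.

```python
import re

text = "UpperCase"

# Lookahead only: the empty match at index 0 produces a leading empty string.
lookahead_only = re.split('(?=[A-Z])', text)
print(lookahead_only)  # ['', 'Upper', 'Case']

# With the lookbehind, index 0 no longer matches, so no empty string appears.
full_pattern = re.split('(?<=.)(?=[A-Z])', text)
print(full_pattern)  # ['Upper', 'Case']

# Consecutive capitals are split one letter at a time.
acronym = re.split('(?<=.)(?=[A-Z])', 'parseHTML')
print(acronym)  # ['parse', 'H', 'T', 'M', 'L']
```

Unlike the `re.findall` approach, this one keeps a leading lowercase run such as `'parse'`, because the split only removes the (empty) separators.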

## Method 3: Using re.sub

Approach: Yet another function of the `re` module that allows you to split the string based on the occurrence of an uppercase letter is `re.sub`. The idea here is to insert a space before every uppercase letter and then do a normal split on the string using the `split()` method.

Note: The regex function `re.sub(P, R, S)` replaces all occurrences of the pattern `P` with the replacement `R` in string `S`. It returns a new string. For example, if you call `re.sub('a', 'b', 'aabb')`, the result will be the new string `'bbbb'` with all characters `'a'` replaced by `'b'`.

Code:

```
import re

text = "UpperCaseSplitString"
res = re.sub(r"([A-Z])", r" \1", text).split()
print(res)

# OUTPUT: ['Upper', 'Case', 'Split', 'String']
```
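It can help to look at the intermediate string that `re.sub` produces before `split()` is applied. The following sketch prints it, and also shows one side effect of this approach: whitespace already present in the input acts as a split point too. The second input string is a made-up example.

```python
import re

text = "UpperCaseSplitString"

# Insert a space before each uppercase letter; \1 refers to the captured letter.
spaced = re.sub(r"([A-Z])", r" \1", text)
print(repr(spaced))  # ' Upper Case Split String'

# split() without arguments splits on whitespace and ignores the leading space.
parts = spaced.split()
print(parts)  # ['Upper', 'Case', 'Split', 'String']

# Existing whitespace in the input also separates substrings.
mixed = re.sub(r"([A-Z])", r" \1", "Hello worldFoo").split()
print(mixed)  # ['Hello', 'world', 'Foo']
```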

## Method 4: Using List Comprehension

Prerequisite:

List comprehension is a compact way of creating lists. The simple formula is `[expression + context]`.

• Expression: What to do with each list element?
• Context: What elements to select? The context consists of an arbitrary number of for and if statements.

The example `[x for x in range(3)]` creates the list `[0, 1, 2]`.
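As a quick sketch of how expression and context work together, the examples below add an `if` clause to the context. The variable names are illustrative only.

```python
# Expression: x * x; context: every x in range(6) that is even.
squares_of_even = [x * x for x in range(6) if x % 2 == 0]
print(squares_of_even)  # [0, 4, 16]

# The same shape applied to a string: indices of the uppercase letters.
upper_positions = [i for i, ch in enumerate("AbcLmnZxy") if ch.isupper()]
print(upper_positions)  # [0, 3, 6]
```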

Approach: The idea here is to use a couple of list comprehensions. The first list comprehension finds and stores the position of each capital letter in the given string. Appending `'A'` to the text adds a sentinel position at the end, so the last substring also has an end index. These positions are then used in a second list comprehension to slice out the substrings.

Code:

```
text = "UpperCaseSplitString"
pos = [i for i, e in enumerate(text+'A') if e.isupper()]
parts = [text[pos[j]:pos[j+1]] for j in range(len(pos)-1)]
print(parts)

# OUTPUT: ['Upper', 'Case', 'Split', 'String']
```
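To make the role of the position list clearer, this sketch prints it and then pairs neighboring positions with `zip` instead of indexing with `j`. The `zip` variant is an alternative shown for illustration, not the code used above.

```python
text = "UpperCaseSplitString"

# Positions of the capitals, plus the sentinel index len(text) from the extra 'A'.
pos = [i for i, e in enumerate(text + 'A') if e.isupper()]
print(pos)  # [0, 5, 9, 14, 20]

# Pair each start position with the next one and slice between them.
parts = [text[start:end] for start, end in zip(pos, pos[1:])]
print(parts)  # ['Upper', 'Case', 'Split', 'String']
```

As with `re.findall`, any characters before the first uppercase letter are not part of any slice and are therefore dropped.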

## Method 5: Using join+strip+split

Here’s another way of using a list comprehension to split the string on every occurrence of an uppercase character.

Code:

```
text = "UpperCaseSplitString"
res = "".join([(" "+i if i.isupper() else i) for i in text]).strip().split()
print(res)
# OUTPUT: ['Upper', 'Case', 'Split', 'String']
```

The above code can be better understood with the help of a multiline solution shown below:

```
# Given String
text = "UpperCaseSplitString"
# resultant list
res = []
# Iterate through the text
for i in text:
    # Add a space before the letter if it is in uppercase
    if i.isupper():
        res.append(" " + i)
    else:
        res.append(i)
# Convert the resultant list to a string
res = ''.join(res)  # ' Upper Case Split String'
print(res.strip().split())

# OUTPUT: ['Upper', 'Case', 'Split', 'String']
```

• The `string.join(iterable)` method concatenates all the string elements in the iterable (such as a list, string, or tuple) and returns the result as a new string. The string on which you call it is the delimiter string, and it separates the individual elements. For example, `'-'.join(['hello', 'world'])` returns the joined string `'hello-world'`.
• `strip()` is a built-in string method in Python that trims whitespace on the left and right and returns a new string.
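The three string methods used in this solution can be tried in isolation. This is a minimal sketch with made-up sample values.

```python
# join: the string it is called on is placed between the elements.
joined = '-'.join(['hello', 'world'])
print(joined)  # hello-world

# strip: removes leading and trailing whitespace.
trimmed = '  Upper Case  '.strip()
print(repr(trimmed))  # 'Upper Case'

# split without arguments splits on runs of whitespace and
# ignores whitespace at both ends of the string.
words = ' Upper Case'.split()
print(words)  # ['Upper', 'Case']
```

Because `split()` without arguments already ignores leading and trailing whitespace, the `strip()` call in the solution above is optional.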