Summary: One of the easiest ways to split a string after every n characters is to use a list comprehension that slices the string in steps of n and stores the resulting substrings in a list. The items of this list are the required substrings.
A quick look at the solution: [(given_string[i:i+n]) for i in range(0, len(given_string), n)]
Minimal Example
given_string = 'abcdef'
n = 3
print([(given_string[i:i+n]) for i in range(0, len(given_string), n)])
# OUTPUT: ['abc', 'def']
Problem Formulation
Problem: Given a string, how will you split it after every n characters?
Let’s visualize the problem with the help of an example:
Example: In the problem given below, you have to split the string after every 3 characters:
# Input:
s = "12345abcde6789fghi"
n = 3
# Output:
['123', '45a', 'bcd', 'e67', '89f', 'ghi']
Now that you have a clear picture of what the question asks, let us dive into the solutions.
Method 1: Using a List Comprehension
Prerequisite: To understand the solution given below, it is essential to know what a list comprehension does. Simply put, a list comprehension in Python is a compact way of creating lists. The simple formula is [expression + context], where the “expression” determines what to do with each list element and the “context” determines which elements to select. The context can consist of an arbitrary number of for and if statements.
To learn more about list comprehensions, read this article: “List Comprehension in Python - A Helpful Illustrated Guide”
Approach: Split the given string after every n characters using a list comprehension, such that the list comprehension returns a new list in which every item holds n consecutive characters of the given string (the last item may be shorter).
Code:
# Given text
s = "12345abcde6789fghi"
n = 3

# Using list comprehension
op = [(s[i:i + n]) for i in range(0, len(s), n)]

# Printing the output
print(op)
# ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
Let’s look at what the above code does by dissecting it into the expression and context part.
- Expression: The expression (s[i:i + n]) returns a sliced substring, that is, one of the pieces obtained by splitting the given string after every n characters.
- Context:
  - The context contains a for loop that iterates over a sequence of values from 0 up to (but not including) the length of the given string, where every value is a multiple of n (3 in this case). The range function determines this sequence. Note that the range function is called with a step size of n, which ensures that the loop variable advances by n positions in every iteration.
  - For example, in the above code, the context variable i is 0 in the first iteration, 3 in the second iteration, 6 in the third iteration, and so on until the entire length of the string has been traversed.
  - Finally, the list comprehension collects all the split substrings in a new list, which can then be displayed as the output.
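To make the role of the context concrete, the short sketch below prints the start indices that range produces and the slice taken at each one. The input string is a hypothetical example whose length is not a multiple of n; it shows that the final slice simply comes out shorter, because slicing past the end of a string does not raise an error.

```python
# Hypothetical input: 8 characters, so the last chunk is shorter than n
s = "abcdefgh"
n = 3

# range(0, len(s), n) yields the start index of every chunk
starts = list(range(0, len(s), n))
print(starts)  # [0, 3, 6]

# Each slice s[i:i + n] stops at the end of the string
# if fewer than n characters remain
chunks = [s[i:i + n] for i in starts]
print(chunks)  # ['abc', 'def', 'gh']
```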
Multi-line Solution:
The above code can also be written in a simpler form by using a regular for loop instead of a list comprehension. The loop iterates over the start index of every chunk, and you can store the split strings in a new list with the help of the append() method.
Code:
# Given text
text = "12345abcde6789fghi"
n = 3

# Empty list to store the resultant split strings
op = []

# For loop to cut the given string
for i in range(0, len(text), n):
    op.append(text[i:i + n])

# Printing the output
print(op)
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
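If you need this logic in more than one place, the loop can be wrapped in a small helper. The following is a minimal sketch; the name split_every and the check for n < 1 are assumptions added for illustration and are not part of the original solution. The check is there because range() raises a ValueError for a step of 0, while a negative step would silently produce an empty list.

```python
def split_every(text, n):
    """Return consecutive chunks of text, each at most n characters long."""
    # Assumption for this sketch: reject non-positive chunk sizes explicitly
    if n < 1:
        raise ValueError("n must be a positive integer")
    chunks = []
    for i in range(0, len(text), n):
        chunks.append(text[i:i + n])
    return chunks


print(split_every("12345abcde6789fghi", 3))
# ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
print(split_every("", 3))
# []
```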
Method 2: Using zip_longest from Itertools Module
The itertools module consists of different functions that return iterators. The zip_longest function is one function from the module that makes an iterator that aggregates elements from each of the iterables. The iteration continues until the longest iterable is exhausted.
Syntax:
zip_longest(*iterables, fillvalue=None)
The function takes two kinds of arguments:
- The fillvalue parameter is the value that gets filled in where the iterables are of uneven length.
- The iterables parameter denotes the sequences over which we want to iterate.
Code:
# Importing the function from the itertools module
from itertools import zip_longest

# Splitting string using zip_longest
def fun(n, i, fillvalue=None):
    # Groups the iterable into tuples of length n, for example:
    # fun(3, 'abcdefg', 'x') --> ('a', 'b', 'c') ('d', 'e', 'f') ('g', 'x', 'x')
    args = [iter(i)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# Given text
my_string = "12345abcde6789fghi"
n = 3

# Joining each group of characters into one string
op = [''.join(group) for group in fun(n, my_string, '')]

# Printing the output
print(op)
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
Note: The iter() function returns an iterator for the given argument.
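The trick that makes this method work is that [iter(i)] * n creates a list holding n references to the same iterator, so zip_longest pulls n consecutive characters for every tuple. The sketch below, with a hypothetical seven-character string and illustrative variable names, shows this and also how fillvalue pads the last group when the length is not a multiple of n.

```python
from itertools import zip_longest

s = "abcdefg"  # hypothetical input: 7 characters
n = 3

# One iterator, referenced n times: each tuple consumes n consecutive characters
shared = [iter(s)] * n
print(shared[0] is shared[1])  # True

groups = list(zip_longest(*shared, fillvalue="x"))
print(groups)
# [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', 'x', 'x')]

# With an empty string as fillvalue, the last chunk is simply shorter
chunks = ["".join(g) for g in zip_longest(*[iter(s)] * n, fillvalue="")]
print(chunks)  # ['abc', 'def', 'g']
```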
Method 3: Using the regex Module
We can split the string after every n characters using the re.findall() function from the re module. The re.findall(pattern, string) function scans the string from left to right, searching for all non-overlapping matches of the pattern, and returns them as a list of strings in the order in which they were found. The pattern '.{1,3}' matches between one and three characters, as many as possible, so every match is a chunk of three characters except possibly the last one.
Related Tutorial: “Python re.findall() - Everything You Need to Know.”
Code:
# Importing the regex module
import re

# Using re.findall() method
r = re.findall('.{1,3}', '12345abcde6789fghi')
print(r)
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
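The pattern above hard-codes the chunk size. A possible generalization, sketched below, builds the pattern from n; the function name split_regex is hypothetical. Two details are worth noting: the braces of the quantifier must be doubled inside an f-string, and the dot does not match newline characters unless the re.DOTALL flag is passed.

```python
import re

def split_regex(text, n):
    """Split text into chunks of at most n characters using re.findall()."""
    # {{1,{n}}} renders as {1,n}; re.DOTALL lets '.' match newlines too
    pattern = f".{{1,{n}}}"
    return re.findall(pattern, text, flags=re.DOTALL)

print(split_regex("12345abcde6789fghi", 3))
# ['123', '45a', 'bcd', 'e67', '89f', 'ghi']

# Without re.DOTALL the newline is skipped and chunking restarts after it
print(re.findall(".{1,3}", "ab\ncdef"))
# ['ab', 'cde', 'f']
print(split_regex("ab\ncdef", 3))
# ['ab\n', 'cde', 'f']
```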
Method 4: Using Textwrap
Python's standard library provides a function that solves this problem directly. The function is called wrap and it is part of the textwrap module. Simply pass the given string and the number of characters to the wrap function as parameters and it will split the string after every n characters. Note that wrap is designed for wrapping paragraphs, so it breaks at whitespace where possible; it behaves like the other methods only when the string contains no whitespace.
Here’s a quick look at what the docstring for the wrap function says:
from textwrap import wrap
help(wrap)
'''
Help on function wrap in module textwrap:

wrap(text, width=70, **kwargs)
    Wrap a single paragraph of text, returning a list of wrapped lines.

    Reformat the single paragraph in 'text' so it fits in lines of no
    more than 'width' columns, and return a list of wrapped lines. By
    default, tabs in 'text' are expanded with string.expandtabs(), and
    all other whitespace characters (including newline) are converted to
    space. See TextWrapper class for available keyword args to customize
    wrapping behaviour.
'''
Okay! Let’s see the wrap function in action:
Code:
from textwrap import wrap

s = "12345abcde6789fghi"
n = 3
print(wrap(s, n))
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
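The difference between wrap and plain slicing shows up as soon as the input contains spaces. The sketch below uses a hypothetical string with two spaces and illustrative variable names: wrap breaks the line at the spaces and drops them, whereas slicing keeps every character.

```python
from textwrap import wrap

s = "ab cd ef"  # hypothetical input containing spaces
n = 3

# wrap() breaks at whitespace and discards the spaces at the line breaks
wrapped = wrap(s, n)
print(wrapped)
# ['ab', 'cd', 'ef']

# Slicing keeps every character, including the spaces
sliced_chunks = [s[i:i + n] for i in range(0, len(s), n)]
print(sliced_chunks)
# ['ab ', 'cd ', 'ef']
```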
Method 5: Using list+map+join+zip
Another approach to solve the given problem is to combine the list(), map() and zip() functions with the string method join() to split the string accordingly. The code given below demonstrates how to solve the problem this way.
Code:
s = "12345abcde6789fghi"
n = 3
print(list(map(''.join, zip(*[iter(s)]*n))))
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
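Keep in mind that zip() stops as soon as any of its inputs is exhausted, so this one-liner silently discards a trailing chunk that has fewer than n characters. The sketch below uses a hypothetical seven-character string to show the effect, together with zip_longest from Method 2 as one way to keep the remainder.

```python
from itertools import zip_longest

s = "abcdefg"  # hypothetical input: 7 characters
n = 3

# zip() drops the incomplete last group, so the final 'g' is lost
with_zip = list(map(''.join, zip(*[iter(s)] * n)))
print(with_zip)
# ['abc', 'def']

# zip_longest() with an empty fillvalue keeps the shorter last chunk
with_zip_longest = list(map(''.join, zip_longest(*[iter(s)] * n, fillvalue='')))
print(with_zip_longest)
# ['abc', 'def', 'g']
```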
Method 6: Using sliced
Yet another function that allows you to split the given string after every n characters is the sliced function of the more_itertools module. This is a third-party package, so it has to be installed first (for example with pip install more-itertools). The sliced function yields the slices lazily instead of returning a list, hence you need to convert its result to a list containing the split substrings with the help of the list() constructor as shown below.
Code:
from more_itertools import sliced

s = "12345abcde6789fghi"
n = 3
print(list(sliced(s, n)))
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
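As a matter of fact, the more_itertools module offers you other options to solve the given problem. Here are two more ways to split the given string after every n characters, using chunked and windowed. Note that windowed pads an incomplete final window with its fillvalue, which defaults to None, so pass fillvalue='' if the length of the string is not a multiple of n: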
import more_itertools as mit

s = "12345abcde6789fghi"
n = 3

print(["".join(c) for c in mit.chunked(s, n)])
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']

# The step must equal the window size so that the windows do not overlap
print(["".join(c) for c in mit.windowed(s, n, step=n)])
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
Method 7: Using islice from Itertools
Here’s another (not so Pythonic) solution that uses an itertools function known as islice to solve the given problem.
Code:
from itertools import islice

s = "12345abcde6789fghi"
n = 3

def split_fun(n, iterable):
    i = iter(iterable)
    # Take the first n items from the iterator
    piece = list(islice(i, n))
    # An empty piece means the iterator is exhausted
    while piece:
        yield ''.join(piece)
        piece = list(islice(i, n))

print(list(split_fun(n, list(s))))
# OUTPUT: ['123', '45a', 'bcd', 'e67', '89f', 'ghi']
Note: The islice function returns an iterator over a selected portion of an iterable. Called as islice(i, n), it yields at most the next n items of i.
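To see why the generator above works, it helps to watch islice consume a shared iterator step by step. In the sketch below (the input and the variable names are illustrative), each call takes up to n items from where the previous call stopped, and an empty list signals that the iterator is exhausted, which is what ends the while loop.

```python
from itertools import islice

it = iter("abcdefg")  # hypothetical input: 7 characters
n = 3

# Every call continues where the previous one left off
first = list(islice(it, n))
second = list(islice(it, n))
third = list(islice(it, n))
fourth = list(islice(it, n))

print(first)   # ['a', 'b', 'c']
print(second)  # ['d', 'e', 'f']
print(third)   # ['g']
print(fourth)  # []
```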
Want to learn about the yield keyword in Python? Read this comprehensive guide: Yield Keyword in Python - A Simple Illustrated Guide
Conclusion
We have successfully solved the given problem using different approaches. I hope you enjoyed this article and that it helps you on your journey to become a better coder. Please subscribe and stay tuned for more interesting articles and solutions.