Problem Formulation: Given a byte string that contains newline characters
'\n', how do you split it into a list of lines?
Example: You want to transform the byte string
b'your\nbyte\nstring' into the list of byte strings
[b'your', b'byte', b'string'] using
b'\n' as a newline separator.
Given: b'your\nbyte\nstring'
Goal: [b'your', b'byte', b'string']
Solution: To split a byte string into a list of lines, each line being a byte string itself, use the
bytes.split(delimiter) method and pass the byte string
b'\n' as the delimiter.
>>> s = b'your\nbyte\nstring'
>>> s.split(b'\n')
[b'your', b'byte', b'string']
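One detail worth knowing about this approach: if the data ends with a newline, split(b'\n') returns a trailing empty byte string, and Windows-style '\r\n' line endings would leave a b'\r' at the end of each line. The following is a minimal sketch, using made-up sample data, that compares split(b'\n') with the built-in bytes.splitlines() method:

```python
# Sample data is made up for illustration; note the trailing newline.
data = b'your\nbyte\nstring\n'

# split() treats b'\n' as a separator, so an empty item follows the last newline.
by_split = data.split(b'\n')
print(by_split)   # [b'your', b'byte', b'string', b'']

# splitlines() treats the newline as a line terminator, so no empty item appears.
by_lines = data.splitlines()
print(by_lines)   # [b'your', b'byte', b'string']

# splitlines() also recognizes Windows-style b'\r\n' line endings.
windows_lines = b'your\r\nbyte\r\nstring'.splitlines()
print(windows_lines)  # [b'your', b'byte', b'string']
```

Which of the two fits better depends on whether you need an arbitrary delimiter (split) or line-ending handling (splitlines).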
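Bytes objects look just like strings but are prefixed with the
b symbol to indicate that they are a different type. Like strings, they are immutable sequences. However, whereas a string is a sequence of Unicode characters, a bytes object is a sequence of integers in the range 0 to 255, one per byte. A bytes literal may contain only ASCII characters directly; any other byte value has to be written as an escape sequence such as \xe9. A bytes object carries no encoding of its own: the bytes become text only when you decode them with a specific encoding such as UTF-8.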
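To make the nature of bytes objects concrete, here is a short sketch that inspects one. It relies only on built-in behavior: indexing a bytes object yields an integer, slicing yields a new bytes object, and item assignment fails because the type is immutable.

```python
s = b'your\nbyte\nstring'

# Indexing returns an int (the byte value), not a one-character bytes object.
first = s[0]
print(first)      # 121, the ASCII code of 'y'

# Slicing returns a new bytes object.
head = s[0:4]
print(head)       # b'your'

# The newline delimiter is the single byte with value 10.
values = list(b'\n')
print(values)     # [10]

# Bytes objects are immutable, so item assignment raises a TypeError.
assignment_failed = False
try:
    s[0] = 89
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True
```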
Convert Bytestring to String and Split String
An alternative is to convert the byte string to a normal string first and then use the
str.split() method on the result. This is often the better choice when the data is text, because decoding (UTF-8 by default) gives you a Unicode string in which characters that occupy several bytes are handled as single characters.
>>> s = b'your\nbyte\nstring'
>>> s = s.decode()
>>> s.split('\n')
['your', 'byte', 'string']
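Because decode() falls back to UTF-8 when no encoding is given, it can help to state the encoding explicitly. The sketch below wraps the two steps in a helper; the name split_decoded_lines and its encoding parameter are hypothetical and chosen only for illustration.

```python
def split_decoded_lines(data: bytes, encoding: str = 'utf-8') -> list:
    """Decode a bytes object and split the resulting text on '\\n'.

    Hypothetical helper for illustration, not part of the standard library.
    """
    text = data.decode(encoding)   # bytes -> str
    return text.split('\n')        # str delimiter, because text is a str


lines = split_decoded_lines(b'your\nbyte\nstring')
print(lines)     # ['your', 'byte', 'string']

# Non-ASCII text survives the round trip when the encoding matches.
accented = split_decoded_lines('café\nthé'.encode('utf-8'))
print(accented)  # ['café', 'thé']
```

If the bytes were produced with a different encoding, pass that encoding instead; decoding with the wrong one may raise a UnicodeDecodeError or yield garbled text.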
Note that if you split the bytes object directly, without decoding it first, the delimiter must also be a bytes object, or Python will raise a
TypeError: a bytes-like object is required, not 'str'
>>> s = b'your\nbyte\nstring'
>>> s.split('\n')
Traceback (most recent call last):
  File "<pyshell#24>", line 1, in <module>
    s.split('\n')
TypeError: a bytes-like object is required, not 'str'
The fix is to use the bytes delimiter
b'\n' as shown before:
>>> s.split(b'\n') [b'your', b'byte', b'string']
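If the delimiter may arrive as either a string or a bytes object, for example from user input, one option is to normalize it before splitting. The helper below is a minimal sketch; the name split_bytes is hypothetical, and encoding a string delimiter as UTF-8 is an assumption that may not suit every data source.

```python
def split_bytes(data: bytes, delimiter=b'\n') -> list:
    """Split data on delimiter, accepting the delimiter as bytes or str.

    Hypothetical helper for illustration, not part of the standard library.
    """
    if isinstance(delimiter, str):
        # Assumption: a str delimiter stands for its UTF-8 byte sequence.
        delimiter = delimiter.encode('utf-8')
    return data.split(delimiter)


s = b'your\nbyte\nstring'

from_str = split_bytes(s, '\n')      # str delimiter is converted first
from_bytes = split_bytes(s, b'\n')   # bytes delimiter is used as is

print(from_str)    # [b'your', b'byte', b'string']
print(from_bytes)  # [b'your', b'byte', b'string']
```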
While working as a researcher in distributed systems, Dr. Christian Mayer found his love for teaching computer science students.
To help students reach higher levels of Python success, he founded the programming education website Finxter.com that has taught exponential skills to millions of coders worldwide. He’s the author of the best-selling programming books Python One-Liners (NoStarch 2020), The Art of Clean Code (NoStarch 2022), and The Book of Dash (NoStarch 2022). Chris also coauthored the Coffee Break Python series of self-published books. He’s a computer science enthusiast, freelancer, and owner of one of the top 10 largest Python blogs worldwide.
His passions are writing, reading, and coding. But his greatest passion is to serve aspiring coders through Finxter and help them to boost their skills.