Problem Formulation
Given a string s, create a new string based on s with all control characters, such as '\n' and '\t', removed.
What is a Control Character?
A control character, also called a non-printing character (NPC), is a character that doesn’t represent a written symbol. Examples are the newline character '\n' and the tab character '\t'. The complement of the set of control characters is the set of printable characters.
In Unicode, control characters occupy the code points U+0000 to U+001F, U+007F, and U+0080 to U+009F.
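As a quick sanity check, the following sketch compares the code point ranges listed above with the characters that Python's unicodedata module places in the 'Cc' category. The variable names listed and cc are chosen only for this illustration.

```python
import unicodedata

# Code points named above: U+0000-U+001F, U+007F, and U+0080-U+009F
listed = set(range(0x00, 0x20)) | {0x7F} | set(range(0x80, 0xA0))

# Every code point whose general category is 'Cc' (control)
cc = {cp for cp in range(0x110000) if unicodedata.category(chr(cp)) == 'Cc'}

print(cc == listed)  # True
print(len(cc))       # 65
```

The scan walks the full Unicode range once, so it takes a moment, but it shows that the three ranges cover all 65 control characters.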
Solution Based on Unicode Category
The unicodedata module provides a function unicodedata.category(c) that returns the general category assigned to the character c as a string. The Unicode categories 'Cc' (control), 'Cf' (format), 'Cs' (surrogate), 'Co' (private use), and 'Cn' (unassigned) could all be seen as “control characters”, although you could argue that only 'Cc' is a true control character. In any case, you can customize our solution below based on your preferences.
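To get a feel for what unicodedata.category() returns, here is a small sketch that looks up a handful of sample characters. The sample list is an arbitrary choice for illustration; it includes one character from each of the five 'C' categories.

```python
import unicodedata

samples = [
    'a',       # Ll: lowercase letter
    '1',       # Nd: decimal digit
    ' ',       # Zs: space separator
    '\n',      # Cc: control
    '\t',      # Cc: control
    '\u200b',  # Cf: format (zero-width space)
    '\ud800',  # Cs: surrogate
    '\ue000',  # Co: private use
    '\uffff',  # Cn: unassigned (a noncharacter)
]

# Map each sample to its general category
cats = {ch: unicodedata.category(ch) for ch in samples}

for ch, cat in cats.items():
    # Print the code point rather than the character itself,
    # because most of these samples have no visible glyph.
    print(f'U+{ord(ch):04X} -> {cat}')
```

Only the first three samples fall outside the 'C' group, so a filter on the first letter of the category would drop the other six.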

For example, the Python one-liner ''.join(c for c in s if unicodedata.category(c)[0] != 'C') removes every character of the original string s whose category starts with 'C'.
Here’s the final code that removes all control characters from a string:
import unicodedata

def remove_control_characters(s):
    # Keep only characters whose category does not start with 'C'
    return ''.join(c for c in s if unicodedata.category(c)[0] != 'C')

s = 'hello\nworld\tFinxters!'
print(s)
s = remove_control_characters(s)
print(s)
- The join() function combines all characters in an iterable using the separator string on which it is called. In our case, we combine them on the empty string ''.
- The generator expression c for c in s if unicodedata.category(c)[0] != 'C' goes over all characters that are not in a category starting with the uppercase 'C'.
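The two building blocks described above can also be tried in isolation. The following sketch uses made-up sample data (words, text, kept) and a deliberately simplified filter that only knows about tab and newline, just to show how join() consumes a generator expression.

```python
words = ['hello', 'world', 'Finxters!']

# join() places the separator string between the items
print('-'.join(words))  # hello-world-Finxters!
print(''.join(words))   # helloworldFinxters!

# A generator expression yields only the characters that pass the test
text = 'a\tb\nc'
kept = (c for c in text if c not in '\t\n')
result = ''.join(kept)
print(result)  # abc
```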
Alternatively, you can write it using a simple for loop like this:
import unicodedata

def remove_control_characters(s):
    s_new = ''
    for c in s:
        # Append the character only if its category does not start with 'C'
        if unicodedata.category(c)[0] != 'C':
            s_new = s_new + c
    return s_new

s = 'hello\nworld\tFinxters!'
print(s)
s = remove_control_characters(s)
print(s)
The output of both variants is:
# First print() statement before removal of control chars
hello
world	Finxters!
# Second print() statement after removal of control chars
helloworldFinxters!
You can see that the second output doesn’t contain any control characters.
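If you take the stricter view that only 'Cc' counts as a control character, the category check can be narrowed. The following is a minimal sketch of one way to make that choice configurable; the function name remove_by_category and its categories parameter are hypothetical and introduced only for this example.

```python
import unicodedata

def remove_by_category(s, categories=('Cc',)):
    """Drop every character whose general category is listed in `categories`.

    Hypothetical helper for illustration: it compares the full two-letter
    category instead of only its first letter.
    """
    return ''.join(c for c in s if unicodedata.category(c) not in categories)

text = 'a\u200bb\nc'  # contains a zero-width space (Cf) and a newline (Cc)

strict = remove_by_category(text)                # removes only 'Cc'
broad = remove_by_category(text, ('Cc', 'Cf'))   # removes 'Cc' and 'Cf'

print(repr(strict))  # 'a\u200bbc'
print(repr(broad))   # 'abc'
```

With the default setting, the invisible zero-width space survives, which may or may not be what you want. Passing a wider tuple of categories moves the behavior closer to the first-letter check used in the solution above.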