(Unicode Error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated \UXXXXXXXX escape

Have you come across this error: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated \UXXXXXXXX escape? It can be really frustrating because the logic seems fine, yet you still get an error. Don’t worry! I’ve got you covered, and we will soon discover ways to avoid or eliminate this error.

But first, we must know what Unicode, a Unicode escape, and a Unicode error are.

What are Unicode and Encoding with UTF-8?

Unicode is a standard for encoding characters that assigns a unique number, called a code point, to every character. There’s a high chance that you have heard about ASCII if you are into computer programming. ASCII covers 128 characters, while Unicode has room for more than a million: its code points fit in 21 bits, and 1,114,112 of them are valid. The first 128 Unicode code points are the ASCII characters, so Unicode can be viewed as a superset of ASCII.

Encoding is the process of converting human-readable text into a specified byte format so that it can be stored or transmitted. In Python, encode() is a built-in string method used for encoding. If no encoding is specified, UTF-8 is used by default. UTF-8 is a variable-length encoding: it uses one to four bytes per character.
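
As a minimal sketch of encoding in practice, the snippet below encodes the Greek sentence used later in this article into UTF-8 bytes and decodes it back. The variable names are only illustrative.

```python
# Encode a text string to bytes and decode it back.
text = 'να έχεις μια όμορφη μέρα'

# str.encode() uses UTF-8 when no encoding is given.
data = text.encode()
print(type(data))                      # <class 'bytes'>
print(data == text.encode('utf-8'))    # True

# Greek letters take more than one byte each in UTF-8, so the
# byte count is larger than the character count.
print(len(text), len(data))

# Decoding with the same encoding restores the original text.
print(data.decode('utf-8') == text)    # True
```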

When does (Unicode error) ‘unicodeescape’ codec can’t decode bytes occur?

Example 1: Suppose you are trying to open a file through the codecs module with UTF-8 encoding:

import codecs
f = codecs.open('C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\General\data.txt', "w",  encoding = "utf-8")
f.write('να έχεις μια όμορφη μέρα')
f.close()

Output:

File "C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\Errors\Unicode Escape Error.py", line 2
    f = codecs.open('C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\General\data.txt', "w",  encoding = "utf-8")
                                                                                     ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

Example 2:

import csv
d = open("C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\General\data.csv")
d = csv.reader(d)
print(d)

Output:

File "C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\Errors\Unicode Escape Error.py", line 2
    d = open("C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\General\data.csv")
                                                                              ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

How frustrating! But did you know that a slight change in a single line will solve your problem? So, without further ado, let’s dive into the fixes.

Fix: Prefix the Path String with “r”, Use Double Backslashes “\\”, or Use Forward Slashes “/”

The unicodeescape error usually occurs because of the way the string that denotes your file path is written. We can solve this error by doubling the backslashes, by using forward slashes, or by producing a raw string. To produce a raw string, we prefix the string with r.

FIX 1- Doubling the Backslashes

In a normal Python string, a backslash is interpreted as the start of an escape sequence. The first backslash in the path is followed by a U (the U in Users), so Python reads \U as the beginning of a Unicode escape, which must be followed by eight hexadecimal digits. The letters “sers” are not hexadecimal digits, so the escape is truncated and Python raises the error. To fix this, you need to double every backslash in the string:

# Example 1
import codecs
f = codecs.open('C:\\Users\\SHUBHAM SAYON\\PycharmProjects\\Finxter\\General\\data.txt', "w",  encoding = "utf-8")
f.write('να έχεις μια όμορφη μέρα')
f.close()

# Example 2
import csv
d = open("C:\\Users\\SHUBHAM SAYON\\PycharmProjects\\Finxter\\General\\data.csv")
d = csv.reader(d)
print(d)
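
To see why doubling works, here is a small sketch that compares a complete \U escape with a doubled backslash. The names smiley and prefix are made up for illustration.

```python
# '\U' starts an escape that must be followed by exactly 8 hex digits.
smiley = '\U0001F600'          # a complete escape: one code point
print(len(smiley))             # 1

# A doubled backslash is itself an escape for one literal backslash,
# so '\\U' is just a backslash followed by the letter U.
prefix = 'C:\\Users'
print(prefix)                  # C:\Users
print(len(prefix))             # 8
```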

FIX 2- Using Forward Slashes

Another way to deal with it is to use the forward slash character (/) in place of each backslash, as follows:

# Example 1
import codecs
f = codecs.open('C:/Users/SHUBHAM SAYON/PycharmProjects/Finxter/General/data.txt', "w",  encoding = "utf-8")
f.write('να έχεις μια όμορφη μέρα')
f.close()

# Example 2
import csv
d = open("C:/Users/SHUBHAM SAYON/PycharmProjects/Finxter/General/data.csv")
d = csv.reader(d)
print(d)
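
Forward slashes work because Windows accepts them as path separators in most file operations. As a sketch, pathlib's PureWindowsPath shows that both spellings name the same path. The folder demo_user is a made-up placeholder, not a path from this article, and the snippet runs on any operating system because pure paths never touch the disk.

```python
from pathlib import PureWindowsPath

# Hypothetical path, written once with forward slashes and once
# with doubled backslashes.
with_forward = PureWindowsPath('C:/Users/demo_user/data.txt')
with_backslash = PureWindowsPath('C:\\Users\\demo_user\\data.txt')

# PureWindowsPath treats both separators the same way.
print(with_forward == with_backslash)   # True

# Converting to str renders the native backslash form.
print(str(with_forward))                # C:\Users\demo_user\data.txt
```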

FIX 3- Prefix the String with “r”

You just need to add an “r” before the opening quote of the path string to solve the Unicode escape error, as follows:

# Example 1
import codecs
f = codecs.open(r'C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\General\data.txt', "w",  encoding = "utf-8")
f.write('να έχεις μια όμορφη μέρα')
f.close()

# Example 2
import csv
d = open(r"C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\General\data.csv")
d = csv.reader(d)
print(d)

When we add ‘r’ before the file path, the Python interpreter treats the string as a raw literal, so each backslash is kept as an ordinary character instead of starting an escape sequence.
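
The following sketch checks that a raw string and a doubled-backslash string produce the same value, and shows one caveat of raw literals. The path uses a placeholder folder name, demo_user.

```python
raw = r'C:\Users\demo_user\data.txt'
escaped = 'C:\\Users\\demo_user\\data.txt'
print(raw == escaped)   # True
print(raw)              # C:\Users\demo_user\data.txt

# Caveat: a raw literal cannot end with a single backslash, so
# r'C:\Users\' is a syntax error. Add the trailing separator separately.
folder = r'C:\Users' + '\\'
print(folder)           # C:\Users\
```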

How to know if a string is valid UTF-8 or ASCII?

In Python 2, a str is a sequence of bytes. It does not know what its encoding is. Hence, the unicode type is the better way to store text there. In Python 3, str is already Unicode text, and raw bytes are stored in the separate bytes type.

To check whether a byte string is valid UTF-8 or ASCII, we can call its decode method with that encoding (on str in Python 2, on bytes in Python 3). If the decode method raises a UnicodeDecodeError exception, the data is not valid in that encoding.
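
The decode check can be sketched in Python 3, where the data to test is a bytes object. The helper name is_valid is hypothetical and chosen only for this example.

```python
def is_valid(data: bytes, encoding: str) -> bool:
    """Return True if data decodes cleanly with the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


greek = 'μέρα'.encode('utf-8')
print(is_valid(greek, 'utf-8'))         # True
print(is_valid(greek, 'ascii'))         # False: bytes above 127 are not ASCII
print(is_valid(b'data.txt', 'ascii'))   # True
print(is_valid(b'\xff\xfe', 'utf-8'))   # False: 0xFF never appears in UTF-8
```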

Scanning the File Path eliminates the possibility of an error

We can avoid the ‘unicodeescape’ codec can’t decode bytes error by checking the file path in the source code before running it. Developers usually know which path they are looking for, so checking beforehand that it contains no single backslashes in a normal (non-raw) string eliminates the possibility of this error.
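
Because this error is a SyntaxError, Python raises it while compiling the file, before any line runs. One possible way to scan a path literal ahead of time, sketched here with the built-in compile() function, is to compile the line of source that contains it. The helper escape_error and the demo_user path are assumptions made for this example.

```python
def escape_error(source: str):
    """Return the SyntaxError message if source fails to compile, else None."""
    try:
        compile(source, '<path check>', 'exec')
    except SyntaxError as err:
        return err.msg
    return None


# Raw strings keep the backslashes, so these hold the source text as typed.
bad_source = r'path = "C:\Users\demo_user\data.txt"'
good_source = r'path = r"C:\Users\demo_user\data.txt"'

print(escape_error(bad_source))    # message mentions 'unicodeescape'
print(escape_error(good_source))   # None
```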

How to list the elements from any folder?

Let’s suppose we have to list the elements of a folder. For this purpose, we can use the os module in Python. The os.listdir function from the module returns a list of strings (in this case, the names of the files in the folder).

Example: Let’s check the General folder and its contents:

import os

pth = r"C:\Users\SHUBHAM SAYON\PycharmProjects\Finxter\General"
files = os.listdir(pth)
for file in files:
    print(file)

Output:

check_empty_string.py
data.csv
data.txt
logical and in Python.py
remove_multiple_spaces_string.py
rough.py
user_input_stdin.py

Conclusion

In this article, we learned different ways to solve the error (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated \UXXXXXXXX escape: doubling the backslashes, using forward slashes, and prefixing the string with ‘r’. I hope this tutorial helped to answer your queries. Please stay tuned and subscribe for more such articles.


