How to Remove ‘\x’ From a Hex String in Python?

Problem Formulation + Examples

💬 Question: Given a string written with hexadecimal escape sequences, such as '\x00\xff\xf2', how do you remove the '\x' prefixes and keep only the hex digits?

Here are a few examples of what you want to accomplish:

Hex String                  Desired Output
'\x00\xff\xf2'              '00fff2'
'\x41\x42\x43'              '414243'
'\x53'                      '53'
'\xff\xff\xff\xff\xff'      'ffffffffff'
'\x00\x6a\x6f'              '006a6f'

Strawman Solutions That Don’t Work

Before I show you how to solve this problem, let me first give you the “strawman solution” that you find all over the web when searching for the right way to accomplish this: calling hex_string.encode('hex') on the hex string.

While this worked in Python 2, it no longer works in Python 3, where it raises a LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs.

>>> '\xff\x1f\x00\xe8'.encode("hex")
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    '\xff\x1f\x00\xe8'.encode("hex")
LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs

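You can get rid of the error message by actually using the codecs.encode() function. However, called without an encoding argument it simply encodes the string as UTF-8, which is not the result you seek:

>>> import codecs
>>> hex_str = '\x00\xff\xf2'
>>> codecs.encode(hex_str)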
b'\x00\xc3\xbf\xc3\xb2'
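
To see where those extra bytes come from, here is a minimal sketch. It assumes every character in the string has a code point of at most 0xff, so that Latin-1 can map each one to a single byte; the variable names are only illustrative. It also shows that the 'hex' codec still exists, but works on bytes rather than on str:

```python
import codecs

hex_str = '\x00\xff\xf2'

# Without an encoding argument, codecs.encode() falls back to UTF-8,
# so U+00FF and U+00F2 are each written as two bytes.
utf8_bytes = codecs.encode(hex_str)
print(utf8_bytes)                   # b'\x00\xc3\xbf\xc3\xb2'

# Latin-1 maps the code points 0 to 255 to exactly one byte each.
raw_bytes = hex_str.encode('latin-1')
print(raw_bytes)                    # b'\x00\xff\xf2'

# The 'hex' codec is bytes-to-bytes: it accepts raw_bytes, not a str.
hex_bytes = codecs.encode(raw_bytes, 'hex')
print(hex_bytes.decode('ascii'))    # 00fff2
```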

Also, “trivial” approaches such as str.replace() or str.translate() don’t work, because '\xXX' is not a sequence of characters inside the string. It is an escape sequence in the source code that denotes a single character by its hexadecimal code point, and you cannot simply replace a part of a single character!

>>> '\xff\x1f\x00\xe8'.replace('\x', '')
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \xXX escape
>>> '\xff\x1f\x00\xe8'.replace('\\x', '')
'ÿ\x1f\x00è'
>>> '\xff\x1f\x00\xe8'.replace(r'\x', '')
'ÿ\x1f\x00è'

Solutions like this don’t work because \x is not actually contained in the string but is part of a representation of a single character. The length of such a sequence is often smaller than expected:

>>> len('\x00\x6a\x6f')
3
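
The difference becomes visible when you compare the escaped literal with a raw string, which really does contain backslashes. This is a small illustrative sketch; the names escaped and raw are made up for the comparison:

```python
escaped = '\x00\x6a\x6f'    # three characters: NUL, 'j', 'o'
raw = r'\x00\x6a\x6f'       # twelve characters, with real backslashes

print(len(escaped), [ord(c) for c in escaped])   # 3 [0, 106, 111]
print(len(raw))                                  # 12
print('\\' in escaped, '\\' in raw)              # False True

# replace() only helps when the backslashes are really in the string.
print(raw.replace('\\x', ''))                    # 006a6f
```

So if your data arrives as text that literally contains backslash-x pairs, replace() is enough. The rest of this tutorial deals with the other case, where each \xXX is already a single character.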

Without further ado, let’s dive into what actually works to solve your problem!

Solution That Does Work

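The most Pythonic solution to remove '\x' from a hex string s is the one-liner expression ''.join(f'{ord(c):02x}' for c in s). It formats each character as two hex digits with an f-string inside a generator expression and glues the pieces together with the join() method. This assumes that every character has a code point of at most 0xff, which is always the case for characters written as \xXX.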

Here’s a simple example:

s = '\x00\xff\xf2'
res = ''.join(f'{ord(c):02x}' for c in s)
print(res)
# 00fff2

This approach uses the following techniques:

  • The ''.join() method glues together a bunch of characters, so we can focus on the reformatting of one character at a time.
  • We iterate over all characters of the hex string s using a generator expression XXX for c in s.
  • Now, we only need to find an expression XXX to convert a character such as \x01 to the sequence 01.
  • The f-string expression f'{ord(c):02x}' does exactly that. The ord() function returns an integer value associated with the specified character according to the Unicode table.
  • The :02x format specifier pads the output with leading zeros to a width of at least two digits ('02') and writes the integer in lowercase hexadecimal notation ('x').
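
The steps above can be traced for a single character. This is just an illustrative breakdown of the one-liner, with made-up variable names:

```python
c = '\xf2'                      # one character, U+00F2

code_point = ord(c)             # integer code point of the character
digits = f'{code_point:02x}'    # two lowercase hex digits

print(code_point)               # 242
print(digits)                   # f2

# The zero padding matters for small values:
print(format(ord('\x01'), 'x'))     # 1
print(format(ord('\x01'), '02x'))   # 01
```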

If you don’t like the fancy Python features used here and are willing to invest a couple more lines of code, you can also use a simple for loop:

Solution with For Loop

Iterate over each character in a string and convert it to a representation without \x by using the f-string expression f'{ord(c):02x}'. Use string concatenation to build your resulting hex string without \x one character at a time.

s = '\x00\xff\xf2'
res = ''
for c in s:
    res += f'{ord(c):02x}'
print(res)
# 00fff2
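
If you need this in several places, you could wrap it in a small helper. The sketch below is one possible shape, not the only one: the function name to_hex_digits is hypothetical, and the check for code points above 0xff is an added assumption, because such characters would produce more than two digits and make the output ambiguous.

```python
def to_hex_digits(s: str) -> str:
    """Return two hex digits per character (hypothetical helper)."""
    for c in s:
        if ord(c) > 0xFF:
            # Assumption: reject characters that need more than two digits.
            raise ValueError(f'{c!r} does not fit into two hex digits')
    return ''.join(f'{ord(c):02x}' for c in s)


print(to_hex_digits('\x00\xff\xf2'))   # 00fff2
print(to_hex_digits('ABC'))            # 414243

# Without the check, a character such as the euro sign yields four digits:
print(f'{ord("€"):02x}')               # 20ac
```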

Still Not Done Learning?

Great! Thanks for reading through the whole tutorial! ❤️

If you want to keep improving your coding skills, feel free to join my email academy and download your Python cheat sheets for maximum learning efficiency. It’s fun too. 😉