How to Convert a String to Binary in Python

  • Summary: You can convert a string to binary in Python using:
    • bytearray() + bin()
    • map() + bin() + bytearray()
    • join() + format() + bytearray()
    • join() + format() + ord()
    • binascii.hexlify()

Problem: How to convert a given string to its binary equivalent in Python?

Example: Converting a string to binary produces either a list of binary values, one for each character of the given string, or a single binary value that represents the whole string.

Input:
given_string = "xyz"

Expected output:
['0b1111000', '0b1111001', '0b1111010']
or
1111000 1111001 1111010

Let’s look at several approaches that produce the required output, along with a closer look at each function they rely on.

Related Read: Convert Bytes to String

Method 1: Using bytearray() + bin()

Approach:

  • Convert the given string to a bytearray object by calling bytearray(string, encoding). The bytearray holds the encoded string as a sequence of bytes.
  • Use a for loop to iterate over the bytes and call the bin function on each one to get its binary representation.
  • Append each binary representation to a result list.

Code:

word = "xyz"
# convert string to bytearray
byte_arr = bytearray(word, 'utf-8')
res = []
for byte in byte_arr:
    binary_rep = bin(byte)  # convert to binary representation
    res.append(binary_rep)  # add to list
print(res)

Output:

['0b1111000', '0b1111001', '0b1111010']

🔋Removing the “0b” Prefix:

The above method produces binary values with the prefix “0b”, which indicates that the number is written in the binary system and not the decimal system. Since you already know that the output is binary, you can drop the prefix by slicing each binary string from index 2 onward.

You can further join all the binary strings together using the join method to get the binary representation of the entire string at once.

Code:

word = "xyz"
# convert string to bytearray
byte_arr = bytearray(word, 'utf-8')
res = []
for byte in byte_arr:
    binary_rep = bin(byte)  # convert to binary representation
    res.append(binary_rep[2:])  # remove prefix "0b" and add to list
print(' '.join(res))  # join all the binaries of res list

Output:

1111000 1111001 1111010

💡Readers Digest

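Python’s built-in bytearray() function takes an iterable such as a list of integers between 0 and 255, stores each one as a byte between 00000000 and 11111111, and returns a new mutable array of bytes as a bytearray object. When called as bytearray(string, encoding), as in the code above, it first encodes the string with the given encoding.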

Python’s built-in bin(integer) function takes one integer argument and returns a binary string with the prefix "0b". If you call bin(x) on a non-integer x, then x must define the __index__() method, which returns an integer associated with x. Otherwise, bin raises a TypeError: object cannot be interpreted as an integer.
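To make the two descriptions above concrete, here is a small sketch. The Level class is a hypothetical name invented only to illustrate __index__(); it is not part of Python or of the methods in this tutorial.

```python
# bytearray() accepts an iterable of integers in range(256)
arr = bytearray([120, 121, 122])
print(arr)        # bytearray(b'xyz')
print(list(arr))  # [120, 121, 122]

# bin() returns a string that starts with the "0b" prefix
print(bin(120))   # 0b1111000


# Hypothetical class: defining __index__ lets bin() accept its instances
class Level:
    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


print(bin(Level(5)))  # 0b101

# A value outside 0..255 does not fit into a single byte
try:
    bytearray([256])
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
print(out_of_range_rejected)  # True
```

Note that 256 is rejected because a single byte can only hold the values 0 through 255.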

Recommended Read: Python Print Binary Without ‘0b’

Method 2: Using map() + bin() + bytearray()

Approach: The following solution is a one-liner. Let’s break down and try to understand each segment of the one-liner that will be used:

  • Call bytearray(string, encoding) to convert the string to a bytearray object.
  • Use the map function to apply the bin function to each byte of the bytearray, which converts each byte to its binary equivalent.
  • Convert the map object to a list using the list constructor.
  • To generate a single binary string that represents the entire string, use a list comprehension such that:
    • The expression is x[2:], which slices each binary string from index 2 onward to drop the binary prefix “0b“.
    • The context variable, x, represents each binary value within the list generated from the map object.
  • Finally, call ' '.join on the list comprehension to get the binary representation of the entire string at once.

Code:

word = "xyz"
res = ' '.join([x[2:] for x in list(map(bin, bytearray(word, 'utf-8')))])
print(res)

Output:

1111000 1111001 1111010

💡Readers Digest

The map() function transforms one or more iterables into a new one by applying a “transformator function” to the i-th elements of each iterable. The arguments are the transformator function object and one or more iterables. If you pass n iterables as arguments, the transformator function must be an n-ary function taking n input arguments. The return value is an iterable map object of transformed, and possibly aggregated, elements.
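The behavior described above can be sketched with a few short calls. The variable names are chosen only for illustration.

```python
# One iterable: the function takes a single argument
codes = list(map(ord, "xyz"))
print(codes)  # [120, 121, 122]

# Two iterables: the function must take two arguments
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
print(sums)   # [11, 22, 33]

# A map object is lazy: values are computed only when requested
lazy = map(bin, bytearray("xyz", "utf-8"))
first = next(lazy)
rest = list(lazy)
print(first)  # 0b1111000
print(rest)   # ['0b1111001', '0b1111010']
```

Because join accepts any iterable of strings, the list(...) call around map in the one-liner above is optional; it only makes the intermediate step explicit.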

Method 3: Using join() + format() + bytearray()

Approach:

  • Use the bytearray function to convert the given string to a bytearray object that holds the encoded string as a sequence of bytes.
  • Call format(x, 'b') on each byte x to get its binary representation without the “0b” prefix, then combine the results into a single string with the join method.

Code:

word = "xyz"
res = ' '.join(format(x, 'b') for x in bytearray(word, 'utf-8'))
print(res)

Output:

1111000 1111001 1111010

💡Readers Digest

Python’s built-in format(value, spec) function transforms input of one format into output of another format defined by you. Specifically, it applies the format specifier spec to the argument value and returns a formatted representation of value. For example, format(42, 'f') returns the string representation '42.000000'.

str.join(iterable) concatenates the elements in an iterable. The result is a string in which the elements of the iterable are “glued together”, using the string on which join is called as the delimiter.
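As a brief illustration of format specifiers, the sketch below compares 'b' with two related specifiers, '08b' and '#b'. These two are not used in the methods above; they are shown here as an optional variation for when every byte should print with a fixed width of 8 digits.

```python
byte = 120

plain = format(byte, 'b')      # binary digits only
padded = format(byte, '08b')   # zero-padded to a width of 8
prefixed = format(byte, '#b')  # includes the "0b" prefix
print(plain)     # 1111000
print(padded)    # 01111000
print(prefixed)  # 0b1111000

# Fixed-width variant of Method 3
word = "xyz"
fixed_width = ' '.join(format(b, '08b') for b in bytearray(word, 'utf-8'))
print(fixed_width)  # 01111000 01111001 01111010
```

Padding to 8 digits keeps every byte the same length, which can make the output easier to split apart again later.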

Method 4: Using join() + format() + ord()

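Approach: The idea here is quite similar to the approach explained above. The only difference is that, instead of bytearray, the ord function converts each character of the given string to its Unicode code point. For ASCII text such as "xyz" both approaches give the same result, but for non-ASCII characters they differ: ord returns one code point per character, whereas bytearray with 'utf-8' yields one or more bytes per character.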

Code:

word = "xyz"
res = ' '.join(format(ord(x), 'b') for x in word)
print(res)

Output:

1111000 1111001 1111010

💡Readers Digest

The Python ord() function takes a character (=string of length one) as an input and returns the Unicode number of this character. For example, ord('a') returns the Unicode number 97. The inverse function of ord() is the chr() function, so chr(ord('a')) returns the original character 'a'.
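The following sketch compares ord() with bytearray() on a single non-ASCII character. The sample character "é" (written as "\u00e9") is an arbitrary choice made for this illustration.

```python
word = "\u00e9"  # the single character "é"

# ord() works on code points: one value per character
via_ord = ' '.join(format(ord(ch), 'b') for ch in word)
print(via_ord)    # 11101001

# bytearray() works on encoded bytes: "é" takes two bytes in UTF-8
via_bytes = ' '.join(format(b, 'b') for b in bytearray(word, 'utf-8'))
print(via_bytes)  # 11000011 10101001

# chr() reverses ord()
print(chr(ord(word)) == word)  # True
```

Which result is the right one depends on whether you want the binary form of the code points or of the encoded bytes.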

Method 5: Using binascii.hexlify()

Approach:

  • Call the bytes(word, 'utf-8') function to convert the given string to a bytes object.
  • Use binascii.hexlify to get the hexadecimal representation of the binary data, then convert it to an integer by passing 16 as the base to int.
  • Finally, convert it to its binary representation with the bin function.

Code:

import binascii
word = "xyz"
w = bytes(word, 'utf-8')
res = bin(int(binascii.hexlify(w),16))
print(res[2:])

Output:

11110000111100101111010
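The output above has 23 digits rather than 24, because bin drops the leading zero of the first byte (01111000). As a rough sketch, assuming a non-empty input string, the digits can be padded back to whole bytes with zfill and decoded into the original string again. The names padded and restored are illustrative only.

```python
import binascii

word = "xyz"
data = bytes(word, 'utf-8')
bits = bin(int(binascii.hexlify(data), 16))[2:]
print(bits)      # 11110000111100101111010

# Restore the leading zeros so that every byte has 8 digits
padded = bits.zfill(len(data) * 8)
print(padded)    # 011110000111100101111010

# Convert the digits back to an integer, then to bytes, then to text
restored = int(padded, 2).to_bytes(len(data), 'big').decode('utf-8')
print(restored)  # xyz
```

The round trip relies on knowing the number of bytes, which is why len(data) is used both for the padding and for to_bytes.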

Conclusion

Phew! That was a comprehensive journey, and we learned five different ways to convert a string to binary in Python. Please feel free to try them out and use the one that suits you.

Subscribe and stay tuned for more interesting tutorials. Happy learning! 🙂