Summary: You can convert a string to binary in Python using any of the following:
- bytearray() + bin()
- map() + bin() + bytearray()
- join() + format() + bytearray()
- join() + format() + ord()
- binascii.hexlify()
Problem: How to convert a given string to its binary equivalent in Python?
Example: Converting a string to binary yields either a list of binary values, one per character of the original string, or a single binary value that represents the whole string.
Input: given_string = "xyz"
Expected output: ['0b1111000', '0b1111001', '0b1111010'] or 1111000 1111001 1111010
Let’s walk through several approaches that produce the required output, looking closely at each function used along the way.
Related Read: Convert Bytes to String
Method 1: Using bytearray + bin
Approach:
- Convert the given string to a bytearray object by calling bytearray(string, encoding). The bytearray object represents the string’s characters as bytes.
- Use a for loop to iterate over each byte and call the bin function on it to convert it into its binary representation.
- Append each resulting binary representation to another list.
Code:
    word = "xyz"
    # convert string to bytearray
    byte_arr = bytearray(word, 'utf-8')
    res = []
    for byte in byte_arr:
        binary_rep = bin(byte)  # convert to binary representation
        res.append(binary_rep)  # add to list
    print(res)
Output:
['0b1111000', '0b1111001', '0b1111010']
🔋Removing the “0b” Prefix:
The above method produces binary values with the prefix “0b”, which indicates that the number is written in the binary system rather than the decimal system. Since you already know that the output is binary, you can drop the prefix by slicing each binary string from index 2 onward.
You can then combine all the binary strings with the join method to get the binary representation of the entire string at once.
Code:
    word = "xyz"
    # convert string to bytearray
    byte_arr = bytearray(word, 'utf-8')
    res = []
    for byte in byte_arr:
        binary_rep = bin(byte)  # convert to binary representation
        res.append(binary_rep[2:])  # remove prefix "0b" and add to list
    print(' '.join(res))  # join all the binaries of res list
Output:
1111000 1111001 1111010
💡Readers Digest
Python’s built-in bytearray() function takes an iterable, such as a list of integers between 0 and 255, converts them to bytes between 00000000 and 11111111, and returns a new array of bytes as a bytearray object.
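To make the byte range concrete, here is a minimal sketch. The list of integers is an illustrative assumption, chosen to match the ASCII codes of "xyz":

```python
# Build a bytearray from a list of integers; each value must be in 0..255.
codes = [120, 121, 122]  # illustrative values: ASCII codes of 'x', 'y', 'z'
arr = bytearray(codes)

print(arr)        # bytearray(b'xyz')
print(list(arr))  # [120, 121, 122]

# A value outside the byte range is rejected with a ValueError.
try:
    bytearray([256])
except ValueError as err:
    print("ValueError:", err)
```

This is why every element of a bytearray can be passed safely to bin(): each one is a plain integer that fits in eight bits.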
Python’s built-in bin(integer) function takes one integer argument and returns a binary string with the prefix "0b". If you call bin(x) on a non-integer x, x must define the __index__() method that returns an integer associated with x. Otherwise, bin raises a TypeError: object cannot be interpreted as an integer.
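The __index__() behavior can be sketched as follows. The Register class is a hypothetical wrapper invented for this illustration; it is not part of the standard library:

```python
class Register:
    """Hypothetical integer wrapper, used only to illustrate __index__()."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # bin() calls this to obtain the integer to convert
        return self.value


print(bin(120))            # 0b1111000
print(bin(Register(120)))  # 0b1111000

# A float does not define __index__(), so bin() raises a TypeError.
try:
    bin(1.5)
except TypeError as err:
    print("TypeError:", err)
```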
Recommended Read: Python Print Binary Without ‘0b’
Method 2: Using map()+bin()+bytearray()
Approach: The following solution is a one-liner. Let’s break it down segment by segment:
- Use the map function to pass each byte of the bytearray object to the bin function. bytearray(string, encoding) converts the string to a bytearray object.
- When each byte is passed to the bin function, it is converted into its binary equivalent.
- Convert the object returned by the map function to a list using the list constructor.
- To generate a single binary string that represents the entire string, use a list comprehension such that:
  - The expression is x[2:], which takes the binary string from index 2 onward to skip the binary prefix “0b”.
  - The context variable x represents each binary value within the list generated from the map object.
- Finally, apply the ' '.join method to the list comprehension to get the binary representation of the entire string at once.
Code:
    word = "xyz"
    res = ' '.join([x[2:] for x in list(map(bin, bytearray(word, 'utf-8')))])
    print(res)
Output:
1111000 1111001 1111010
💡Readers Digest
The map() function transforms one or more iterables into a new one by applying a transformation function to the i-th elements of each iterable. The arguments are the transformation function object and one or more iterables. If you pass n iterables as arguments, the transformation function must be an n-ary function taking n input arguments. The return value is an iterable map object of transformed, and possibly aggregated, elements.
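As a rough illustration of both cases, the sketch below first maps a one-argument function over a single iterable, then a two-argument function over two iterables. The lambda and the variable names are assumptions made for this example:

```python
word = "xyz"

# One iterable, one-argument function: bin() is applied to each byte.
binaries = list(map(bin, bytearray(word, "utf-8")))
print(binaries)  # ['0b1111000', '0b1111001', '0b1111010']

# Two iterables, two-argument function: pair each character with its bits.
labelled = list(map(lambda ch, b: f"{ch}={b[2:]}", word, binaries))
print(labelled)  # ['x=1111000', 'y=1111001', 'z=1111010']
```

Note that map() is lazy: nothing is computed until the map object is consumed, here by list().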
Method 3: Using join+format+bytearray
Approach:
- Use the bytearray function to convert the given string to a bytearray object, which represents each character of the string as bytes.
- Call format(x, 'b') on each byte x to convert it to its binary representation, then combine the converted values with the join method to form a string.
Code:
    word = "xyz"
    res = ' '.join(format(x, 'b') for x in bytearray(word, 'utf-8'))
    print(res)
Output:
1111000 1111001 1111010
💡Readers Digest
Python’s built-in format(value, spec) function transforms input of one format into output of another format defined by you. Specifically, it applies the format specifier spec to the argument value and returns a formatted representation of value. For example, format(42, 'f') returns the string representation '42.000000'.
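A few format specifiers are worth comparing side by side. The '08b' and '#b' variants are not used in the methods above; they are shown here as an optional extension for cases where you want fixed-width bytes or the prefix back:

```python
print(format(120, "b"))    # 1111000     (plain binary, no prefix)
print(format(120, "08b"))  # 01111000    (zero-padded to 8 digits)
print(format(5, "08b"))    # 00000101
print(format(120, "#b"))   # 0b1111000   (same text as bin(120))
print(format(42, "f"))     # 42.000000   (fixed-point, six decimals)
```

Padding to eight digits can be useful when the binary strings are concatenated without a separator, because every byte then occupies the same width.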
str.join(iterable) concatenates the elements of an iterable. The result is a string in which the elements are “glued together”, using the string on which join is called as the delimiter.
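Here is a small sketch of how the delimiter changes the result. The bits list is hard-coded for illustration:

```python
bits = ["1111000", "1111001", "1111010"]  # illustrative input

spaced = " ".join(bits)  # the space is the delimiter
packed = "".join(bits)   # an empty delimiter concatenates directly

print(spaced)  # 1111000 1111001 1111010
print(packed)  # 111100011110011111010

# join() only accepts strings; integers must be converted first.
try:
    " ".join([1, 2])
except TypeError as err:
    print("TypeError:", err)
```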
Method 4: Using join()+format()+ord()
Approach: The idea here is quite similar to the approach explained above. The only difference is that, instead of bytearray, the ord function is used to convert each character of the given string to its Unicode code point.
Code:
    word = "xyz"
    res = ' '.join(format(ord(x), 'b') for x in word)
    print(res)
Output:
1111000 1111001 1111010
💡Readers Digest
The Python ord() function takes a character (a string of length one) as input and returns the Unicode number of this character. For example, ord('a') returns the Unicode number 97. The inverse function of ord() is the chr() function, so chr(ord('a')) returns the original character 'a'.
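One caveat worth sketching: for ASCII text, ord() and bytearray() give the same numbers, but for other characters they can differ, because ord() returns the code point while UTF-8 may encode it as several bytes. The character "é" below is an illustrative choice:

```python
# Round trip between a character and its Unicode number.
print(ord("a"))       # 97
print(chr(ord("a")))  # a

# For a non-ASCII character, Method 3 and Method 4 give different results.
word = "é"
code_point_bits = format(ord(word), "b")
utf8_bits = [format(b, "b") for b in bytearray(word, "utf-8")]

print(ord(word))        # 233
print(code_point_bits)  # 11101001
print(utf8_bits)        # ['11000011', '10101001']
```

Which result is the right one depends on whether you want the binary form of the code points or of the encoded bytes.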
Method 5: Using hexlify
Approach:
- Call bytes(word, 'utf-8') to convert the given string to a bytes object.
- Use binascii.hexlify to return the hexadecimal representation of the binary data, then convert it to an integer by passing 16 as the base to int.
- Finally, convert the integer to its binary representation with the bin function.
Code:
    import binascii
    word = "xyz"
    w = bytes(word, 'utf-8')
    res = bin(int(binascii.hexlify(w), 16))
    print(res[2:])
Output:
11110000111100101111010
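Notice that this output has 23 digits rather than 24, because bin() drops leading zeros of the combined integer. If you need whole bytes, one possible fix (an addition to the method above, not part of it) is to pad the result with zfill and then split it into groups of eight:

```python
import binascii

word = "xyz"
raw = bytes(word, "utf-8")

bits = bin(int(binascii.hexlify(raw), 16))[2:]
print(bits)       # 11110000111100101111010
print(len(bits))  # 23

# Restore the leading zero(s) so that every byte has exactly 8 digits.
padded = bits.zfill(8 * len(raw))
print(padded)     # 011110000111100101111010

# Split the padded string into one chunk per byte.
chunks = [padded[i:i + 8] for i in range(0, len(padded), 8)]
print(chunks)     # ['01111000', '01111001', '01111010']
```

This sketch assumes a non-empty string; for an empty input, hexlify returns an empty value that int cannot parse.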
Conclusion
That was a comprehensive tour, covering five different ways to solve the given problem. Feel free to try them out and use the one that suits you.
Subscribe and stay tuned for more interesting tutorials. Happy learning! 🙂
🌍 Recommended Tutorial: Python Convert Hex String to Binary