In the previous article, we got acquainted with Unicode and with ways of processing input Unicode strings and converting them into a readable form: string objects in Python.
Now let's look at converting them to other types of output data and applying different encodings to them.
Problem Formulation
Suppose we need to send characters as data, with each character represented as an integer (Python's int type).
Function ord.
The built-in function ord() takes a one-character Unicode string as its argument and returns an int: the value of that character's Unicode code point.
>>>A = '\u0048'
>>>print(ord(A))
# 72
If the argument consists of two or more characters, a TypeError is raised:
>>>B = '\u0048\u0065\u006C\u006C\u006F'
>>>print(ord(B))
# TypeError: ord() expected a character, but string of length 5 found
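When the length of the input is not known in advance, one option is to catch the error instead of letting it stop the program. The sketch below is only an illustration: the helper name code_point_or_none is hypothetical and not part of Python or of this article's code.

```python
def code_point_or_none(text):
    """Return the code point of a one-character string, or None otherwise.

    Hypothetical helper for illustration only.
    """
    try:
        return ord(text)
    except TypeError:
        # ord() raises TypeError when the string length is not exactly 1
        return None


print(code_point_or_none('\u0048'))                          # 72
print(code_point_or_none('\u0048\u0065\u006C\u006C\u006F'))  # None
print(code_point_or_none(''))                                # None
```

Returning None is just one possible design choice; re-raising with a clearer message would work equally well.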
To avoid the error, let's use a list comprehension that applies ord() to each character of the string, in combination with the map function, whose first argument is the int function and whose second is an iterable object, in our case a list. Since ord() already returns an int, the map(int, ...) step is optional:
>>>print(list(map(int, [ord(i) for i in B])))
# [72, 101, 108, 108, 111]
Checking the data type:
>>>B_list = list(map(int, [ord(i) for i in B]))
>>>print(type(B_list[0]))
# <class 'int'>
You can also use a for loop and check the data type of each code point as you go. Because end=' ' replaces the newline, everything is printed on one line:
>>>for i in B:
...    print(ord(i), type(ord(i)), end=' ')
# 72 <class 'int'> 101 <class 'int'> 108 <class 'int'> 108 <class 'int'> 111 <class 'int'>
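The conversion above can be wrapped in a small reusable function. The following is a minimal sketch, not the only way to do it; the names to_code_points and from_code_points are assumptions introduced here for illustration. The built-in chr() is the inverse of ord(), so it can be used to check that nothing was lost.

```python
def to_code_points(text):
    """Convert a string into a list of int code points (hypothetical helper)."""
    return [ord(ch) for ch in text]


def from_code_points(points):
    """Rebuild the string from int code points using chr(), the inverse of ord()."""
    return ''.join(chr(p) for p in points)


B = '\u0048\u0065\u006C\u006C\u006F'
points = to_code_points(B)

print(points)                                   # [72, 101, 108, 108, 111]
print(all(isinstance(p, int) for p in points))  # True
print(from_code_points(points))                 # Hello
```

Note that the list comprehension alone is enough here: ord() already returns an int, so no extra int() call is needed.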
Python Convert Unicode to Float
Similar to the task described above, it is sometimes necessary to convert a Unicode string to floating-point numbers (Python's float type).
Function ord.
Using ord() again, this time wrapped in the float function, we get the desired result, provided that the Unicode string is exactly one character long:
>>>A = '\u0048'
>>>print(float(ord(A)))
# 72.0
If the argument consists of two or more characters, a TypeError will be raised, but we already know how to avoid it: we apply ord() to each character with a list comprehension, using the same string B as before:
>>>print(list(map(float, [ord(i) for i in B])))
# [72.0, 101.0, 108.0, 108.0, 111.0]
Or we can use a for loop, and the data type of each value will be float, since we explicitly convert to this type. To print all values on one line, pass end=' ' (sep has no effect when print receives a single argument):
>>>for i in B:
...    print(float(ord(i)), end=' ')
# 72.0 101.0 108.0 108.0 111.0
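As with the integer case, the float conversion can be sketched as a pair of small functions. The names to_float_code_points and from_float_code_points are hypothetical and chosen only for this example. Going back to text requires an int() step, because chr() does not accept a float.

```python
def to_float_code_points(text):
    """Convert a string into a list of float code points (hypothetical helper)."""
    return [float(ord(ch)) for ch in text]


def from_float_code_points(values):
    """Rebuild the string; chr() only accepts int, so each float is converted first."""
    return ''.join(chr(int(v)) for v in values)


B = '\u0048\u0065\u006C\u006C\u006F'
floats = to_float_code_points(B)

print(floats)                                       # [72.0, 101.0, 108.0, 108.0, 111.0]
print(all(isinstance(v, float) for v in floats))    # True
print(from_float_code_points(floats))               # Hello
```

Every Unicode code point (at most 0x10FFFF) fits exactly in a Python float, so the round trip does not lose information.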