Problem Formulation and Solution Overview
This article references an Albanian proverb written by Driton Selmani in 2012. We will leave the interpretation up to you.
Preparation
Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.
import re
Method 1: Use List Comprehension and isdigit()
You can use List Comprehension and isdigit() to extract, convert, and return a list of the positive integers found in a string txt, using the expression [int(s) for s in txt.split() if s.isdigit()], which returns a List of Integers.
Here’s an example:
txt = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
nums = [int(s) for s in txt.split() if s.isdigit()]
print(nums)
This code creates the variable txt that holds the proverb indicated above.
Next, txt.split() breaks the string into whitespace-separated elements, and List Comprehension evaluates each one. If an element consists entirely of digits (s.isdigit()), it is converted to an integer (int(s)) and appended to nums. Once all elements have been evaluated, the contents of nums are output to the terminal.
Output – a List of Integers
[2, 1, 2012]
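One caveat worth illustrating: isdigit() is only True when the whole element is made of digits, so a number with punctuation attached (such as "12,") is skipped. The following is a minimal sketch, not part of the original method, that assumes a different sample sentence and strips surrounding punctuation with string.punctuation before testing each element.

```python
import string

# Assumed sample text (not from the article): every number touches punctuation.
txt = "Room 12, floor 3. Built in 2012!"

# Method 1 as written: tokens like "12," fail isdigit() and are skipped.
plain = [int(s) for s in txt.split() if s.isdigit()]

# Variation: strip leading/trailing punctuation from each token first.
tokens = (s.strip(string.punctuation) for s in txt.split())
stripped = [int(t) for t in tokens if t.isdigit()]

print(plain)     # []
print(stripped)  # [12, 3, 2012]
```

In the proverb used in this article, each number happens to stand alone as its own element, which is why the original expression finds all three.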
Method 2: Use List Comprehension and join()
Another Pythonic way is to use a ternary expression, join(), and List Comprehension to extract, convert, and return a list of positive numbers found in a string. This method returns a List of Integers.
txt = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
# Keep digits; replace every other character with a space
tmp = ''.join(c if c in '0123456789' else ' ' for c in txt)
nums = [int(i) for i in tmp.split()]
print(nums)
This code creates the variable txt that holds the proverb indicated above.
Next, join() receives a generator expression that evaluates each character of txt.
- If the character is found in the string '0123456789', it is concatenated to tmp as is.
- If not, the character is replaced with a space (' ') and concatenated to tmp.
If the contents of tmp were output to the terminal at this point, every non-digit character would appear as a space, leaving only the digits visible (runs of spaces are shortened here).
Interim Output
2 1 2012
Then, List Comprehension is used to navigate through the elements of tmp.split(), converting each one to an integer (int()) and appending it to nums (effectively removing the spaces).
The contents of nums are output to the terminal as a List of Integers.
Output – a List of Integers
[2, 1, 2012]
Also, you may want to recap the basics of the ternary operator, because it is used in the line that builds tmp in the Method 2 code snippet.
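As a quick refresher, a conditional (ternary) expression has the form x if condition else y. The sketch below isolates the one used in Method 2 inside a hypothetical helper named digit_or_space (the name and the short sample string are assumptions for illustration), and uses repr() to make the spaces visible.

```python
def digit_or_space(c):
    # Hypothetical helper: the same conditional expression used in Method 2.
    return c if c in '0123456789' else ' '

print(repr(digit_or_space('7')))  # '7'
print(repr(digit_or_space('x')))  # ' '

# Applied character by character, non-digits become spaces.
sample = "room42b"  # assumed sample text
tmp = ''.join(digit_or_space(c) for c in sample)
print(repr(tmp))    # '    42 '
print(tmp.split())  # ['42']
```

Because this method works character by character, it also picks up digits embedded in words, which Method 1 would skip.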
Method 3: Use Regex
In this example, Regex is used to extract all positive numbers from a string. This method returns a List of Strings.
txt = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
nums = re.findall(r'\b\d+\b', txt)
print(nums)
A Finxter Favorite!
This code creates the variable txt that holds the proverb indicated above.
Next, re.findall() is used to find and extract all positive numbers from the string passed as a parameter.
In short, the \d+ notation lets Regex know to search the string for all occurrences of one (1) or more digits and extract them, while the \b on either side restricts matches to digit runs that stand alone as whole words. The result of this extraction saves to nums as a List of Strings.
Output – a List of Strings
['2', '1', '2012']
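To make the role of \b concrete, here is a small sketch comparing the pattern with and without word boundaries, and converting the resulting strings to integers. The sample sentence is an assumption chosen so that one number ("3rd") is attached to letters.

```python
import re

# Assumed sample text: "3rd" has digits attached to letters.
txt = "The 3rd runner wore number 42 in 2012"

strict = re.findall(r'\b\d+\b', txt)  # whole-word digit runs only
loose = re.findall(r'\d+', txt)       # any run of digits

print(strict)  # ['42', '2012']
print(loose)   # ['3', '42', '2012']

# findall() returns strings; convert them if integers are needed.
nums = [int(s) for s in strict]
print(nums)    # [42, 2012]
```

Which pattern is preferable depends on whether digits embedded in words should count as numbers for your data.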
Method 4: Use a For Loop
This example uses a For loop to traverse the string elements, checking for the existence of a positive number (c.isdigit()). If found, it is converted to an integer and appended to nums. This method returns a List of Integers.
txt = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
nums = []
for c in txt.split():
    if c.isdigit():
        nums.append(int(c))
print(nums)
This code creates the variable txt that holds the proverb indicated above, and nums, an empty list that will contain all the numbers found in the string.
Next, a For loop traverses each whitespace-separated element of the string, checking whether it consists entirely of digits. If so, the element is converted to an integer (int(c)) and appended to nums.
The result of this extraction saves to nums as a List of Integers.
Output – a List of Integers
[2, 1, 2012]
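If the loop is needed in more than one place, it can be wrapped in a function. This is only a sketch: the function name extract_ints is hypothetical and not part of the original article.

```python
def extract_ints(text):
    """Hypothetical helper: return the all-digit tokens of text as integers."""
    nums = []
    for token in text.split():
        if token.isdigit():
            nums.append(int(token))
    return nums

proverb = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
print(extract_ints(proverb))            # [2, 1, 2012]
print(extract_ints("no numbers here"))  # []
```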
Bonus: Extract Positive or Negative Numbers
What happens if you need to extract negative as well as positive numbers? The above examples won’t give you the results you need. But using re.compile() and the compiled pattern's findall() method will!
txt = "The 3rd equation resulted in -745.093."
regex = re.compile(r'[\+\-]?[0-9]+')
nums = [int(k) for k in regex.findall(txt)]
print(nums)
This code creates a string, txt, containing a positive and a negative number.
Next, the re.compile() method is called. This method returns a regular expression object from the pattern passed. In this case, the pattern [\+\-]?[0-9]+ matches an optional sign followed by one or more digits, so it extracts negative and positive integers.
This object saves to regex.
A List Comprehension then loops over the matches returned by regex.findall(txt), converting each one to an integer (int(k)) and appending it to nums. Because the pattern does not include the decimal point, -745.093 is extracted as two separate integers, -745 and 93. The result of this extraction saves to nums as a List of Integers.
Output – a List of Integers
[3, -745, 93]
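If the decimal value should stay in one piece, the pattern can be extended with an optional fractional part. This is a sketch of one possible variation rather than the article's method: the pattern and the conversion to float are assumptions, and it does not handle forms such as ".5" or scientific notation.

```python
import re

txt = "The 3rd equation resulted in -745.093."

# Optional sign, digits, then an optional ".digits" fractional part.
regex = re.compile(r'[+-]?\d+(?:\.\d+)?')

matches = regex.findall(txt)
nums = [float(k) for k in matches]

print(matches)  # ['3', '-745.093']
print(nums)     # [3.0, -745.093]
```

The trailing period of the sentence is not captured, because the fractional part requires at least one digit after the dot.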
Summary
These five (5) methods of extracting numbers from a string should give you enough information to select the best one for your coding requirements.
Good Luck & Happy Coding!