How to Extract Numbers from a String

Problem Formulation and Solution Overview

In this article, you’ll learn how to extract numbers from a string in Python.

To make it more fun, we have the following running scenario:

This article references an Albanian proverb written by Driton Selmani in 2012. We will leave the interpretation up to you.

💬 Question: How would we write Python code to extract numbers from a String?

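We can accomplish this task with one of the following options:

  • Method 1: Use List Comprehension and isdigit()
  • Method 2: Use List Comprehension and join()
  • Method 3: Use Regex
  • Method 4: Use a For Loop
  • Bonus: Extract Positive or Negative Numbers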

Preparation

Add the following import to the top of each code snippet. Only Method 3 and the Bonus section use re, but including it everywhere lets every snippet in this article run error-free.

import re

Method 1: Use List Comprehension and isdigit()

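You can use List Comprehension and isdigit() to extract the non-negative whole numbers found in a string txt and convert them to integers. The expression [int(s) for s in txt.split() if s.isdigit()] returns a List of Integers. Only whitespace-separated words made up entirely of digits are picked up, so a word with punctuation attached or with a leading minus sign is skipped.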

Here’s an example:

txt  = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
nums =  [int(s) for s in txt.split() if s.isdigit()]
print(nums)

This code creates the variable txt that holds the proverb indicated above.

Next, List Comprehension splits the string on whitespace (txt.split()) and evaluates each resulting word. If the word consists only of digits (s.isdigit()), it is converted to an integer (int(s)) and appended to nums. Once all words have been evaluated, the contents of nums are output to the terminal.

Output – a List of Integers

[2, 1, 2012]
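
One thing to keep in mind with this approach: isdigit() is applied to whole whitespace-separated words. The short sketch below uses a made-up sentence (sample is a hypothetical name, not part of the running scenario) to illustrate that a number with punctuation attached is skipped.

```python
# Hypothetical sample sentence, chosen only to illustrate the behavior.
sample = "Call 911, then wait 5 minutes"

# split() keeps the comma attached to '911', so that word fails isdigit().
words = sample.split()
nums = [int(s) for s in words if s.isdigit()]

print(words)  # ['Call', '911,', 'then', 'wait', '5', 'minutes']
print(nums)   # [5]
```

If your text has numbers followed directly by commas or periods, one of the other methods in this article may be a better fit.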

Method 2: Use List Comprehension and join()

Another Pythonic way is to use a ternary expression, List Comprehension and join() to extract, convert, and return a list of positive numbers found in a string. This method returns a List of Integers.

txt  = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
tmp  = ''.join(c if c in '0123456789' else ' ' for c in txt)
nums = [int(i) for i in tmp.split()]
print(nums)

This code creates the variable txt that holds the proverb indicated above.

Next, a generator expression passed to join() evaluates each character of the string.

  • If a character is found in the string of digits ('0123456789'), it is concatenated to tmp as is.
  • If not, the character is replaced with a space (' ') and concatenated to tmp.

If the contents of tmp were output to the terminal at this point, they would display as follows, with every non-digit character converted to a space (the runs of spaces are condensed here for readability).

Interim Output

2 1 2012

Then, tmp.split() breaks tmp into its groups of digits, discarding the spaces, and List Comprehension converts each group to an integer (int()) and appends it to nums.

The contents of nums are output to the terminal as a List of Integers.

Output – a List of Integers

[2, 1, 2012]
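
Because this method works character by character, it also picks up digits that are attached to letters or punctuation. As a rough sketch, the same two steps can be wrapped in a small helper; the name extract_digit_groups and the second test sentence are assumptions made for illustration only.

```python
def extract_digit_groups(text):
    """Return every run of the digits 0-9 in text as an integer."""
    # Keep digits, turn every other character into a space.
    tmp = ''.join(c if c in '0123456789' else ' ' for c in text)
    # split() drops the spaces, leaving only the digit groups.
    return [int(i) for i in tmp.split()]


proverb = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
print(extract_digit_groups(proverb))              # [2, 1, 2012]
print(extract_digit_groups("Room 12b, floor 3"))  # [12, 3]
```

Note that 12b yields 12 here, whereas Method 1 would skip that word entirely.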

Also, the expression c if c in '0123456789' else ' ' in the snippet above is Python's ternary (conditional) operator: it yields the character itself when it is a digit and a space otherwise.

Method 3: Use Regex

In this example, Regex is used to extract all positive numbers from a string. This method returns a List of Strings.

txt  = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
nums = re.findall(r'\b\d+\b', txt)
print(nums)

⭐A Finxter Favorite!

This code creates the variable txt that holds the proverb indicated above.

Next, re.findall() searches the string passed as its second argument and extracts every match of the pattern.

In short, the \d+ notation lets Regex know to search the string for all occurrences of one (1) or more digits and extract them. The \b on either side marks a word boundary, so digits attached directly to letters are not matched. The result of this extraction saves to nums as a List of Strings.

Output – a List of Strings

['2', '1', '2012']
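
To see what the \b word boundaries in the pattern do, the sketch below runs the pattern with and without them on a made-up sentence (sample and the other variable names are hypothetical). It also shows one way to turn the resulting strings into integers.

```python
import re

# Hypothetical sample sentence, chosen because '3rd' mixes digits and letters.
sample = "The 3rd item costs 45 dollars"

# With \b, the 3 in '3rd' is rejected: there is no boundary between '3' and 'r'.
with_boundaries = re.findall(r'\b\d+\b', sample)
# Without \b, every run of digits is matched.
without_boundaries = re.findall(r'\d+', sample)

print(with_boundaries)     # ['45']
print(without_boundaries)  # ['3', '45']

# findall() returns strings; convert them if integers are needed.
as_ints = [int(n) for n in without_boundaries]
print(as_ints)             # [3, 45]
```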

Method 4: Use a For Loop

This example uses a For loop to traverse the words of the string, checking whether each one is a non-negative whole number (c.isdigit()). If it is, the word is converted to an integer and appended to nums. This method returns a List of Integers.

txt  = "One can't hold 2 watermelons in 1 hand: by Driton Selmani, 2012"
nums = []

for c in txt.split():
    if c.isdigit():
        nums.append(int(c))
print(nums)

This code creates the variable txt, which holds the proverb indicated above, and nums, an empty list that will contain all the numbers found in the string.

Next, a For loop traverses each word of the string, checking whether it consists only of digits. If so, the word is converted to an integer (int(c)) and appended to nums.

The result of this extraction saves to nums as a List of Integers.

Output – a List of Integers

[2, 1, 2012]
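
An explicit loop also leaves room for extra steps per word. As a minimal sketch, assuming you want numbers that are followed by a comma or period to count as well, each word can be stripped of surrounding punctuation before the isdigit() check. The sentence and the names word and token are made up for illustration.

```python
import string

# Hypothetical sentence in which numbers touch punctuation.
txt = "In 2012, he held 2 watermelons."
nums = []

for word in txt.split():
    # Remove leading and trailing punctuation, e.g. '2012,' becomes '2012'.
    token = word.strip(string.punctuation)
    if token.isdigit():
        nums.append(int(token))

print(nums)  # [2012, 2]
```

Be aware that string.punctuation includes the minus sign, so this variant would turn -5 into 5.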

Bonus: Extract Positive or Negative Numbers

What happens if you need to extract negative and positive numbers? The above examples won’t give you the results you need. But using re.compile() together with the compiled pattern's findall() method will!

txt  = "The 3rd equation resulted in -745.093."
regex = re.compile(r'[\+\-]?[0-9]+')
nums = [int(k) for k in regex.findall(txt)]
print(nums)

This code creates a string, txt, containing a positive number (the 3 in 3rd) and a negative decimal number (-745.093).

Next, the re.compile() method is called. This method returns a regular expression object from the pattern passed. In this case, the pattern ([\+\-]?[0-9]+) matches an optional plus or minus sign followed by one or more digits.

This object saves to regex.

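A List Comprehension loops over the matches returned by regex.findall(txt), converts each one to an integer (int(k)), and appends it to nums. Note that the pattern matches whole digits only, so the decimal -745.093 is split at the decimal point into -745 and 093, and int() drops the leading zero of 093. The result of this extraction saves to nums as a List of Integers.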

Output – a List of Integers

[3, -745, 93]
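
If you would rather keep -745.093 together as one decimal number, one possible variation is to allow an optional fractional part in the pattern and convert with float() instead of int(). This is a sketch under that assumption, not part of the original example; the name pattern is hypothetical.

```python
import re

txt = "The 3rd equation resulted in -745.093."

# Optional sign, one or more digits, then an optional '.digits' part.
# (?:...) is a non-capturing group, so findall() returns the whole match.
pattern = re.compile(r'[+-]?\d+(?:\.\d+)?')
nums = [float(m) for m in pattern.findall(txt)]

print(nums)  # [3.0, -745.093]
```

The final period of the sentence is not swallowed, because the fractional part requires at least one digit after the dot.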

Summary

These five (5) methods of extracting numbers from a string should give you enough information to select the best one for your coding requirements.

Good Luck & Happy Coding!