How to Clean and Format Phone Numbers in Python


Problem Formulation and Solution Overview

This article will show you how to create, clean up, and format phone numbers in Python.

To make it more interesting, we have the following scenario.

Right-On Realtors has a Contact Us form on their main website containing several fields. One of these fields is a Phone Number. Upon submission, this phone number will most likely require a clean-up so all phone numbers are in the same format. In most cases, this is done to add to a database or perhaps a CRM system, such as SalesForce.

This article covers formatting phone numbers in both European (GB) and US/Canadian formats.


💬 Question: How would we write code to clean up and format a phone number?

We can accomplish this task using one of the following options:

  • Method 1: Use Slicing
  • Method 2: Use phonenumbers
  • Method 3: Use regex


Method 1: Use Slicing

This example uses Python’s built-in slicing to extract and format fictitious EU and US phone numbers.

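# Remove all spaces from the raw phone number.
orig_phone = '+44797 5377 666'.replace(' ', '')

# Add the leading plus sign if it is missing.
if orig_phone[0] != '+':
    orig_phone = f'+{orig_phone}'

# Validate: digits only after the plus sign, 13 characters in total.
if orig_phone[1:].isnumeric() and len(orig_phone) == 13:
    new_phone = f'{orig_phone[0:3]} {orig_phone[3:7]} {orig_phone[7:]}'
    print(new_phone)
else:
    print('Invalid Phone Number.')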

EU Example

The above example takes an EU phone number and removes any spaces it contains using the replace() method. This ensures that spaces only occur where we want them to once the number is formatted. The result is saved to orig_phone.

💡Note: If the phone number does not contain the character being replaced, no change is made.

The following if statement uses indexing to check whether the first character of orig_phone is a plus sign (+). If not, one is added. If orig_phone were output to the terminal at this point, the following would display.

+447975377666

The following line formats this phone number, placing spaces in the desired locations as follows:

+44 7975 377666

This is done by using the code below.

{orig_phone[0:3]} Country Code: This portion uses slicing to extract the Country Code from orig_phone. This results in +44. Then a space is applied.
{orig_phone[3:7]} National Destination Code: This portion uses slicing to extract the National Destination Code. This results in 7975. Then a space is applied.
{orig_phone[7:]} Subscriber Number: This portion uses slicing to extract the Subscriber Number. This results in 377666.

This phone number is now formatted perfectly to meet Right-On Realtors’ expectations and can be easily added to their database or CRM software.

💡Note: Depending on your location, the code above may need to be modified.

The final if statement adds validation to ensure that everything after the plus sign is numeric and that the length of the phone number matches the expected size, which in this example is 13 characters (without spaces). The complete code is shown again below.

orig_phone = '+44797 5377 666'.replace(' ', '')

if orig_phone[0] != '+': 
    orig_phone = f'+{orig_phone}'

if ((orig_phone[1:].isnumeric()) and (len(orig_phone) == 13)): 
    new_phone = f'{orig_phone[0:3]} {orig_phone[3:7]} {orig_phone[7:]}'
    print(new_phone)
else:
   print('Invalid Phone Number.') 
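
The same steps can be wrapped in a small function so that every form submission is handled the same way. The following is a minimal sketch rather than a definitive implementation: the function name format_gb_phone and the choice to return None for invalid input are assumptions made for illustration, and the 13-character check only fits numbers shaped like the example above.

```python
def format_gb_phone(raw):
    """Return raw formatted like '+44 7975 377666', or None if validation fails."""
    phone = raw.replace(' ', '')           # strip all spaces
    if not phone.startswith('+'):          # add the leading plus sign if missing
        phone = f'+{phone}'
    # Expect a plus sign followed by 12 digits (13 characters in total).
    if phone[1:].isnumeric() and len(phone) == 13:
        return f'{phone[0:3]} {phone[3:7]} {phone[7:]}'
    return None


print(format_gb_phone('+44797 5377 666'))  # +44 7975 377666
print(format_gb_phone('44 7975 377666'))   # +44 7975 377666
print(format_gb_phone('+44 7975 37766'))   # None (one digit short)
```

Using startswith() instead of orig_phone[0] also avoids an IndexError if the form field is submitted empty.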

US/CDN Example

The United States and Canada use different phone number formats from the EU, such as:

phone1 = '814-552-1212'
phone2 = '814.553.1212'
phone3 = '(814) 554-1212'

These phone numbers may or may not include a 1 at the start (for example, 1-814-555-1212), indicating long distance.

For this example, the long-distance prefix is ignored, and the phone number is formatted as (814) 555-1212, since this format is standard in both the United States and Canada.

For this example, only one (1) phone number from the above is used.

orig_phone = '814.553.1212'
new_phone = orig_phone.replace(' ', '').replace('-', '').replace('.', '').replace('(', '').replace(')', '')

if ((new_phone.isnumeric()) and (len(new_phone) == 10)):
    print(f'({new_phone[:3]}) {new_phone[3:6]}-{new_phone[6:]}')
else:
    print('Invalid Phone Number.')

The above declares a US/Canadian phone number and saves it to orig_phone.

The following line removes any offending characters (spaces, hyphens, periods, and parentheses) using the replace() method. This ensures that only the digits of the phone number are left. The result is saved to new_phone. If output to the terminal, the following would display.

8145531212

An if statement is declared and checks that all digits are numeric and that the length of the phone number is the required size of 10. If so, this phone number is formatted and output to the terminal.

(814) 553-1212

If not, the following error message is output to the terminal.

Invalid Phone Number.

This phone number is now formatted perfectly to meet Right-On Realtors’ expectations and can be easily added to their database or CRM software.

💡Note: If you would like to include the long-distance prefix (1), the code above will need to be modified.
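
One possible modification is sketched below. It applies the same clean-up to all three sample formats and, as an assumption made for this illustration, treats an 11-digit number that starts with 1 as carrying the long-distance prefix, which is then dropped. The function name format_us_phone and the None return value for invalid input are hypothetical.

```python
def format_us_phone(raw):
    """Return raw as '(NNN) NNN-NNNN', or None if it is not a 10-digit number."""
    digits = raw
    for ch in ' -.()':                     # separators seen in the examples above
        digits = digits.replace(ch, '')
    # Assumption: 11 digits starting with 1 means a long-distance prefix is present.
    if len(digits) == 11 and digits.startswith('1'):
        digits = digits[1:]
    if digits.isnumeric() and len(digits) == 10:
        return f'({digits[:3]}) {digits[3:6]}-{digits[6:]}'
    return None


for phone in ['814-552-1212', '814.553.1212', '(814) 554-1212', '1-814-555-1212']:
    print(format_us_phone(phone))
# (814) 552-1212
# (814) 553-1212
# (814) 554-1212
# (814) 555-1212
```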


Method 2: Use phonenumbers

This example imports and uses the phonenumbers library to format and validate a phone number. This library is a Python port of Google’s libphonenumber library.

To follow along, the phonenumbers library must be installed before moving forward, for example by running pip install phonenumbers at the command prompt.

GB Example

import phonenumbers as pn

orig_phone = '+447975777666'

new_phone = pn.format_number(pn.parse(orig_phone, 'GB'),
                             pn.PhoneNumberFormat.NATIONAL)

print(new_phone)

The first line in the above code imports the phonenumbers library. This allows us to quickly format a phone number based on a selected Country.

The following line declares a phone number and saves it to orig_phone.

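The next line calls parse() and passes its result to the phonenumbers format_number() function. Between them, these two functions receive three (3) pieces of information:

  • The variable orig_phone declared earlier (passed to parse()).
  • The Country. In this case GB (passed to parse()).
  • The expected format. In this case PhoneNumberFormat.NATIONAL (passed to format_number()).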

The last line outputs the formatted phone number to the terminal.

07975 777666

US/Canada Example

import phonenumbers as pn

orig_phone = '814.553.1212'

new_phone = pn.format_number(pn.parse(orig_phone, 'US'),
                             pn.PhoneNumberFormat.NATIONAL)

print(new_phone)

The first line in the above code imports the phonenumbers library. This allows us to quickly format a phone number based on a selected Country.

The following line declares a phone number and saves it to orig_phone.

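The next line calls parse() and passes its result to the phonenumbers format_number() function. Between them, these two functions receive three (3) pieces of information:

  • The variable orig_phone declared earlier (passed to parse()).
  • The Country. In this case US (passed to parse()).
  • The expected format. In this case PhoneNumberFormat.NATIONAL (passed to format_number()).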

The last line outputs the formatted phone number to the terminal.

(814) 553-1212

Method 3: Use regex

This example uses regex to format and validate a phone number.

US/Canada Example

Another way to format a US/Canadian phone number is by using regex.

import re

orig_phone = '814 553 1212'
print('(%s) %s-%s' % tuple(re.findall(r'\d{4}$|\d{3}', orig_phone)))

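The first line in the above code imports Python’s built-in re (regex) library.

The following line declares a US/Canadian phone number and saves it to orig_phone.

The last line uses re.findall() to extract the three groups of digits and formats them into the standard US/Canadian format. No explicit validation is performed: if findall() does not return exactly three groups, the string formatting raises a TypeError.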

This is then output to the terminal.

(814) 553-1212
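
Because the findall() one-liner fails with a TypeError when the input does not yield exactly three groups of digits, a variant that reports invalid input may be easier to use on form data. The following sketch uses re.fullmatch() with capture groups instead. The pattern and the function name format_us_regex are assumptions made for illustration and accept only the separators seen earlier in this article.

```python
import re

# Optional parentheses around the area code; space, period, or hyphen as separators.
US_PHONE = re.compile(r'\(?(\d{3})\)?[ .-]?(\d{3})[ .-]?(\d{4})')


def format_us_regex(raw):
    """Return raw as '(NNN) NNN-NNNN', or None if the pattern does not match."""
    match = US_PHONE.fullmatch(raw.strip())
    if match is None:
        return None
    area, exchange, line = match.groups()
    return f'({area}) {exchange}-{line}'


print(format_us_regex('814 553 1212'))     # (814) 553-1212
print(format_us_regex('(814) 554-1212'))   # (814) 554-1212
print(format_us_regex('814 553 121'))      # None (last group is too short)
```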

GB Example

import re 

valid_phone = "^\\+?[1-9][0-9]{7,14}$"
print(re.findall(valid_phone, '+44797577766'))

The first line in the above code imports Python’s built-in re (regex) library.

The following lines check the phone number against the pattern and output it if it passes the validation test. Otherwise, an empty list ([]) is returned.

['+44797577766']
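
Since findall() returns a list, it is slightly awkward to use as a yes/no check. A minimal sketch of the same pattern as a Boolean test, using re.fullmatch(), follows. The function name is_plausible_number is hypothetical, and stripping spaces before matching is an added assumption. The pattern only checks the overall shape (an optional plus sign followed by 8 to 15 digits, the first of which is not 0), not whether the number actually exists.

```python
import re

# Same pattern as above; fullmatch() makes the ^ and $ anchors unnecessary.
VALID_PHONE = re.compile(r'\+?[1-9][0-9]{7,14}')


def is_plausible_number(raw):
    """Return True if raw (spaces removed) is an optional '+' and 8 to 15 digits."""
    return VALID_PHONE.fullmatch(raw.replace(' ', '')) is not None


print(is_plausible_number('+44797577766'))      # True
print(is_plausible_number('+44 7975 777666'))   # True
print(is_plausible_number('+44-7975'))          # False
```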

💡Note: Due to the complexity of regex, we suggest checking out our in-depth “REGEX SUPERPOWER GUIDE 🚀” to learn more.


Summary

This article has provided three (3) ways to clean and format phone numbers so that you can select the best fit for your coding requirements.

Good Luck & Happy Coding!

