Parse JSON Data in Python

Problem Formulation

Do you have JSON data that you need to parse using a Python script? Let’s have a look at this JSON data –

{
    "maps": [
        {
            "id": "blabla",
            "iscategorical": "0"
        },
        {
            "id": "blabla",
            "iscategorical": "0"
        }
    ],
    "masks": [
        "id": "valore"
    ],
    "om_points": "value",
    "parameters": [
        "id": "valore"
    ]
}

But when you try to parse this file in your script, you get an exception. Frustrating, isn’t it? Don’t worry. Most likely there is nothing wrong with your script: the error is in the JSON data itself.

So, in this tutorial we will be solving two problems –

  1. Why can’t Python parse this JSON data? [the one shown above]
  2. How to parse JSON data in Python?

Let’s answer the questions one by one. Please follow along to unearth the answers.

Why can’t Python parse the JSON data?

The error is probably not within your script. The JSON data itself is malformed. The values of "masks" and "parameters" (lines 12 and 16 of the snippet) are wrapped in square brackets, i.e., [], but they contain key-value pairs, so they must be wrapped in {} braces instead.

NOTE:

  • [] are used to denote JSON arrays.
  • {} are used to denote JSON objects.
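
To see the difference in practice, here is a minimal sketch that feeds both versions of the data to json.loads(). The variable names broken, fixed and error are only illustrative, and the data is condensed from the snippet above. The malformed version should raise json.JSONDecodeError, while the version with braces parses cleanly.

```python
import json

# The original data: "masks" and "parameters" wrap key-value pairs in []
broken = """{
    "maps": [
        {"id": "blabla", "iscategorical": "0"},
        {"id": "blabla", "iscategorical": "0"}
    ],
    "masks": [
        "id": "valore"
    ],
    "om_points": "value",
    "parameters": [
        "id": "valore"
    ]
}"""

# The corrected data: key-value pairs are wrapped in {} instead
fixed = """{
    "maps": [
        {"id": "blabla", "iscategorical": "0"},
        {"id": "blabla", "iscategorical": "0"}
    ],
    "masks": {
        "id": "valore"
    },
    "om_points": "value",
    "parameters": {
        "id": "valore"
    }
}"""

error = None
try:
    json.loads(broken)
except json.JSONDecodeError as exc:
    # lineno, colno and msg describe where and why parsing stopped
    error = exc
    print(f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}")

data = json.loads(fixed)
print(data["masks"])
```

Catching json.JSONDecodeError like this is a quick way to locate the offending line before you start debugging the rest of your script.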

Once you have replaced those brackets with braces and saved the corrected data as data.json, use the following code to load and print it.

import json
from pprint import pprint

with open('data.json') as f:
    data = json.load(f)

pprint(data)

Output:

{'maps': [{'id': 'blabla', 'iscategorical': '0'},
          {'id': 'blabla', 'iscategorical': '0'}],
 'masks': {'id': 'valore'},
 'om_points': 'value',
 'parameters': {'id': 'valore'}}

We have now dealt with our first problem. It is now time to deal with the second question: what if the JSON data is correct, but you have no clue how to import and use it in your script? Let’s find out.

Reading a JSON File

Method 1: Using json.load()

Consider that we have the following JSON file in our project folder –

{
   "firstName": "Joe",
   "lastName": "Jackson",
   "gender": "male",
   "age": 28,
   "address": {
       "streetAddress": "101",
       "city": "San Diego",
       "state": "CA"
   },
   "phoneNumbers": [
       { "type": "home", "number": "7349282382" }
   ]
}

Approach: We can use Python’s built-in json module, which encodes and decodes JSON data. We will open the file with the built-in open() function and then load its contents into a variable using the json.load() method. json.load() returns ordinary Python objects (here, a dictionary), so once the data is loaded we can extract the required information using standard indexing and loops.

Note – Please follow the comments in the given snippet to understand how the data has been parsed.

Code:

import json

with open('data.json') as f:
    data = json.load(f)
# viewing the extracted JSON data
print(data)
print()
# Extracting the first and lastname fields from data
print('Name: ', data['firstName']+" "+data['lastName'])
# Extracting the address from data['address'] fields
for value in data['address']:
    print(value, ":", data['address'][value])
# Extracting the phone number field from data['phoneNumbers']
for num in data['phoneNumbers']:
    print("Phone Number:", num['number'])

Output:

{'firstName': 'Joe', 'lastName': 'Jackson', 'gender': 'male', 'age': 28, 'address': {'streetAddress': '101', 'city': 'San Diego', 'state': 'CA'}, 'phoneNumbers': [{'type': 'home', 'number': '7349282382'}]}

Name:  Joe Jackson
streetAddress : 101
city : San Diego
state : CA
Phone Number: 7349282382
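
The same extraction can also be sketched with nested indexing, dict.get() and a list comprehension. This is only an illustrative variant: it writes the sample data to a temporary file so that it runs on its own, and the "middleName" key is a hypothetical field that is absent from the data, used to show how dict.get() supplies a default instead of raising KeyError.

```python
import json
import tempfile
from pathlib import Path

sample = {
    "firstName": "Joe",
    "lastName": "Jackson",
    "address": {"streetAddress": "101", "city": "San Diego", "state": "CA"},
    "phoneNumbers": [{"type": "home", "number": "7349282382"}],
}

# Write the sample to a temporary data.json so the example is self-contained
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

# Nested objects are nested dictionaries, so indexing can be chained
city = data["address"]["city"]
# "middleName" is a hypothetical key; get() returns the default when it is missing
middle_name = data.get("middleName", "n/a")
# JSON arrays become lists, so a list comprehension collects every number
numbers = [entry["number"] for entry in data["phoneNumbers"]]

print(city, middle_name, numbers)
```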

Method 2: Using json.loads()

Well, we had to load a file in the above case. What if the JSON data is embedded in the script itself as a string? How do you parse it to get the necessary information? Let’s find out.

We will first have a look at the snippet and then go through the explanation to understand what’s happening in it.

import json

data = """{
   "firstName": "Joe",
   "lastName": "Jackson",
   "gender": "male",
   "age": 28,
   "address": {
       "streetAddress": "101",
       "city": "San Diego",
       "state": "CA"
   },
   "phoneNumbers": [
       { "type": "home", "number": "7349282382" }
   ]
}"""
# converting JSON string to Python Object
data_obj = json.loads(data)
# viewing the extracted JSON data
print(data_obj)
print()
# Extracting the first and last name fields from data_obj
print('Name: ', data_obj['firstName']+" "+data_obj['lastName'])
# Extracting the address from the data_obj['address'] fields
for value in data_obj['address']:
    print(value, ":", data_obj['address'][value])
# Extracting the phone number field from data_obj['phoneNumbers']
for num in data_obj['phoneNumbers']:
    print("Phone Number:", num['number'])

Output:

{'firstName': 'Joe', 'lastName': 'Jackson', 'gender': 'male', 'age': 28, 'address': {'streetAddress': '101', 'city': 'San Diego', 'state': 'CA'}, 'phoneNumbers': [{'type': 'home', 'number': '7349282382'}]}

Name:  Joe Jackson
streetAddress : 101
city : San Diego
state : CA
Phone Number: 7349282382

Explanation: json.loads() takes a JSON string (rather than a file object) and converts it into the corresponding Python object, here a dictionary, from which the required data can then be extracted.
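
To make the conversion concrete, the short sketch below prints the Python type that json.loads() produces for each kind of JSON value. The sample string and its keys are made up for illustration.

```python
import json

# A made-up JSON string containing one value of each common JSON type
text = '{"name": "Joe", "age": 28, "height": 1.8, "active": true, "nickname": null, "tags": ["a", "b"], "address": {"city": "San Diego"}}'

obj = json.loads(text)

# Map every key to the name of the Python type its value was converted to
types = {key: type(value).__name__ for key, value in obj.items()}
for key, type_name in types.items():
    print(key, "->", type_name)

# json.dumps() goes the other way, from a Python object back to a JSON string
round_trip = json.loads(json.dumps(obj))
```

JSON objects become dictionaries, arrays become lists, true/false become True/False, and null becomes None, which is why the parsed data can be handled with ordinary Python code.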

Method 3: Using urllib and json

One way to get JSON data from a given URL is to combine two standard-library modules: urllib and json. We have already used the json module above. Now we will use urllib to fetch the data from a URL.

import json
import urllib.request

my_url = 'https://gorest.co.in/public/v2/users'
with urllib.request.urlopen(my_url) as url:
    # read the response body, decode the bytes and parse the JSON text
    data = json.loads(url.read().decode())
    # printing only the first dictionary
    print(data[0])
print()

# displaying only the first five records
for d in data[:5]:
    print(d['id'], " ", d['name'])

Output:

{'id': 2706, 'name': 'Ahalya Devar', 'email': 'ahalya_devar@jacobi.info', 'gender': 'male', 'status': 'inactive'}

2706   Ahalya Devar
2700   Chandini Malik II
2699   Atmananda Guha
2696   Deepan Iyengar
2694   Anshula Sinha

Explanation: We imported the modules urllib.request and json. We then sent a request and opened a connection to the server inside a with statement, which closes the connection automatically. The response body arrives as bytes, so we decoded it to a string, and json.loads() then converted that JSON string into a Python object (here, a list of dictionaries).
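
Network requests can fail, and a server may return something that is not JSON. The sketch below wraps the same steps in a helper that handles both cases. The function name fetch_json and the 10-second default timeout are assumptions made for this example, not part of urllib.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Return the parsed JSON found at url, or None if fetching or parsing fails.

    fetch_json is a hypothetical helper written for this example.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except json.JSONDecodeError as exc:
        print("Response was not valid JSON:", exc)
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all subclasses of OSError
        print("Request failed:", exc)
    return None


# Example usage (requires network access):
# users = fetch_json('https://gorest.co.in/public/v2/users')
# if users is not None:
#     for user in users[:5]:
#         print(user['id'], user['name'])
```

Returning None on failure keeps the calling code simple. Depending on your application, re-raising the exception may be the better choice.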

Method 4: Read JSON with Pandas

Extracting JSON data from a given URL can be a cakewalk if you use the Pandas library. Use the pandas.read_json(url) function, which converts the JSON data into a pandas DataFrame that can then be used for further processing.

Example:

import pandas as pd

my_url = 'https://gorest.co.in/public/v2/users'
# reading the JSON data from the URL and converting the json to dataframe
data = pd.read_json(my_url)
print()
# extracting the first 5 names from the dataframe
print(data['name'].head())

Output:

0         Ahalya Devar
1    Chandini Malik II
2       Atmananda Guha
3       Deepan Iyengar
4        Anshula Sinha
Name: name, dtype: object
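
pandas.read_json() also accepts a file-like object, so you can try it without network access. The sketch below assumes pandas is installed and uses a few made-up records shaped like the API response above. The names and values are purely illustrative.

```python
import io

import pandas as pd

# Made-up records shaped like the API response (illustrative data only)
json_text = """[
    {"id": 1, "name": "Asha Rao", "status": "active"},
    {"id": 2, "name": "Ben Ortiz", "status": "inactive"},
    {"id": 3, "name": "Chen Li", "status": "active"}
]"""

# Wrap the string in StringIO so read_json treats it as a file-like object
df = pd.read_json(io.StringIO(json_text))
print(df.shape)

# Each JSON key becomes a column, so normal DataFrame filtering applies
active_names = df.loc[df['status'] == 'active', 'name'].tolist()
print(active_names)
```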

Conclusion

We have come to the end of this discussion and we have learned numerous ways of parsing JSON data in Python. We also saw the correct format of JSON data that can be properly read by Python.

Please subscribe and stay tuned for more interesting solutions and discussions.

