Problem Formulation
Do you have JSON data that you need to parse using a Python script? Let’s have a look at this JSON data –
{
    "maps": [
        {
            "id": "blabla",
            "iscategorical": "0"
        },
        {
            "id": "blabla",
            "iscategorical": "0"
        }
    ],
    "masks": [
        "id": "valore"
    ],
    "om_points": "value",
    "parameters": [
        "id": "valore"
    ]
}
But when you try to parse this file in your script, you get an exception. Frustrating, isn’t it? Don’t worry. Most likely your script is fine and the error is in the JSON data itself.
So, in this tutorial we will be solving two problems –
- Why can’t Python parse this JSON data? [the one shown above]
- How to parse JSON data in Python?
Let’s answer the questions one by one. Please follow along to unearth the answers.
Why can’t Python parse the JSON data?
The error is probably not in your script. The JSON data itself is malformed: the values of "masks" and "parameters" are wrapped in square brackets [] even though they contain key-value pairs, so they should be wrapped in braces {} instead.
NOTE:
[] are used to denote JSON arrays.
{} are used to denote JSON objects.
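To confirm that the data, not the script, is at fault, it can help to ask the json module where parsing failed. The following is a minimal sketch, assuming a trimmed-down copy of the malformed data held in a string; the helper name locate_json_error is hypothetical and not part of the original script.

```python
import json

# Trimmed copy of the malformed data: "masks" wraps a key-value pair in [].
bad_json = """{
    "masks": [
        "id": "valore"
    ],
    "om_points": "value"
}"""


def locate_json_error(text):
    """Return (message, line, column) of the first syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.msg, err.lineno, err.colno
    return None


problem = locate_json_error(bad_json)
print(problem)
```

The reported line points at the entry inside the square brackets, which is where the parser meets a colon it cannot accept inside an array.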
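After replacing the square brackets around the "masks" and "parameters" values with braces and saving the file as data.json, you can load it with the following code.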
import json
from pprint import pprint

with open('data.json') as f:
    data = json.load(f)

pprint(data)
Output:
{'maps': [{'id': 'blabla', 'iscategorical': '0'},
          {'id': 'blabla', 'iscategorical': '0'}],
 'masks': {'id': 'valore'},
 'om_points': 'value',
 'parameters': {'id': 'valore'}}
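The output shows that, with the brackets corrected, "masks" and "parameters" are parsed as dictionaries while "maps" remains a list. A short sketch makes this mapping from JSON types to Python types explicit. It assumes the corrected data is held in a string rather than in data.json, and it uses json.loads(), which parses a string and is covered later in this tutorial.

```python
import json

# Corrected data: "masks" and "parameters" now use {} instead of [].
fixed_json = """
{
    "maps": [
        {"id": "blabla", "iscategorical": "0"},
        {"id": "blabla", "iscategorical": "0"}
    ],
    "masks": {"id": "valore"},
    "om_points": "value",
    "parameters": {"id": "valore"}
}
"""

data = json.loads(fixed_json)

# JSON arrays become Python lists, JSON objects become dicts,
# and JSON strings become str.
for key, value in data.items():
    print(key, "->", type(value).__name__)
```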
We have now dealt with our first problem. It is time to deal with the second question. What if the JSON data is correct, but you have no clue how to import and use it in your script? Let’s find out.
Reading a JSON File
Method 1: Using json.load()
Consider that we have the following JSON file in our project folder –
{
    "firstName": "Joe",
    "lastName": "Jackson",
    "gender": "male",
    "age": 28,
    "address": {
        "streetAddress": "101",
        "city": "San Diego",
        "state": "CA"
    },
    "phoneNumbers": [
        {
            "type": "home",
            "number": "7349282382"
        }
    ]
}
Approach: We can use Python's built-in json module, which encodes and decodes JSON data. We open the file with the built-in open() function and then parse its contents into a variable with json.load(). Once the JSON data has been loaded, we extract the required information using standard Python techniques such as dictionary indexing and loops.
Note – Please follow the comments in the given snippet to understand how the data has been parsed.
Code:
import json

with open('data.json') as f:
    data = json.load(f)

# viewing the extracted JSON data
print(data)
print()

# Extracting the first and lastname fields from data
print('Name: ', data['firstName']+" "+data['lastName'])

# Extracting the address from data['address'] fields
for value in data['address']:
    print(value, ":", data['address'][value])

# Extracting the phone number field from data['phoneNumbers']
for num in data['phoneNumbers']:
    print("Phone Number:", num['number'])
Output:
{'firstName': 'Joe', 'lastName': 'Jackson', 'gender': 'male', 'age': 28, 'address': {'streetAddress': '101', 'city': 'San Diego', 'state': 'CA'}, 'phoneNumbers': [{'type': 'home', 'number': '7349282382'}]}
Name: Joe Jackson
streetAddress : 101
city : San Diego
state : CA
Phone Number: 7349282382
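The loop over the address can also be written with dict.items(), and dict.get() avoids a KeyError when a field may be absent. This is only a sketch, assuming the record is already available as a Python dictionary; the "email" key is a hypothetical field that the sample data does not contain.

```python
# A Python dictionary shaped like the parsed data.json above.
data = {
    "firstName": "Joe",
    "lastName": "Jackson",
    "address": {"streetAddress": "101", "city": "San Diego", "state": "CA"},
    "phoneNumbers": [{"type": "home", "number": "7349282382"}],
}

# items() yields each key together with its value.
address_lines = [f"{key} : {value}" for key, value in data["address"].items()]
for line in address_lines:
    print(line)

# get() returns a default instead of raising KeyError for a missing key.
email = data.get("email", "not provided")
print("Email:", email)
```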
Method 2: Using json.loads()
In the previous case we loaded the data from a file. But what if the JSON data is embedded in the script itself as a string? How do you parse it to get the information you need? Let’s find out.
We will first have a look at the snippet and then go through the explanation to understand what’s happening in it.
import json

data = """{
    "firstName": "Joe",
    "lastName": "Jackson",
    "gender": "male",
    "age": 28,
    "address": {
        "streetAddress": "101",
        "city": "San Diego",
        "state": "CA"
    },
    "phoneNumbers": [
        {
            "type": "home",
            "number": "7349282382"
        }
    ]
}"""

# converting JSON string to Python Object
data_obj = json.loads(data)

# viewing the extracted JSON data
print(data_obj)
print()

# Extracting the first and lastname fields from data
print('Name: ', data_obj['firstName']+" "+data_obj['lastName'])

# Extracting the address from data['address'] fields
for value in data_obj['address']:
    print(value, ":", data_obj['address'][value])

# Extracting the phone number field from data['phoneNumbers']
for num in data_obj['phoneNumbers']:
    print("Phone Number:", num['number'])
Output:
{'firstName': 'Joe', 'lastName': 'Jackson', 'gender': 'male', 'age': 28, 'address': {'streetAddress': '101', 'city': 'San Diego', 'state': 'CA'}, 'phoneNumbers': [{'type': 'home', 'number': '7349282382'}]}
Name: Joe Jackson
streetAddress : 101
city : San Diego
state : CA
Phone Number: 7349282382
Explanation: json.loads() is a function that converts a JSON string into a Python object (here, a dictionary), from which the required data can then be extracted.
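The two functions differ only in their input: json.load() reads from a file-like object, while json.loads() takes a string. A minimal sketch of the difference follows. It uses io.StringIO as a stand-in for an opened file, an assumption made here so that the example needs no file on disk.

```python
import io
import json

json_text = '{"firstName": "Joe", "age": 28}'

# json.loads() parses a str directly.
from_string = json.loads(json_text)

# json.load() parses anything with a read() method, such as an open file.
fake_file = io.StringIO(json_text)
from_file = json.load(fake_file)

# Both calls produce the same Python dictionary.
print(from_string == from_file)
print(type(from_string).__name__, type(from_string["age"]).__name__)
```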
Method 3: Using urllib and json
One way to get JSON data from a given URL is to combine two standard-library modules, urllib and json. We have already used the json module above. Now we will use urllib to pull the data from a URL.
import json
import urllib.request

my_url = 'https://gorest.co.in/public/v2/users'

with urllib.request.urlopen(my_url) as url:
    data = json.loads(url.read().decode())

# printing only the first dictionary
print(data[0])
print()

# displaying only the first five records
for d in data[:5]:
    print(d['id'], " ", d['name'])
Output:
{'id': 2706, 'name': 'Ahalya Devar', 'email': 'ahalya_devar@jacobi.info', 'gender': 'male', 'status': 'inactive'}
2706 Ahalya Devar
2700 Chandini Malik II
2699 Atmananda Guha
2696 Deepan Iyengar
2694 Anshula Sinha
Explanation: We imported the modules urllib.request and json. We then sent a request and opened a connection to the server inside a with statement, which closes the connection automatically when the block ends. The json.loads() function then parsed the decoded response text and converted it into a Python object.
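A network request can hang, and a server may send text in an encoding other than UTF-8, so a slightly more defensive variant may be useful. The sketch below is not part of the original example: the function name fetch_json and the 10-second default timeout are assumptions chosen for illustration.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Download url and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Use the charset declared by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return json.loads(response.read().decode(charset))


# Example call (needs network access, so it is left commented out):
# users = fetch_json("https://gorest.co.in/public/v2/users")
# print(users[0]["name"])
```

Wrapping the request in a function keeps the timeout and decoding logic in one place, so the rest of the script only deals with the parsed Python object.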
Method 4: Read JSON with Pandas
Extracting JSON data from a given URL can be a cakewalk if you use the Pandas library. Call the pandas.read_json() function with the URL as its argument. It converts the JSON data into a pandas DataFrame, which can then be used for further processing.
Example:
import pandas as pd

my_url = 'https://gorest.co.in/public/v2/users'

# reading the JSON data from the URL and converting the json to dataframe
data = pd.read_json(my_url)
print()

# extracting the first 5 names from the dataframe
print(data['name'].head())
Output:
0 Ahalya Devar
1 Chandini Malik II
2 Atmananda Guha
3 Deepan Iyengar
4 Anshula Sinha
Name: name, dtype: object
Conclusion
We have come to the end of this discussion, having learned several ways of parsing JSON data in Python. We also saw what correctly formatted JSON data looks like, so that Python can read it.
Here are some of the highly recommended and related articles that you should consider reading:
- How to Parse JSON in a Python One-Liner?
- How To Read A JSON File With Python
- How to Get JSON from URL in Python?
- Reading and Writing JSON with Pandas