How to Convert JSON to Pandas DataFrame

5/5 - (1 vote)

Problem Formulation and Solution Overview

In this article, you’ll learn how to read a JSON string and convert it to a Pandas DataFrame in Python.

To make it more fun, we have the following running scenario:

Antoine, a Curator from the Smithsonian Museum, is taking their Egyptian Collection on the road. Antoine has received a Schedule as a JSON file. He needs your assistance to view the contents as a DataFrame.

💬 Question: How would we write Python code to read a JSON string into a DataFrame?

We can accomplish this task by one of the following options:


Before our code executes successfully, one (1) new library will require installation.

  • The Pandas library enables access to/from a DataFrame.

To install this library, navigate to an IDE terminal. At the command prompt ($), execute the code below. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

If the installation was successful, a message displays in the terminal indicating the same.


Feel free to view the PyCharm installation guide for the required library.

Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

import pandas as pd 
import json

Method 1: Use read_json() to convert a JSON string to a DataFrame

This example reads in a List of Dictionaries using Panda’s
read_json() function. It accepts a JSON string and converts it to a DataFrame.

event_data = '''
[   { "ID": "1", "Museum": "Royal Ontario Museum",     
      "Country": "Canada", "Month": "June"},
    { "ID": "2", "Museum": "The Louvre",
      "Country": "France", "Month": "July" },
    { "ID": "2", "Museum": "German Historical Museum", 
      "Country": "Germany", "Month": "Aug." },
    { "ID": "4", "Museum": "Amsterdam Museum",
      "Country": "Netherlands", "Month": "Sept." }]]
'''
df = pd.read_json(event_data, orient='columns')
print(df)
How To Read A JSON File With Python?

This code defines a JSON string, then passes it to read_json() as an argument. Then, orient is set to columns to let the function know what data type to expect. Finally, the output from the resultant DataFrame df is sent to the terminal.

Output

IDMuseum Country Month
01Royal Ontario MuseumCanada June
12The LouvreFrance July
23German Historical MuseumGermany Aug.
34Amsterdam MuseumNetherlands Sept.

💡 Note: The orient argument has several available options. We suggest you take the time to familiarize yourself.


Method 2: Use json_normalize() and json.loads() to convert a JSON string to a DataFrame

This example reads in a List of Dictionaries using the json.loads() and json_normalize() approach.

event_data = '''
[   { "ID": "1", "Museum": "Royal Ontario Museum",     
      "Country": "Canada", "Month": "June"},
    { "ID": "2", "Museum": "The Louvre",
      "Country": "France", "Month": "July" },
    { "ID": "2", "Museum": "German Historical Museum", 
      "Country": "Germany", "Month": "Aug." },
    { "ID": "4", "Museum": "Amsterdam Museum",
      "Country": "Netherlands", "Month": "Sept." }]
'''
df = pd.json_normalize(json.loads(event_data)) 
print(df)

This code defines a JSON string, then passes it to read_json() as an argument. Then, json_normalize() takes this data, flattens it, and converts it to a DataFrame. Finally, the output from the resultant DataFrame df is sent to the terminal.

Output

IDMuseum Country Month
01Royal Ontario MuseumCanada June
12The LouvreFrance July
23German Historical MuseumGermany Aug.
34Amsterdam MuseumNetherlands Sept.

Method 3: Use read_json() to convert to a DataFrame

This example uses read_json() to read in a JSON file consisting of a Dictionary of Dictionaries.

File Contents (schedule.json)

{"ID":{"1":"1","2":"2","3":"3","4":"4"},"Museum":{"1":"Royal Ontario Museum","2":"The Louvre","3":"German Historical Museum","4":"Amsterdam Museum"},"Place":{"1":"Canada","2":"France","3":"Germany","4":"Netherlands"}, "Date": {"1": "Jun", "2": "Jul", "3": "Aug", "4": "Sep"}}
df = pd.read_json('schedule.json')
print(df)

This code reads in a Dictionary of Dictionaries from a JSON file saved in the current working directory. The output is converted to a DataFrame and output to the terminal.

Output

IDMuseum Country Month
01Royal Ontario MuseumCanada June
12The LouvreFrance July
23German Historical MuseumGermany Aug.
34Amsterdam MuseumNetherlands Sept.

Summary

These three (3) methods to convert JSON to a DataFrame should give you enough information to select the best one for your coding requirements.

Good Luck & Happy Coding!