Problem Formulation and Solution Overview
Antoine, a Curator from the Smithsonian Museum, is taking their Egyptian Collection on the road. Antoine has received a Schedule as a JSON file. He needs your assistance to view the contents as a DataFrame.
π¬ Question: How would we write Python code to read a JSON string into a DataFrame?
We can accomplish this task by one of the following options:
- Method 1: Use
read_json()
to convert a JSON string to a DataFrame - Method 2: Use
json_normalize()
andjson.loads()
to convert a JSON string to a DataFrame - Method 3: Use
read_json()
to convert JSON file to a DataFrame
- The Pandas library enables access to/from a DataFrame.
To install this library, navigate to an IDE terminal. At the command prompt ($
), execute the code below. For the terminal used in this example, the command prompt is a dollar sign ($
). Your terminal prompt may be different.
$ pip install pandas
Hit the <Enter>
key on the keyboard to start the installation process.
If the installation was successful, a message displays in the terminal indicating the same.
Feel free to view the PyCharm installation guide for the required library.
Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.
import pandas as pd import json
Method 1: Use read_json() to convert a JSON string to a DataFrame
This example reads in a List of Dictionaries using Panda’s read_json()
function. It accepts a JSON string and converts it to a DataFrame.
event_data = ''' [ { "ID": "1", "Museum": "Royal Ontario Museum", "Country": "Canada", "Month": "June"}, { "ID": "2", "Museum": "The Louvre", "Country": "France", "Month": "July" }, { "ID": "2", "Museum": "German Historical Museum", "Country": "Germany", "Month": "Aug." }, { "ID": "4", "Museum": "Amsterdam Museum", "Country": "Netherlands", "Month": "Sept." }]] ''' df = pd.read_json(event_data, orient='columns') print(df)
This code defines a JSON string, then passes it to read_json()
as an argument. Then, orient
is set to columns
to let the function know what data type to expect. Finally, the output from the resultant DataFrame df
is sent to the terminal.
Output
ID | Museum | Country | Month | |
0 | 1 | Royal Ontario Museum | Canada | June |
1 | 2 | The Louvre | France | July |
2 | 3 | German Historical Museum | Germany | Aug. |
3 | 4 | Amsterdam Museum | Netherlands | Sept. |
π‘ Note: The orient
argument has several available options. We suggest you take the time to familiarize yourself.
Method 2: Use json_normalize() and json.loads() to convert a JSON string to a DataFrame
This example reads in a List of Dictionaries using the json.loads()
and json_normalize()
approach.
event_data = ''' [ { "ID": "1", "Museum": "Royal Ontario Museum", "Country": "Canada", "Month": "June"}, { "ID": "2", "Museum": "The Louvre", "Country": "France", "Month": "July" }, { "ID": "2", "Museum": "German Historical Museum", "Country": "Germany", "Month": "Aug." }, { "ID": "4", "Museum": "Amsterdam Museum", "Country": "Netherlands", "Month": "Sept." }] ''' df = pd.json_normalize(json.loads(event_data)) print(df)
This code defines a JSON string, then passes it to read_json()
as an argument. Then, json_normalize()
takes this data, flattens it, and converts it to a DataFrame. Finally, the output from the resultant DataFrame df
is sent to the terminal.
Output
ID | Museum | Country | Month | |
0 | 1 | Royal Ontario Museum | Canada | June |
1 | 2 | The Louvre | France | July |
2 | 3 | German Historical Museum | Germany | Aug. |
3 | 4 | Amsterdam Museum | Netherlands | Sept. |
Method 3: Use read_json() to convert to a DataFrame
This example uses read_json()
to read in a JSON file consisting of a Dictionary of Dictionaries.
File Contents (schedule.json
)
{"ID":{"1":"1","2":"2","3":"3","4":"4"},"Museum":{"1":"Royal Ontario Museum","2":"The Louvre","3":"German Historical Museum","4":"Amsterdam Museum"},"Place":{"1":"Canada","2":"France","3":"Germany","4":"Netherlands"}, "Date": {"1": "Jun", "2": "Jul", "3": "Aug", "4": "Sep"}}
df = pd.read_json('schedule.json') print(df)
This code reads in a Dictionary of Dictionaries from a JSON file saved in the current working directory. The output is converted to a DataFrame and output to the terminal.
Output
ID | Museum | Country | Month | |
0 | 1 | Royal Ontario Museum | Canada | June |
1 | 2 | The Louvre | France | July |
2 | 3 | German Historical Museum | Germany | Aug. |
3 | 4 | Amsterdam Museum | Netherlands | Sept. |
Summary
These three (3) methods to convert JSON to a DataFrame should give you enough information to select the best one for your coding requirements.
Good Luck & Happy Coding!