Problem Formulation and Solution Overview
Jeff Noble, a Marine Archeologist, has put together a team to search for shipwrecks each month. This month, they will search for the Wreck of the Edmund Fitzgerald: a Great Lakes freighter lost in 1975. The details for this shipwreck were provided to him in the following formats:
- Text File
- JSON File
- Pickle File
Jeff would like you to write a script that reads in one of the above file types and outputs the contents to the terminal in columns.
File Contents (saved as ef.txt, ef.json, and ef.pickle)
{"Name": "SS Edmund Fitzgerald", "Nickname": "Titanic of the Great Lakes", "Ordered": "February 1, 1957", "Owner": "Northwest Mutual", ...
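The article does not show how the three files were produced. The following is a minimal sketch of one way to create them, assuming the dictionary holds only the fields visible above (the real file may hold more). The helper name `write_sample_files` and the temporary folder are illustrative assumptions, not part of the original script.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Only the fields visible in this article; the real file may hold more.
details = {
    "Name": "SS Edmund Fitzgerald",
    "Nickname": "Titanic of the Great Lakes",
    "Ordered": "February 1, 1957",
    "Owner": "Northwest Mutual",
}

def write_sample_files(folder):
    """Write the same dictionary as text, JSON, and pickle files (hypothetical helper)."""
    folder = Path(folder)
    # Text file: the dictionary's repr(), which ast.literal_eval() can parse.
    (folder / "ef.txt").write_text(repr(details), encoding="utf-8")
    # JSON file: a portable text format.
    with open(folder / "ef.json", "w", encoding="utf-8") as fp:
        json.dump(details, fp)
    # Pickle file: binary, so it is opened in 'wb' mode.
    with open(folder / "ef.pickle", "wb") as fp:
        pickle.dump(details, fp)
    return sorted(p.name for p in folder.iterdir())

# A temporary folder is used here (an assumption) to avoid touching real files.
with tempfile.TemporaryDirectory() as tmp:
    created = write_sample_files(tmp)
    print(created)
```

To keep the files, pass a real folder such as `"."` instead of the temporary one.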
💬 Question: How would we read in a Dictionary file and determine the best method?
We can accomplish this task by one of the following options:
- Method 1: Use ast.literal_eval() to read in a text file.
- Method 2: Use json.load() to read in a JSON file.
- Method 3: Use pickle.load() to read in a pickle file.
- Method 4: Use json.loads() to read to a DataFrame.
- The Pandas library enables access to/from a DataFrame.
To install this library, navigate to an IDE terminal. At the command prompt ($), execute the code below. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.
$ pip install pandas
Hit the <Enter> key on the keyboard to start the installation process.
If the installation was successful, a message displays in the terminal indicating the same.
Feel free to view the PyCharm installation guide for the required libraries.
Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.
import pandas
from pandas import json_normalize
import ast
import json
import pickle
💡 Note: The additional libraries indicated above do not require installation as they come built-in to Python.
Method 1: Use ast.literal_eval()
The ast.literal_eval() function parses a string containing a Python literal, such as a Dictionary, and returns the corresponding object without executing any code.
with open('ef.txt', 'r') as fp:
    data = fp.read()
details = ast.literal_eval(data)
for key, value in details.items():
    print("{:<20} {:<10} ".format(key, value))
- The existing text file is opened, and a file pointer object fp is created.
  - The contents of the file are read into data.
  - Then ast.literal_eval() reads the data variable and saves the result to details.
- A for loop navigates through the dictionary keys and values.
  - Each loop formats the output into columns and outputs to the terminal.
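💡 Note: In the past, there was talk that this function caused security risks. Unlike eval(), ast.literal_eval() only parses literals and never executes code, so that concern does not apply here. Be aware that very large or deeply nested input can still exhaust memory or the stack.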
Output (snippet)
Name | SS Edmund Fitzgerald |
Nickname | Titanic of the Great Lakes |
Ordered | February 1, 1957 |
Owner | Northwest Mutual |
Captain | Ernest M. McSorley |
Type | Great Lakes Freighter |
Cargo | Iron Ore |
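To illustrate Method 1 without needing a file on disk, here is a minimal sketch. The shortened string stands in for the contents of ef.txt, and the rejected expression is a made-up example of input that is not a plain literal.

```python
import ast

# A stand-in for the contents of ef.txt (shortened for illustration).
data = "{'Name': 'SS Edmund Fitzgerald', 'Cargo': 'Iron Ore'}"
details = ast.literal_eval(data)
print(type(details).__name__)

# Anything that is not a plain literal is rejected rather than executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)

# '{:<20}' left-aligns the key and pads it to 20 characters;
# '{:<10}' does the same for the value with a width of 10.
row = "{:<20} {:<10}".format("Cargo", details["Cargo"])
print(repr(row))
```

The widths 20 and 10 are minimums: a longer value is printed in full rather than truncated.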
Method 2: Use JSON
The JSON file structure is based on key:value pairs, just like a Python Dictionary. This format is commonly used to transmit data and is favored because many programming languages can read it. Its low overhead gives this option the thumbs up!
with open('ef.json', 'r') as fp:
    details = json.load(fp)
for key, value in details.items():
    print("{:<20} {:<10} ".format(key, value))
- The existing JSON file is opened, and a file pointer object fp is created.
  - The contents of the file are read using json.load() and saved to details.
- A for loop navigates through the Dictionary keys and values.
  - Each loop formats the output into columns and outputs to the terminal.
Output (snippet)
Name | SS Edmund Fitzgerald |
Nickname | Titanic of the Great Lakes |
Ordered | February 1, 1957 |
Owner | Northwest Mutual |
Captain | Ernest M. McSorley |
Type | Great Lakes Freighter |
Cargo | Iron Ore |
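The difference between json.load() (Method 2) and json.loads() (used in Method 4) is easy to mix up. A minimal sketch, using an in-memory io.StringIO object as a stand-in for ef.json and a shortened sample string:

```python
import io
import json

text = '{"Name": "SS Edmund Fitzgerald", "Cargo": "Iron Ore"}'

# json.load() reads from a file-like object; io.StringIO stands in for ef.json.
from_file = json.load(io.StringIO(text))

# json.loads() parses a string that is already in memory.
from_string = json.loads(text)

print(from_file == from_string)

# Unlike a Python dict literal, JSON requires double-quoted strings.
try:
    json.loads("{'Name': 'SS Edmund Fitzgerald'}")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
print(single_quotes_ok)
```

This is also why a text file written with single quotes needs ast.literal_eval() rather than the json module.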
Method 3: Use Pickle
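A good place to use a Pickle file is when you need to keep a program's state across sessions or store Python objects that JSON cannot represent. The data is serialized and stored as a binary file. Pickle does not encrypt the data, and loading a pickle file can execute arbitrary code, so only load files from a source you trust.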
with open('ef.pickle', 'rb') as fp:
    details = pickle.load(fp)
for key, value in details.items():
    print("{:<20} {:<10} ".format(key, value))
- The existing Pickle file is opened in binary mode, and a file pointer object fp is created.
  - The contents of the file are read using pickle.load() and saved to details.
- A for loop navigates through the Dictionary keys and values.
  - Each loop formats the output into columns and outputs to the terminal.
Output (snippet)
Name | SS Edmund Fitzgerald |
Nickname | Titanic of the Great Lakes |
Ordered | February 1, 1957 |
Owner | Northwest Mutual |
Captain | Ernest M. McSorley |
Type | Great Lakes Freighter |
Cargo | Iron Ore |
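A minimal round-trip sketch of Method 3, using io.BytesIO as a stand-in for a file opened in binary mode. The `tags` set is a made-up example to show a Python type that pickle handles but JSON does not.

```python
import io
import pickle

details = {"Name": "SS Edmund Fitzgerald", "Cargo": "Iron Ore"}

# io.BytesIO stands in for a file opened in binary ('wb' / 'rb') mode.
buffer = io.BytesIO()
pickle.dump(details, buffer)
buffer.seek(0)  # rewind before reading back
restored = pickle.load(buffer)
print(restored == details)

# Pickle can store Python-specific types that JSON cannot, such as a set.
tags = {"freighter", "Great Lakes"}  # hypothetical data for illustration
restored_tags = pickle.loads(pickle.dumps(tags))
print(restored_tags == tags)
```

As with pickle.load(), pickle.loads() should only be given data from a trusted source.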
Method 4: Read to a DataFrame
If you prefer working with DataFrames, converting the file to a DataFrame may be ideal. With no additional formatting required, the output is set to display, by default, in a column format.
with open('ef.json', 'rb') as fp:
    details = fp.read()
df = json_normalize(json.loads(details)).T
print(df)
- The existing JSON file is opened, and a file pointer object fp is created.
  - The file contents are read in and saved to details.
- The json.loads() function parses details into a Dictionary.
- The data is normalized into a DataFrame and transposed with .T, so df has one row per key (portrait orientation).
- The output is sent to the terminal.
Output (snippet)
Name | SS Edmund Fitzgerald |
Nickname | Titanic of the Great Lakes |
Ordered | February 1, 1957 |
Owner | Northwest Mutual |
Captain | Ernest M. McSorley |
Type | Great Lakes Freighter |
Cargo | Iron Ore |
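To see what the transpose in Method 4 changes, here is a minimal sketch that compares the DataFrame shape before and after `.T`. It assumes pandas 1.0 or later is installed and uses a shortened sample string in place of ef.json.

```python
import json
from pandas import json_normalize

text = '{"Name": "SS Edmund Fitzgerald", "Cargo": "Iron Ore"}'

wide = json_normalize(json.loads(text))  # one row, one column per key
tall = wide.T                            # one row per key, a single column

print(wide.shape)
print(tall.shape)
print(list(tall.index))
```

Without `.T`, every key would become its own column and the output would spread across the terminal instead of down it.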
Summary
After reviewing the above methods, we selected Method 2 (JSON) as the best option. JSON is a compact, language-independent text format, so it keeps both the number of bytes to transmit and the processing time low.
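Since Jeff asked for a script that accepts any one of the three file types, the methods above could be combined as follows. This is only a sketch under stated assumptions: the function names `load_details` and `print_columns` are hypothetical, the loader is chosen purely from the file extension, and the demo writes a shortened sample to a temporary folder.

```python
import ast
import json
import pickle
import tempfile
from pathlib import Path

def load_details(path):
    """Return the dictionary stored in a .txt, .json, or .pickle file."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        return ast.literal_eval(path.read_text(encoding="utf-8"))
    if suffix == ".json":
        with open(path, "r", encoding="utf-8") as fp:
            return json.load(fp)
    if suffix == ".pickle":
        with open(path, "rb") as fp:
            return pickle.load(fp)  # only for files you trust
    raise ValueError(f"Unsupported file type: {suffix}")

def print_columns(details):
    """Print each key and value as two left-aligned columns."""
    for key, value in details.items():
        print("{:<20} {:<10}".format(key, value))

# Demo with a shortened sample written to a temporary folder (an assumption).
sample = {"Name": "SS Edmund Fitzgerald", "Cargo": "Iron Ore"}
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "ef.json"
    json_path.write_text(json.dumps(sample), encoding="utf-8")
    loaded = load_details(json_path)
    print_columns(loaded)
```

With the real files in place, `print_columns(load_details("ef.json"))` would cover the whole task in one line.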
Problem Solved! Happy Coding!