Python Read Text File into List of Dictionaries

When working with text files in Python, it is quite common to encounter a file that contains a list of dictionaries stored in a format that is nearly identical to JSON, but not quite. This usually happens when the data was written with Python’s str() function rather than with a proper serialization method.

In this article, we’ll look at multiple ways you can read such data back into a Python list of dictionaries.

Problem Formulation

💡 Problem Formulation: Given a text file with content that represents a list of dictionaries – albeit not in strict JSON format – how can we read the content of this file and convert it back to actual Python data structures? The challenge is the safe conversion of this string representation back to usable Python types without executing potentially unsafe code.

Let’s say there’s a text file named file.txt that contains the following content:

[{'name': 'Alice', 'age': 30, 'city': 'New York'}, {'name': 'Bob', 'age': 25, 'city': 'Los Angeles'}]

This file is meant to hold a list of dictionaries, each representing a person with their name, age, and city. The problem is to read this file back into Python in such a way that you regain a list of dictionaries that you can work with programmatically.
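To make the starting point concrete, here is a minimal sketch of how such a file typically comes into existence. The file name people.txt and the temporary directory are assumptions for illustration only; the point is that writing str() of a list stores Python’s repr, with single quotes, rather than JSON.

```python
import os
import tempfile

people = [
    {'name': 'Alice', 'age': 30, 'city': 'New York'},
    {'name': 'Bob', 'age': 25, 'city': 'Los Angeles'},
]

# Hypothetical path for this demo; a temp directory keeps it self-contained.
path = os.path.join(tempfile.mkdtemp(), 'people.txt')

# str(people) produces the Python repr of the list, not JSON.
with open(path, 'w', encoding='utf-8') as f:
    f.write(str(people))

# Reading the file back yields one plain string, not a list.
with open(path, encoding='utf-8') as f:
    raw = f.read()

print(type(raw).__name__)  # str
print(raw)
```

Everything that follows is about turning a string like raw back into a real list of dictionaries.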

Method 1: Using ast.literal_eval

ast.literal_eval safely evaluates a string containing a Python literal expression and returns the corresponding Python object. It accepts only literal structures (strings, bytes, numbers, tuples, lists, dictionaries, sets, booleans, and None) and raises an exception such as ValueError or SyntaxError for anything else, so function calls and other executable code are rejected rather than run.

import ast

with open('file.txt') as f:
    data = ast.literal_eval(f.read())

In the code snippet above, we open the file 'file.txt' for reading, use the read() method to return its content as a string, then pass this string to ast.literal_eval. The result, data, is the Python data structure equivalent of the string representation within the file.
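In practice the file may be malformed or may hold something other than a list of dictionaries. The following sketch wraps ast.literal_eval in a small helper that checks both cases. The name parse_records is hypothetical, and the inline string stands in for the file content.

```python
import ast


def parse_records(text):
    """Parse text into a list of dicts, or raise ValueError."""
    try:
        data = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        # literal_eval rejects non-literals and malformed input.
        raise ValueError(f"not a valid Python literal: {exc}") from exc
    # A valid literal is not necessarily the shape we expect.
    if not isinstance(data, list) or not all(isinstance(d, dict) for d in data):
        raise ValueError("expected a list of dictionaries")
    return data


text = "[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]"
records = parse_records(text)
print(records[0]['name'])  # Alice
```

Checking the shape right after parsing means later code can index into the records without defensive type checks.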

Method 2: Using json.loads with Replacing

If your data is mostly JSON, with only a few Python-specific modifications (e.g., single quotes instead of double quotes), you might consider string replacement to fix these issues and then use the json module.

import json

with open('file.txt') as f:
    content = f.read()
    corrected_content = content.replace("'", '"')
    data = json.loads(corrected_content)

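The replace() method substitutes double quotes for the single quotes in the file content, which makes simple data JSON-compliant. Then json.loads parses the string into a data structure. Note that this blanket replacement breaks if any value contains an apostrophe (for example, O'Brien), and it does not handle the Python literals True, False, and None, which JSON spells true, false, and null.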
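The quote replacement is fragile, and a short experiment shows where it fails. This sketch uses a made-up record containing an apostrophe plus True and None, then compares the replace-and-json.loads route with ast.literal_eval on the same string.

```python
import ast
import json

# Made-up record: the apostrophe, True, and None are all valid Python
# literals but are not valid JSON once the quotes are swapped.
text = str([{'name': "O'Brien", 'active': True, 'nickname': None}])

naive = text.replace("'", '"')
try:
    json.loads(naive)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

data = ast.literal_eval(text)

print(json_ok)          # False
print(data[0]['name'])  # O'Brien
```

If the file was produced by str(), ast.literal_eval is usually the more robust choice; the json route fits best when the data is known to contain only simple strings and numbers.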

👉 How to Read a Dictionary from a File

Method 3: Using pickle

If the data was originally serialized using Python’s pickle module, use pickle to deserialize it. Keep in mind that a pickle file is binary, not text. Pickle can serialize and deserialize complex Python objects, but unpickling can execute arbitrary code, so never load pickle data from an untrusted source.

import pickle

with open('file.pkl', 'rb') as f:
    data = pickle.load(f)

Here we assume the file was named with a .pkl extension, indicating pickle serialization. The file is opened in binary read mode ('rb') and pickle.load() is used to deserialize the contents directly into a Python object.
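For completeness, here is a sketch of the full round trip, so it is clear what kind of file pickle.load() expects. The file name people.pkl and the temporary directory are assumptions for the demo.

```python
import os
import pickle
import tempfile

people = [
    {'name': 'Alice', 'age': 30, 'city': 'New York'},
    {'name': 'Bob', 'age': 25, 'city': 'Los Angeles'},
]

# Hypothetical path for this demo.
path = os.path.join(tempfile.mkdtemp(), 'people.pkl')

# Pickle files are binary, so both modes need the 'b' flag.
with open(path, 'wb') as f:
    pickle.dump(people, f)

with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored == people)  # True
```

Only load pickle files that you, or a source you fully trust, created.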

👉 How to Serialize a Python Dict into a String and Back?

Method 4: Using eval (Not Recommended)

While using Python’s built-in eval() function can convert a string representation of a list of dictionaries back to Python objects, it is generally discouraged due to security risks. eval will execute any included code, which can be a significant security concern if the data source is not entirely trustworthy.

# WARNING: Only use this method if you completely trust the data source
with open('file.txt') as f:
    data = eval(f.read())

The eval function takes a string and evaluates it as a Python expression. However, this method can be dangerous and should only be used with completely trusted data sources.
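To see the difference in behavior without running anything risky, the sketch below passes a harmless stand-in for hostile input to ast.literal_eval. The payload string is made up for illustration; eval() would execute the call it contains, whereas literal_eval refuses it.

```python
import ast

# Harmless stand-in for hostile file content: a function call, not a literal.
payload = "__import__('os').getcwd()"

try:
    ast.literal_eval(payload)
    rejected = False
except (ValueError, SyntaxError):
    # literal_eval refuses anything that is not a plain literal.
    rejected = True

print(rejected)  # True
```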

👉 Python eval()

Summary/Discussion

Converting the string representation of a list of dictionaries back into actual Python data structures is a common task when dealing with files that were written with str().

The safest and most commonly recommended method is to use ast.literal_eval, though the json module might also be helpful if the data is close to valid JSON.

The pickle module works only for data that was originally serialized with pickle and, like eval, is unsafe if the data source is not trusted.
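If you control the code that writes the file, one option is to convert the legacy content to real JSON once and use the json module from then on. This is a sketch under that assumption; the variable names are illustrative, and the inline string stands in for the file content.

```python
import ast
import json

# Stand-in for the content of a file written with str().
legacy_text = "[{'name': 'Alice', 'age': 30, 'city': 'New York'}]"

# Parse the Python literal safely, then re-serialize it as JSON.
records = ast.literal_eval(legacy_text)
json_text = json.dumps(records)

print(json_text)  # [{"name": "Alice", "age": 30, "city": "New York"}]

# From here on, json.loads reads the data back directly.
print(json.loads(json_text) == records)  # True
```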

👉 Python Read Text File into List of Strings