5 Best Ways to Perform a Deep Copy of a Python Dictionary

πŸ’‘ Problem Formulation: When working with dictionaries in Python, you might encounter situations where you need to make a completely independent copy of an existing dictionary. Deep copying is crucial when you want to manipulate the copied dictionary without altering the original. For example, given a dictionary {'apple': 1, 'banana': {'weight': 30, 'price': 50}}, you need a new dictionary with the same content that can be changed without affecting the original.

Method 1: Using the copy Module’s deepcopy

One robust method for creating a deep copy of a dictionary in Python is by using the deepcopy function from the standard library’s copy module. It creates a new dictionary with recursively copied values from the original, ensuring that complex, nested structures are duplicated rather than referenced.

Here’s an example:

import copy
original_dict = {'apple': 1, 'banana': {'weight': 30, 'price': 50}}
deep_copied_dict = copy.deepcopy(original_dict)

Output:

{'apple': 1, 'banana': {'weight': 30, 'price': 50}}

The deepcopy function has traversed the original dictionary, including the nested dictionary within the ‘banana’ key, and created a completely separate copy that can be modified without affecting the original.

Method 2: Using JSON Serialization/Deserialization

Another method to deep copy a dictionary is by serializing the original dictionary to JSON and then deserializing it back to a Python object. This technique uses the json module and works well for dictionaries containing data types that are JSON serializable.

Here’s an example:

import json
original_dict = {'apple': 1, 'banana': {'weight': 30, 'price': 50}}
json_repr = json.dumps(original_dict)
deep_copied_dict = json.loads(json_repr)

Output:

{'apple': 1, 'banana': {'weight': 30, 'price': 50}}

The serialization and subsequent deserialization create a new dictionary with the same content. However, because JSON supports only a subset of Python data types, this method isn’t suitable for dictionaries containing non-serializable objects.

Method 3: Using a Recursive Function

For those who need more control over how the deep copy is performed, writing a custom recursive function to duplicate the dictionary can be quite useful. This method provides flexibility with how the copy handles various data types and allows for custom behaviors.

Here’s an example:

def deep_copy_dict(d):
    copied_dict = {}
    for k, v in d.items():
        if isinstance(v, dict):
            copied_dict[k] = deep_copy_dict(v)
        else:
            copied_dict[k] = v
    return copied_dict

original_dict = {'apple': 1, 'banana': {'weight': 30, 'price': 50}}
deep_copied_dict = deep_copy_dict(original_dict)

Output:

{'apple': 1, 'banana': {'weight': 30, 'price': 50}}

The recursive function deep_copy_dict checks whether the value is a dictionary and copies it accordingly by calling itself, otherwise it simply copies the value. This provides a deep copy but requires manual handling of each potential data type within the dictionary.

Method 4: Using a Comprehension and the copy Method

You can also deep copy a dictionary by combining dictionary comprehensions with the copy method that dictionaries have. While this method is straightforward, it is only suitable for dictionaries one level deep, as nested dictionaries will not be truly deep copied.

Here’s an example:

original_dict = {'apple': 1, 'banana': {'weight': 30, 'price': 50}}
shallow_copied_dict = {k: v.copy() if isinstance(v, dict) else v for k, v in original_dict.items()}

Output:

{'apple': 1, 'banana': {'weight': 30, 'price': 50}}

This code creates a shallow copy of the dictionary where the top level is copied but any nested dictionaries are only shallow copied. This means that if you modify a nested dictionary, it will change in both the original and the copied dictionaries.

Bonus One-Liner Method 5: Using a Pickle Roundtrip

Finally, a less conventional but still effective one-liner method involves using the pickle module. By serializing the dictionary to a pickle byte stream and then immediately deserializing it, you can create a deep copy. Note that this method can handle many Python-specific data types that JSON cannot.

Here’s an example:

import pickle
original_dict = {'apple': 1, 'banana': {'weight': 30, 'price': 50}}
deep_copied_dict = pickle.loads(pickle.dumps(original_dict))

Output:

{'apple': 1, 'banana': {'weight': 30, 'price': 50}}

The pickle module’s dumps and loads functions have been used back-to-back to serialize the original dictionary to a byte stream, then deserialize that stream to a new object, thus creating a deep copy.

Summary/Discussion

  • Method 1: copy.deepcopy. Strengths: Simple, reliable, and handles complex nested structures. Weaknesses: Can be slow for very large structures.
  • Method 2: JSON Serialization/Deserialization. Strengths: Simple and easily readable code. Weaknesses: Only works with JSON-serializable data types.
  • Method 3: Custom Recursive Function. Strengths: Highly customizable and ideal for intricate copying scenarios. Weaknesses: Requires manual implementation and error handling for each data type.
  • Method 4: Comprehension and copy Method. Strengths: Quick and easy for non-nested dictionaries. Weaknesses: Only provides a shallow copy for nested dictionaries.
  • Method 5: Pickle Roundtrip. Strengths: One-liner and handles many data types. Weaknesses: Pickle serialization can be unsafe if data is untrustworthy, and may be slower than other methods.