5 Best Ways to Merge Dictionaries with Duplicate Keys in Python

💡 Problem Formulation: When working with lists of dictionaries in Python, a common challenge is handling the merging of these dictionaries when duplicate keys are present. Ideally, we might want to combine these dictionaries in such a way that the value associated with each duplicate key is handled according to a predefined logic; for instance, summing up the values or keeping the last seen value. Consider the input as a list of dictionaries, such as [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}], and the desired output as a single dictionary that merges these, like {'a': 1, 'b': 5, 'c': 4} showing the sum of values for the key ‘b’.

Method 1: Use the update() method and a for-loop

This method iterates through the list of dictionaries and sequentially updates a new dictionary using the update() method. When it encounters a duplicate key, it overwrites the existing value with the most recent one. It is the simplest approach and a good fit when "last value wins" is the behavior you want.

Here’s an example:

result = {}
dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]

for d in dict_list:
    result.update(d)

print(result)

The output will be:

{'a': 1, 'b': 3, 'c': 4}

In this snippet, we loop over each dictionary in the list and continuously update our result dictionary. The value for the key ‘b’ in the result is 3, because the second dictionary overwrites the value of ‘b’ from the first dictionary. This approach is simple but replaces previous values for duplicate keys rather than merging them.
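If the opposite policy is wanted, keeping the first value seen instead of the last, one option is setdefault(), which only writes a key that is not already present. This is a minimal sketch; the helper name merge_keep_first is made up for illustration:

```python
def merge_keep_first(dict_list):
    """Merge dictionaries, keeping the first value seen for each key."""
    result = {}
    for d in dict_list:
        for key, value in d.items():
            # setdefault() stores the value only if the key is not present yet
            result.setdefault(key, value)
    return result


dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]
print(merge_keep_first(dict_list))  # {'a': 1, 'b': 2, 'c': 4}
```

Here the key ‘b’ keeps the value 2 from the first dictionary, and the later 3 is ignored.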

Method 2: Use Dictionary Comprehension

Dictionary comprehension can be used to merge dictionaries in a list. This method applies a merging strategy for each key and allows for customization, such as summing the values of duplicate keys. It’s more flexible but might require more code for complex merging strategies.

Here’s an example:

dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]
result = {k: sum(d.get(k, 0) for d in dict_list) for k in set().union(*dict_list)}

print(result)

The output will be:

{'a': 1, 'b': 5, 'c': 4}

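The code uses dictionary comprehension to iterate over each unique key (obtained using set union), and for each key it calculates the sum of its values from all dictionaries in the list, treating a missing key as 0. This gives us a merged dictionary with summed values for the duplicate keys. Because the keys come from a set, the order in which they appear in the printed dictionary is not guaranteed and may differ from the output shown.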
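The same comprehension shape can apply a different rule by swapping out sum(). The sketch below uses a hypothetical helper, merge_with, that accepts any function reducing a list of values to a single value. It also assumes you want keys in first-seen order, so it collects them with dict.fromkeys() rather than a set:

```python
def merge_with(dict_list, combine):
    """Merge dictionaries, reducing the values of each key with `combine`."""
    # dict.fromkeys() de-duplicates keys while preserving first-seen order
    keys = dict.fromkeys(k for d in dict_list for k in d)
    # Only values that actually exist for a key are passed to `combine`
    return {k: combine([d[k] for d in dict_list if k in d]) for k in keys}


dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]
print(merge_with(dict_list, sum))  # {'a': 1, 'b': 5, 'c': 4}
print(merge_with(dict_list, max))  # {'a': 1, 'b': 3, 'c': 4}
```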

Method 3: Using collections.defaultdict

The collections.defaultdict class can handle default values for missing keys, making it suitable for creating accumulators or summing values. This method is great for more complex merging where values should be combined rather than overwritten.

Here’s an example:

from collections import defaultdict

dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]
result = defaultdict(int)

for d in dict_list:
    for key, value in d.items():
        result[key] += value

print(dict(result))

The output will be:

{'a': 1, 'b': 5, 'c': 4}

By using a defaultdict with int as the default factory, any key that is missing is automatically initialized to 0 the first time it is accessed and then incremented by the corresponding value in each dictionary. Finally, converting back to a standard dictionary prints out the merged results with summed values.
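When the values should be kept rather than summed, the same pattern works with list as the default factory. A small sketch, assuming you want every value for a key collected in the order it was encountered:

```python
from collections import defaultdict

dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]
grouped = defaultdict(list)

for d in dict_list:
    for key, value in d.items():
        # A missing key starts as an empty list, so append() always works
        grouped[key].append(value)

print(dict(grouped))  # {'a': [1], 'b': [2, 3], 'c': [4]}
```

The collected lists can then be reduced later with whatever rule fits, such as sum(), max(), or taking the last element.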

Method 4: Using the ChainMap

Python’s collections.ChainMap class groups multiple dictionaries into a single view that can be used to look up keys. While it doesn’t merge them into a single dictionary, it’s useful for scenarios where a merged view is needed without manipulation of the underlying dictionaries.

Here’s an example:

from collections import ChainMap

dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]
merged = ChainMap(*dict_list)

for key in merged:
    print(f"{key}: {merged[key]}")

On Python 3.7 and later, the output is:

b: 2
c: 4
a: 1

Even though ChainMap doesn’t actually merge the dictionaries, it allows for easy access as if they were merged. A lookup returns the value from the first dictionary in the chain that contains the key, so the value of the key ‘b’ is 2, taken from the first dictionary. This is the opposite of update(), where the last value wins. Iteration scans the dictionaries from last to first, which is why the keys of the second dictionary are printed before ‘a’.
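Because lookups return the first match, reversing the list before building the ChainMap gives "last value wins" behavior. The sketch below also shows dict() being used to copy the view into an ordinary dictionary; the variable names are only illustrative:

```python
from collections import ChainMap

dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]

# Lookups search the maps left to right, so reverse the list for last-wins
merged = ChainMap(*reversed(dict_list))
print(merged['b'])  # 3

# dict() copies the view into an independent dictionary
snapshot = dict(merged)
print(snapshot == {'a': 1, 'b': 3, 'c': 4})  # True

# The ChainMap is a live view: changes to the sources show through,
# while the copied snapshot stays as it was
dict_list[0]['a'] = 10
print(merged['a'], snapshot['a'])  # 10 1
```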

Bonus One-Liner Method 5: Using a Dictionary Comprehension with sum()

For a compact one-liner solution that sums values on duplicate keys, this dictionary comprehension uses the sum() function. It’s best suited for simple cases with a single operation, like summing or finding the maximum.

Here’s an example:

dict_list = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]
result = {key: sum(d[key] for d in dict_list if key in d) for key in set().union(*dict_list)}

print(result)

The output will be:

{'a': 1, 'b': 5, 'c': 4}

This concise one-liner iterates over each unique key and computes the sum of its values for all dictionaries in the list where the key is present. The use of set union ensures that each key is represented only once.
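Filtering with "if key in d" rather than substituting a default matters once the operation is something other than a sum. The following sketch uses made-up negative values to compare the two styles with max():

```python
dict_list = [{'a': -1, 'b': -2}, {'b': -3, 'c': -4}]
keys = set().union(*dict_list)

# Filtering with "if key in d" only considers values that really exist
filtered = {key: max(d[key] for d in dict_list if key in d) for key in keys}

# Substituting 0 for missing keys lets the default leak into the result
defaulted = {key: max(d.get(key, 0) for d in dict_list) for key in keys}

print(filtered == {'a': -1, 'b': -2, 'c': -4})  # True
print(defaulted == {'a': 0, 'b': -2, 'c': 0})   # True
```

For sums the two styles agree, since adding 0 changes nothing, but for max() or min() the filtered form is the safer choice.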

Summary/Discussion

  • Method 1: Update Method. Simple to implement. Overwrites duplicate keys; not suitable for value aggregation.
  • Method 2: Dictionary Comprehension. Highly flexible and customizable. Can become complex for intricate merging strategies.
  • Method 3: collections.defaultdict. Ideal for complex value aggregation. Requires importing the collections module.
  • Method 4: ChainMap. Provides a merged view without actual data merging. Doesn’t aggregate values, just provides access to the first value found.
  • Method 5: One-Liner Comprehension with sum(). Quick and concise for summing values. Limited to simple operations and less readable.