5 Best Ways to Count Unique Sublists Within a List in Python

πŸ’‘ Problem Formulation: When working with lists in Python, understanding how to effectively count unique sublists is a common challenge. Suppose we have a list of sublists, like [['apple', 'banana'], ['banana', 'apple'], ['apple', 'banana'], ['orange']], and we want to count how many distinct sublists exist, disregarding the order of elements within sublists. Our desired output for this example would be 2 unique sublists: one for [‘apple’, ‘banana’] and one for [‘orange’].

Method 1: Using a Custom Hash Function

This method involves creating a hash function that interprets each sublist as an immutable set, which allows for the identification of unique sublists where the order is irrelevant. The uniqueness is determined by a hash value that the custom function generates for each sublist.

Here’s an example:

def hashable_sublist(sublist):
    return tuple(sorted(sublist))

list_of_sublists = [['apple', 'banana'], ['banana', 'apple'], ['apple', 'banana'], ['orange']]
unique_sublists = set(map(hashable_sublist, list_of_sublists))
unique_count = len(unique_sublists)
print(unique_count)

Output:

2

This snippet creates a function hashable_sublist() that sorts and converts each sublist into a tuple, which is hashable. It then maps this function over the list of sublists, converting them into a set of unique elements (in this case, hashable tuples). Lastly, it prints the length of the set to find the number of unique sublists.

Method 2: Using frozenset

With this method, you utilize frozenset, a built-in immutable set-like collection in Python. It turns each sublist into a frozenset that can be directly used in a set for uniqueness determination.

Here’s an example:

list_of_sublists = [['apple', 'banana'], ['banana', 'apple'], ['apple', 'banana'], ['orange']]
unique_sublists = set(frozenset(sublist) for sublist in list_of_sublists)
unique_count = len(unique_sublists)
print(unique_count)

Output:

2

This code converts each sublist into a frozenset, which is then stored in a set to automatically discard duplicates. The number of unique sublists is acquired by measuring the size of this set.

Method 3: Using Collections Counter

The collections module contains a Counter class that can count hashable objects. When combined with a function to convert sublists to an immutable form, it can be used to count unique sublists.

Here’s an example:

from collections import Counter

def hashable_sublist(sublist):
    return frozenset(sublist)

list_of_sublists = [['apple', 'banana'], ['banana', 'apple'], ['apple', 'banana'], ['orange']]
sublists_counter = Counter(map(hashable_sublist, list_of_sublists))
unique_count = len(sublists_counter)
print(unique_count)

Output:

2

In this example, we use the Counter class from the collections module in combination with a function that makes each sublist hashable by converting it to a frozenset. The Counter then easily counts unique items, providing the count of unique sublists.

Method 4: Custom Function with List Comprehension

Implementing a custom function that leverages list comprehensions can yield a straightforward counting solution. It requires more lines but can be transparent in terms of the operation it performs on the sublists.

Here’s an example:

def count_unique_sublists(lst):
    unique_sublists = []
    for sublist in lst:
        fs = frozenset(sublist)
        if fs not in unique_sublists:
            unique_sublists.append(fs)
    return len(unique_sublists)

list_of_sublists = [['apple', 'banana'], ['banana', 'apple'], ['apple', 'banana'], ['orange']]
unique_count = count_unique_sublists(list_of_sublists)
print(unique_count)

Output:

2

Here, we’ve defined a function count_unique_sublists() that iterates over each sublist, converting it to a frozenset and then appending it to a unique_sublists list if it’s not already present. The final tally of unique sublists is the length of this list.

Bonus One-Liner Method 5: Using Set Comprehension

A set comprehension provides a concise one-liner solution to the problem by combining elements of the previous methods.

Here’s an example:

list_of_sublists = [['apple', 'banana'], ['banana', 'apple'], ['apple', 'banana'], ['orange']]
unique_count = len({frozenset(sublist) for sublist in list_of_sublists})
print(unique_count)

Output:

2

This one-liner creates a set comprehension by transforming each sublist into a frozenset, hence counting unique sublists directly by checking the length of the set created.

Summary/Discussion

  • Method 1: Custom Hash Function. Flexible and clear. Requires a custom function which can be a bit verbose.
  • Method 2: Using frozenset. Simple and Pythonic. Works best for smaller lists due to potential performance issues with larger datasets.
  • Method 3: Using Collections Counter. Useful for additional statistics beyond just counting. Slightly more complex due to dependency on the collections module.
  • Method 4: Custom Function with List Comprehension. Very explicit, but potentially inefficient with large lists as it iteratively checks for membership.
  • Method 5: Using Set Comprehension. The most succinct. Perfect for one-off calculations where code brevity is preferred.