π‘ Problem Formulation: You have a list of words, and your goal is to group the words that are anagrams of one another. An anagram is a word or phrase that can be formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once. For instance, the input ['bat', 'tab', 'cat', 'act', 'tac'] should yield an output such as [['bat', 'tab'], ['cat', 'act', 'tac']].
Method 1: Using Defaultdict and Sorted
This method involves using a defaultdict from the collections module to group words by the sorted tuple of their characters. This works because anagrams will have identical sorted tuples. It is a simple yet effective way to group anagrams in Python.
Here’s an example:
import collections
def group_anagrams(words):
anagrams = collections.defaultdict(list)
for word in words:
anagrams[tuple(sorted(word))].append(word)
return list(anagrams.values())
print(group_anagrams(['bat', 'tab', 'cat', 'act', 'tac']))
Output: [['bat', 'tab'], ['cat', 'act', 'tac']]
This code snippet creates a defaultdict where each key is a tuple of sorted letters, and the corresponding value is a list of words that, when sorted, match this tuple. The function returns a list of these groups.
Method 2: Using Counter from Collections
The Counter from the collections module can be used to create a hashable count representation of each word’s letters. This makes it possible to match anagrams by their letter count, a unique solution for words with the same frequency of characters.
Here’s an example:
from collections import Counter
def group_anagrams(words):
anagrams = collections.defaultdict(list)
for word in words:
anagrams[frozenset(Counter(word).items())].append(word)
return list(anagrams.values())
print(group_anagrams(['bat', 'tab', 'cat', 'act', 'tac']))
Output: [['bat', 'tab'], ['cat', 'act', 'tac']]
In this approach, we create a defaultdict and populate it with a frozenset of counters of each word. Since Counter objects are not hashable by default, we convert them into a frozenset, which serves as the key in our defaultdict.
Method 3: Grouping by Prime Number Multiplication
This method maps each letter to a unique prime number and multiplies these to get a product that will uniquely represent an anagram class. The use of prime numerals ensures that the product will be the same for anagrams and different for non-anagrams.
Here’s an example:
def prime_product(word):
primes = {'a': 2, 'b': 3, 'c': 5, 'd': 7, 'e': 11, 'f': 13, 'g': 17, 'h': 19, 'i': 23,
'j': 29, 'k': 31, 'l': 37, 'm': 41, 'n': 43, 'o': 47, 'p': 53, 'q': 59, 'r': 61,
's': 67, 't': 71, 'u': 73, 'v': 79, 'w': 83, 'x': 89, 'y': 97, 'z': 101}
prod = 1
for char in word:
prod *= primes[char]
return prod
def group_anagrams(words):
anagrams = collections.defaultdict(list)
for word in words:
anagrams[prime_product(word)].append(word)
return list(anagrams.values())
print(group_anagrams(['bat', 'tab', 'cat', 'act', 'tac']))
Output: [['bat', 'tab'], ['cat', 'act', 'tac']]
Each letter is assigned a unique prime number and the product of these primes for each word is computed using the prime_product function. Anagrams will have the same prime product, thus are grouped together in the defaultdict.
Method 4: Using itertools and sorted()
This method makes use of itertools’ groupby function to group words with identical signatures, where a signature is the sorted word. It is a functional programming-inspired technique that focuses on transforming the list of words to a stream of grouped anagrams.
Here’s an example:
from itertools import groupby
def group_anagrams(words):
sorted_words = sorted(words, key=sorted)
grouped = groupby(sorted_words, sorted)
return [list(group) for _, group in grouped]
print(group_anagrams(['bat', 'tab', 'cat', 'act', 'tac']))
Output: [['bat', 'tab'], ['cat', 'act', 'tac']]
The groupby function is used to group the list of words that have been sorted alphabetically. The sorted words serve as iterators for the groupby function, and the result is a list of lists containing the anagrams.
Bonus One-Liner Method 5: Using List Comprehension and sorted()
A concise one-liner approach is to use list comprehension combined with sorted(). This method makes use of Python’s ability to express complex operations in a compact manner but may be less readable for those unfamiliar with list comprehensions.
Here’s an example:
group_anagrams = lambda words: [list(group) for key, group in itertools.groupby(sorted(words, key=sorted), sorted)]
print(group_anagrams(['bat', 'tab', 'cat', 'act', 'tac']))
Output: [['bat', 'tab'], ['cat', 'act', 'tac']]
This is a one-liner lambda function compressing the functionality of Method 4 into a single statement. While compact, it leverages sorting and grouping in one expression.
Summary/Discussion
- Method 1: Using Defaultdict and Sorted. Strengths: Intuitive and straightforward to implement. Weaknesses: Requires additional memory for the dictionary.
- Method 2: Using Counter from Collections. Strengths: Efficient for words with many repeating characters. Weaknesses: Sub-optimal for words with unique characters due to Counter overhead.
- Method 3: Grouping by Prime Number Multiplication. Strengths: Highly efficient and no sorting required. Weaknesses: Need to define a map between characters and prime numbers, which can become cumbersome for extended alphabets.
- Method 4: Using itertools and sorted(). Strengths: Elegant functional programming approach. Weaknesses: Possibly less intuitive for those unfamiliar with itertools.
- Method 5: Bonus One-Liner Using List Comprehension and sorted(). Strengths: Extremely concise. Weaknesses: Potentially difficult to read and understand at a glance.
