5 Best Ways to Count Frequency of Sublist in a Given List Using Python

💡 Problem Formulation: In Python programming, a common task is to determine how often a specific sublist appears as a contiguous run within a larger list. This is useful in applications such as data analysis, pattern recognition, and algorithm development. For example, if our input is [1, 2, 1, 2, 3, 4, 1, 2] and the sublist we are looking for is [1, 2], the desired output is 3, because [1, 2] starts at indices 0, 2, and 6 of the given list.

Method 1: Using a for-loop and slicing

This method iterates through the parent list and compares a slice at each position with the sublist. It’s a straightforward approach that requires no imports, which keeps it easy to read and implement.

Here’s an example:

def count_sublist_frequency(lst, sublist):
    count = 0
    sub_len = len(sublist)
    for i in range(len(lst) - sub_len + 1):
        if lst[i:i + sub_len] == sublist:
            count += 1
    return count

# Example usage
main_list = [1, 2, 1, 2, 3, 4, 1, 2]
sub_list = [1, 2]
print(count_sublist_frequency(main_list, sub_list))

Output: 3

In the provided code snippet, the function count_sublist_frequency() accepts the main list and the sublist to search for. It initializes a counter to zero, then checks each slice of the main list that has the same length as the sublist, incrementing the counter every time a match is found. The returned value is the number of times the sublist occurs, with overlapping occurrences counted separately.
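Because every start index is tested, the loop counts overlapping matches. The sketch below contrasts that behaviour with a non-overlapping count. The names count_overlapping and count_non_overlapping are hypothetical helpers introduced here for illustration, and treating an empty sublist as zero matches is an assumption rather than something the method above prescribes.

```python
def count_overlapping(lst, sublist):
    """Same logic as Method 1: every start index is tested."""
    count = 0
    sub_len = len(sublist)
    for i in range(len(lst) - sub_len + 1):
        if lst[i:i + sub_len] == sublist:
            count += 1
    return count


def count_non_overlapping(lst, sublist):
    """Hypothetical variant: after a match, jump past the matched slice."""
    sub_len = len(sublist)
    if sub_len == 0:
        # Assumption: an empty sublist is treated as having no occurrences.
        return 0
    count = 0
    i = 0
    while i <= len(lst) - sub_len:
        if lst[i:i + sub_len] == sublist:
            count += 1
            i += sub_len  # skip the elements consumed by this match
        else:
            i += 1
    return count


data = [1, 1, 1, 1]
pattern = [1, 1]
print(count_overlapping(data, pattern))      # 3
print(count_non_overlapping(data, pattern))  # 2
```

Which of the two counts is appropriate depends on the application; the remaining methods in this article all count overlapping matches.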

Method 2: Using the collections Module

The collections module provides specialized container datatypes. One of them, collections.Counter, counts hashable objects. List slices are not hashable, so each slice is converted to a tuple before counting; the Counter then holds the frequency of every window that has the same length as the sublist.

Here’s an example:

from collections import Counter

def count_sublist_frequency(lst, sublist):
    sub_len = len(sublist)
    slices = [tuple(lst[i:i + sub_len]) for i in range(len(lst) - sub_len + 1)]
    return Counter(slices)[tuple(sublist)]

# Example usage
main_list = [1, 2, 1, 2, 3, 4, 1, 2]
sub_list = [1, 2]
print(count_sublist_frequency(main_list, sub_list))

Output: 3

The function count_sublist_frequency() creates all possible slices of the list that are the same length as the sublist, converting them to tuples which are hashable and hence can be counted using Counter. It then directly accesses the count for the given tuple equivalent of the sublist using the generated counter dictionary.
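The Counter pays off mainly when several sublists of the same length are looked up against one list. A minimal sketch of that usage follows; build_window_counter is a hypothetical helper name, not part of the collections module.

```python
from collections import Counter


def build_window_counter(lst, size):
    """Count every contiguous window of the given size (hypothetical helper)."""
    return Counter(tuple(lst[i:i + size]) for i in range(len(lst) - size + 1))


main_list = [1, 2, 1, 2, 3, 4, 1, 2]
pair_counts = build_window_counter(main_list, 2)

# Each lookup is now a dictionary access; a Counter returns 0 for missing keys.
print(pair_counts[(1, 2)])  # 3
print(pair_counts[(2, 1)])  # 1
print(pair_counts[(4, 4)])  # 0
```

The trade-off is memory: the Counter keeps one entry for every distinct window in the list.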

Method 3: Using itertools’ groupby for consecutive sublists

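Python’s itertools.groupby() function groups consecutive equal elements into runs. This variant therefore answers a related but different question: how many runs of identical values have the same length as the sublist. It does not compare the values of the sublist against the list, so it is useful for run-length patterns rather than general sublist matching.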

Here’s an example:

from itertools import groupby

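def count_consecutive_sublist(lst, sublist):
    # Only the sublist's length is used; its values are not compared.
    sublist_length = len(sublist)
    # Length of each run of consecutive equal elements
    run_lengths = (len(list(group)) for _, group in groupby(lst))
    return sum(run_len == sublist_length for run_len in run_lengths)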

# Example usage
main_list = [1, 1, 2, 2, 2, 1, 2, 1, 1, 2, 2]
sub_list = [1, 2]
print(count_consecutive_sublist(main_list, sub_list))

Output: 3

This approach differs from the others: it does not search for the sublist itself. The function count_consecutive_sublist() uses groupby() from the itertools module to group runs of repeated elements, measures the length of each run, and counts the runs whose length equals the length of the sublist. In the example, the runs [1, 1], [1, 1], and [2, 2] each have length 2, which gives 3.
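To make the grouping step concrete, the following sketch prints the runs that groupby() produces for the example list, and then shows a second list (chosen here purely for illustration) in which [1, 2] occurs twice but no run has length 2.

```python
from itertools import groupby

main_list = [1, 1, 2, 2, 2, 1, 2, 1, 1, 2, 2]

# groupby yields (value, iterator over the run); record each run's length.
runs = [(value, len(list(group))) for value, group in groupby(main_list)]
print(runs)
# [(1, 2), (2, 3), (1, 1), (2, 1), (1, 2), (2, 2)]

# Runs whose length is 2, regardless of which value repeats.
runs_of_two = sum(1 for _, length in runs if length == 2)
print(runs_of_two)  # 3

# Illustrative contrast: [1, 2] appears twice here, yet every run has length 1.
other = [1, 2, 1, 2]
other_runs = [len(list(group)) for _, group in groupby(other)]
print(other_runs)  # [1, 1, 1, 1]
```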

Method 4: Using list comprehension and sum function

This Pythonic approach combines a list comprehension with the sum() function to count the frequency of a sublist. It performs the same slice comparisons as the explicit loop, expressed more compactly.

Here’s an example:

def count_sublist_frequency(lst, sublist):
    sub_len = len(sublist)
    # The square brackets make this a list comprehension: one boolean per start index
    return sum([lst[i:i + sub_len] == sublist for i in range(len(lst) - sub_len + 1)])

# Example usage
main_list = [1, 2, 1, 2, 3, 4, 1, 2]
sub_list = [1, 2]
print(count_sublist_frequency(main_list, sub_list))

Output: 3

This compact function count_sublist_frequency() uses a list comprehension to build a list of boolean values, one per candidate position, indicating whether the slice starting there matches the sublist. Because True counts as 1 and False as 0, summing the list gives the total number of matches.
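The summing step relies on bool being a subclass of int in Python. A short sketch with the article's example data shows the intermediate booleans before they are added up:

```python
main_list = [1, 2, 1, 2, 3, 4, 1, 2]
sub_list = [1, 2]
sub_len = len(sub_list)

# One boolean per candidate start index.
matches = [main_list[i:i + sub_len] == sub_list
           for i in range(len(main_list) - sub_len + 1)]
print(matches)
# [True, False, True, False, False, False, True]

# bool is a subclass of int, so True adds 1 and False adds 0.
print(sum(matches))  # 3
```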

Bonus One-Liner Method 5: Using a generator expression

For those who enjoy a concise, functional programming style, a generator expression provides an elegant alternative. It also reduces memory usage, because no intermediate list is created.

Here’s an example:

count_sublist_frequency = lambda lst, sublist: sum(1 for i in range(len(lst) - len(sublist) + 1) if lst[i:i+len(sublist)] == sublist)

# Example usage
main_list = [1, 2, 1, 2, 3, 4, 1, 2]
sub_list = [1, 2]
print(count_sublist_frequency(main_list, sub_list))

Output: 3

This example features a lambda function that uses a generator expression, iterating over indices and yielding a count of 1 each time a matching slice is found. The sum() function then iterates over this generator to add up the counts, providing the total frequency of the sublist.
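The memory difference can be illustrated with sys.getsizeof(). This is a rough sketch: the list size of 100,000 is an arbitrary choice, and getsizeof() reports only the size of the container object itself, not of the elements it refers to.

```python
import sys

n = 100_000
as_list = [i % 2 == 0 for i in range(n)]       # all values stored at once
as_generator = (i % 2 == 0 for i in range(n))  # values produced on demand

# The list holds n references; the generator object stays small.
print(sys.getsizeof(as_list) > sys.getsizeof(as_generator))  # True

# Both produce the same total when summed.
print(sum(as_list) == sum(as_generator))  # True
```

Note that a generator can be consumed only once, so the second sum() call leaves as_generator exhausted.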

Summary/Discussion

  • Method 1: For-loop and Slicing. Simple and clear. No external dependencies. Builds a new slice at every index, so the work grows with both the list length and the sublist length.
  • Method 2: Collections Module. Uses the standard library's Counter. Efficient only when the Counter is built once and reused for several lookups; as written, it is rebuilt on every call and keeps every window in memory.
  • Method 3: itertools.groupby() for Consecutive Sublists. Counts runs of equal elements whose length matches the sublist's length. It does not compare the sublist's values, so it is not a substitute for the other methods.
  • Method 4: List Comprehension and Sum. Pythonic and concise. Performs the same slice comparisons as Method 1 in a single expression. Readability may suffer for those unfamiliar with the style.
  • Method 5: Generator Expression. Memory efficient, since matches are counted without storing intermediate results. Functional style that is very concise. Could be less readable for beginners.
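Relative speed depends on the data and the Python version, so it is worth measuring rather than assuming. The sketch below uses timeit on an arbitrary test list; the function names by_loop, by_counter, and by_generator are hypothetical stand-ins for Methods 1, 2, and 5, and no timings are quoted because they vary by machine.

```python
import timeit
from collections import Counter


def by_loop(lst, sublist):
    count = 0
    sub_len = len(sublist)
    for i in range(len(lst) - sub_len + 1):
        if lst[i:i + sub_len] == sublist:
            count += 1
    return count


def by_counter(lst, sublist):
    sub_len = len(sublist)
    windows = (tuple(lst[i:i + sub_len]) for i in range(len(lst) - sub_len + 1))
    return Counter(windows)[tuple(sublist)]


def by_generator(lst, sublist):
    sub_len = len(sublist)
    return sum(1 for i in range(len(lst) - sub_len + 1)
               if lst[i:i + sub_len] == sublist)


# Arbitrary test data: 500 copies of [1, 2] followed by [3, 4].
data = [1, 2] * 500 + [3, 4]
pattern = [1, 2]
functions = (by_loop, by_counter, by_generator)

# All three approaches should agree before their timings are compared.
results = {func.__name__: func(data, pattern) for func in functions}
print(results)

for func in functions:
    seconds = timeit.timeit(lambda: func(data, pattern), number=50)
    print(f"{func.__name__}: {seconds:.4f} s for 50 runs")
```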