5 Best Ways to Count Consecutive Identical Elements in Python

💡 Problem Formulation: Many algorithmic problems require counting runs of consecutive identical elements in a list or a string. For instance, given the input ["a", "a", "b", "b", "b", "a"], the desired output is [2, 3, 1]: “a” occurs 2 times in a row, then “b” occurs 3 times, followed by “a” with 1 occurrence.

Method 1: Using Itertools Groupby

This method uses the groupby() function from Python’s itertools module to group consecutive identical elements, and then counts the size of each group. It makes a single pass over the input and relies only on the standard library.

Here’s an example:

from itertools import groupby

def consecutive_elem_count(lst):
    return [sum(1 for _ in group) for _, group in groupby(lst)]

example_list = ["a", "a", "b", "b", "b", "a"]
print(consecutive_elem_count(example_list))

Output:

[2, 3, 1]

This snippet defines a function consecutive_elem_count() which processes a list. It utilizes groupby() to group consecutive identical elements and computes their count using a generator expression within a list comprehension.
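When the element is needed alongside its count, the key that groupby() yields for each group can be kept instead of discarded. The following is a minimal sketch; the name run_length_pairs is an illustrative choice and does not appear in the example above.

```python
from itertools import groupby

def run_length_pairs(iterable):
    """Return (element, run_length) pairs for consecutive identical elements."""
    # groupby() yields (key, group_iterator); the group is consumed to count it.
    return [(key, sum(1 for _ in group)) for key, group in groupby(iterable)]

print(run_length_pairs(["a", "a", "b", "b", "b", "a"]))
# [('a', 2), ('b', 3), ('a', 1)]
```

Because groupby() accepts any iterable, the same function also works on strings and generators.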

Method 2: Naive Loop Approach

The naive loop approach involves iterating over the list of elements, keeping track of consecutive repeats, and updating the count whenever a change in element is observed. This approach is straightforward and doesn’t require any special Python modules.

Here’s an example:

def consecutive_elem_count(lst):
    if not lst:
        return []
    count_list, count = [], 1
    for i in range(1, len(lst)):
        if lst[i] == lst[i-1]:
            count += 1
        else:
            count_list.append(count)
            count = 1
    count_list.append(count)
    return count_list

example_list = ["a", "a", "b", "b", "b", "a"]
print(consecutive_elem_count(example_list))

Output:

[2, 3, 1]

The code defines a function consecutive_elem_count() which goes through the list using a for loop, tracking consecutive repeats and storing their counts to a separate list, which is then returned.
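The same loop can be written as a generator so that it also accepts inputs that cannot be indexed, such as iterators or file objects. This is a sketch under that assumption; iter_run_lengths is a hypothetical name introduced here for illustration.

```python
def iter_run_lengths(iterable):
    """Yield the length of each run of consecutive equal elements."""
    iterator = iter(iterable)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    count = 1
    for item in iterator:
        if item == previous:
            count += 1
        else:
            # The run ended: report it and start counting the new one.
            yield count
            previous, count = item, 1
    yield count  # report the final run

print(list(iter_run_lengths(iter("aabbba"))))
# [2, 3, 1]
```

Since counts are yielded one at a time, the caller can stop early without processing the whole input.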

Method 3: Using Regular Expressions

Regular expressions can find runs of consecutive identical characters in a string by matching a character followed by repeats of itself. The approach is concise, but it applies only to strings, not to general lists.

Here’s an example:

import re

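def consecutive_elem_count(s):
    # The outer group captures the whole run; group 2 is the repeated character.
    # With groups in the pattern, findall() returns tuples of the captured groups.
    return [len(run) for run, _ in re.findall(r'((.)\2*)', s, flags=re.DOTALL)]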

example_string = "aabbbba"
print(consecutive_elem_count(example_string))

Output:

[2, 4, 1]

This snippet defines the consecutive_elem_count() function, which uses re.findall() to find each run of a repeated character in a string and returns the length of each run.
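re.finditer() yields match objects rather than captured groups, so the full run is always available through group(0). The sketch below uses that to return each character with its run length; char_runs is an illustrative name, and the re.DOTALL flag is an added assumption so that newline characters are counted too.

```python
import re

# (.)\1* matches one character followed by any repeats of that same character.
RUN_PATTERN = re.compile(r"(.)\1*", flags=re.DOTALL)

def char_runs(s):
    """Return (character, run_length) pairs for each run in the string s."""
    return [(m.group(1), len(m.group(0))) for m in RUN_PATTERN.finditer(s)]

print(char_runs("aabbbba"))
# [('a', 2), ('b', 4), ('a', 1)]
```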

Method 4: Using Numpy

NumPy, a numerical processing library, can count consecutive elements with vectorized operations: a boolean mask marks where the value changes, and the diff function turns those positions into run lengths. This pays off mainly on large arrays.

Here’s an example:

import numpy as np

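def consecutive_elem_count(arr):
    arr = np.asarray(arr)
    if arr.size == 0:
        return np.array([], dtype=int)
    # True wherever an element differs from its predecessor; the first element always starts a run
    is_start = np.concatenate(([True], arr[1:] != arr[:-1]))
    # Distances between run starts (with the array length appended) are the run lengths
    return np.diff(np.flatnonzero(is_start), append=arr.size)

example_array = np.array(["a", "a", "b", "b", "b", "a"])
print(consecutive_elem_count(example_array))

Output:

[2 3 1]

The function consecutive_elem_count() compares each element with its predecessor to find where every run starts, and then takes the differences between those start positions to get the run lengths. Because the comparison uses != rather than subtraction, it works for string arrays as well as numeric ones, while still exploiting the efficiency of NumPy’s vectorized operations.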
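On purely numeric arrays, np.diff() can detect the boundaries directly, since a nonzero difference between neighbours marks a change in value. The following is a minimal sketch assuming a 1-D numeric input with no NaN values; numeric_run_lengths is a hypothetical name used only for this illustration.

```python
import numpy as np

def numeric_run_lengths(values):
    """Return run lengths for a 1-D numeric array (assumes no NaN values)."""
    values = np.asarray(values)
    if values.size == 0:
        return np.array([], dtype=int)
    # A nonzero difference between neighbours marks the start of a new run.
    starts = np.flatnonzero(np.diff(values) != 0) + 1
    # Add index 0 and the array length so every run has both of its edges.
    edges = np.concatenate(([0], starts, [values.size]))
    return np.diff(edges)

print(numeric_run_lengths([1, 1, 2, 2, 2, 1]))
# [2 3 1]
```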

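Bonus Method 5: List Comprehensions with zip

List comprehensions combined with the zip function can compute the counts in two short statements: the first finds where each run ends, and the second subtracts neighbouring boundaries.

Here’s an example:

lst = ["a", "a", "b", "b", "b", "a"]
# Positions just past the end of each run; the object() sentinel closes the last run
ends = [i for i, (x, y) in enumerate(zip(lst, lst[1:] + [object()]), 1) if x != y]
# Each run length is the distance between neighbouring boundaries
consecutive_counts = [end - start for start, end in zip([0] + ends, ends)]

print(consecutive_counts)

Output:

[2, 3, 1]

The first comprehension pairs each element with its successor and records the position wherever the two differ. The second pairs each boundary with the previous one and takes the difference. Note that this builds temporary lists and requires lst to be a list, making it less efficient and less general than the other methods.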

Summary/Discussion

  • Method 1: Groupby from Itertools. Efficient and Pythonic. Uses only the standard library. Less obvious to beginners than an explicit loop.
  • Method 2: Naive Loop Approach. Straightforward and easy to understand. No imports required. More verbose, and can be slower than vectorized code on very large inputs.
  • Method 3: Using Regular Expressions. Concise pattern matching. Works only on strings, so it does not apply to general lists.
  • Method 4: Using NumPy. Fast on large arrays. Requires a third-party dependency, which is overkill for small or single-use scripts.
  • Bonus Method 5: List Comprehension with zip. Compact, no imports. Less efficient due to temporary list construction. Might be tricky to understand at first glance.
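When choosing between methods, it helps to confirm that they agree on the same inputs. The sketch below cross-checks a groupby-based version against a loop-based version on random data; the function names counts_groupby and counts_loop, the seed, and the sample sizes are illustrative assumptions.

```python
import random
from itertools import groupby

def counts_groupby(seq):
    """Run lengths via itertools.groupby()."""
    return [sum(1 for _ in group) for _, group in groupby(seq)]

def counts_loop(seq):
    """Run lengths via an explicit loop over neighbouring pairs."""
    if not seq:
        return []
    counts, count = [], 1
    for previous, current in zip(seq, seq[1:]):
        if current == previous:
            count += 1
        else:
            counts.append(count)
            count = 1
    counts.append(count)
    return counts

random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    sample = random.choices("ab", k=random.randint(0, 20))
    assert counts_groupby(sample) == counts_loop(sample)
    # The run lengths must always add up to the input length.
    assert sum(counts_loop(sample)) == len(sample)
print("both methods agree")
```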