5 Best Ways to Break a List Into Chunks of Size N in Python

💡 Problem Formulation: In many programming scenarios, we need to process a large list by breaking it into smaller sub-lists (chunks) of a specific size. For instance, suppose you have a list of 120 elements and you want to break it down into chunks of size 10, resulting in 12 sub-lists each containing 10 elements.
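As a quick check of the arithmetic above, the number of chunks is the list length divided by the chunk size, rounded up. The helper name expected_chunk_count below is hypothetical and only illustrates the calculation:

```python
import math

def expected_chunk_count(length, n):
    # Hypothetical helper: how many chunks of size n cover `length` items.
    return math.ceil(length / n)

print(expected_chunk_count(120, 10))  # 12 full chunks
print(expected_chunk_count(125, 10))  # 13 chunks; the last one would hold only 5 items
```

When the length is not a multiple of the chunk size, the slicing-based methods below simply produce a shorter final chunk.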

Method 1: Using a For Loop

This method uses a simple for loop to step through the original list in increments of the chunk size and slice out one sub-list per step. The approach is readable and straightforward. The function takes a list and the chunk size; because the version below uses yield, it returns a generator of sub-lists rather than a list.

Here’s an example:

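def chunk_list(lst, n):
    # Step through the list n items at a time and yield one slice per step.
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

example_list = list(range(50))
# The generator is lazy; list() consumes it to materialize every chunk.
chunks = list(chunk_list(example_list, 5))

print(chunks)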

Output:

[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], ..., [45, 46, 47, 48, 49]]

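In this code snippet, the function chunk_list() takes a list and a chunk size n and yields sub-lists of size n using list slicing. Because it is a generator function, chunks are produced one at a time, which keeps memory use low as long as you iterate over the result instead of wrapping it in list() as the example does. If the list length is not a multiple of n, the last chunk is shorter.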
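To illustrate the lazy behavior, here is a minimal sketch that consumes the generator chunk by chunk and never builds the full list of chunks. The 12-element input and the per-chunk sum are assumptions chosen only to show a shorter final chunk:

```python
def chunk_list(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

# Iterate over the generator directly: only one chunk is held at a time.
totals = [sum(chunk) for chunk in chunk_list(list(range(12)), 5)]
print(totals)  # [10, 35, 21]; the last chunk is [10, 11]
```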

Method 2: Using a List Comprehension

List comprehensions in Python provide a compact way of creating lists, and the same technique can break a list into chunks. The result is concise and readable, so other programmers can see at a glance what is happening.

Here’s an example:

def chunk_list(lst, n):
    return [lst[i:i + n] for i in range(0, len(lst), n)]

example_list = list(range(50))
chunks = chunk_list(example_list, 5)

print(chunks)

Output:

[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], ..., [45, 46, 47, 48, 49]]

This code snippet uses a list comprehension to create a new list of sub-lists, each of length n (the last one may be shorter). The approach is highly readable and is often somewhat faster than an equivalent loop that appends to a list, though it builds every chunk in memory at once.
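One detail the comprehension does not handle is an invalid chunk size: range() raises a ValueError when the step is 0, and a negative step silently produces an empty result. The variant below, under the hypothetical name chunk_list_checked, is one possible way to make that explicit:

```python
def chunk_list_checked(lst, n):
    # Hypothetical variant: reject non-positive sizes with a clear message.
    if n <= 0:
        raise ValueError("chunk size n must be a positive integer")
    return [lst[i:i + n] for i in range(0, len(lst), n)]

print(chunk_list_checked([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```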

Method 3: Using NumPy’s array_split Function

If you’re working within Python’s scientific computing ecosystem, NumPy’s array_split function is a convenient way to divide a list or array into evenly or almost evenly sized chunks. NumPy is designed for high-performance array operations and is well-suited for this task. Note that array_split takes the number of chunks, not the chunk size.

Here’s an example:

import numpy as np

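example_list = np.arange(50)
# The second argument is the number of chunks, not the chunk size:
# 50 elements split into 10 chunks gives chunks of size 5.
chunks = np.array_split(example_list, 10)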

print(chunks)

Output:

[array([0, 1, 2, 3, 4]), ..., array([45, 46, 47, 48, 49])]

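This code uses NumPy’s np.array_split function to break an array into a specified number of chunks, so to get chunks of size 5 from 50 elements you pass 10 (50 / 5) rather than 5. When the array does not divide evenly, array_split makes the first chunks one element longer instead of leaving a single short final chunk. NumPy arrays store numeric data more compactly than Python lists, which makes this method attractive for large numeric data sets. However, it requires NumPy to be installed, and the chunks are arrays rather than lists.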
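If you want to specify the chunk size directly, array_split also accepts a sequence of split indices in place of a chunk count. The sketch below assumes a helper name, chunk_array, that is not part of NumPy; it places split points at n, 2n, 3n, and so on:

```python
import numpy as np

def chunk_array(arr, n):
    # Hypothetical helper: split points at n, 2n, 3n, ... give chunks of size n
    # (the final chunk may be shorter).
    split_points = np.arange(n, len(arr), n)
    return np.array_split(arr, split_points)

chunks = chunk_array(np.arange(23), 5)
print([c.tolist() for c in chunks])
# [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22]]
```

With split indices the behavior matches the slicing-based methods: every chunk has n elements except possibly the last.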

Method 4: Using itertools.islice

In Python’s itertools module, the islice function slices iterators efficiently. For chunking, itertools.islice pulls n items at a time from an iterator, so it works on any iterable (including generators) and, when consumed lazily, never needs more than one chunk in memory at a time.

Here’s an example:

from itertools import islice

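def chunk_list(iterable, n):
    it = iter(iterable)
    # iter(callable, sentinel) keeps calling the lambda until it returns [],
    # which happens once the underlying iterator is exhausted.
    return iter(lambda: list(islice(it, n)), [])

example_list = list(range(50))
chunks = list(chunk_list(example_list, 5))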

print(chunks)

Output:

[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], ..., [45, 46, 47, 48, 49]]

The code snippet uses islice from the itertools module to build an iterator that returns fixed-size chunks from any iterable. In the example the iterator is passed to list() so that all chunks can be printed, which gives up lazy evaluation; to keep the memory benefit, loop over the iterator directly.
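Because islice only needs an iterator, the same idea works on inputs that are not lists at all. The following sketch rewrites the technique as an explicit generator (the name iter_chunks is hypothetical) and feeds it a generator expression:

```python
from itertools import islice

def iter_chunks(iterable, n):
    # Hypothetical name; same islice technique, written as a generator.
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:  # the source iterator is exhausted
            return
        yield chunk

squares = (x * x for x in range(7))  # a generator, not a list
for chunk in iter_chunks(squares, 3):
    print(chunk)
# [0, 1, 4]
# [9, 16, 25]
# [36]
```

On Python 3.12 and later, itertools.batched offers similar behavior out of the box, yielding tuples rather than lists.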

Bonus One-Liner Method 5: Using Recursion

Recursion isn’t usually the most efficient way due to its stack usage, but it can be an elegant one-liner solution for breaking a list into chunks, especially for small to medium-sized lists.

Here’s an example:

chunk_list = lambda lst, n: [] if not lst else [lst[:n]] + chunk_list(lst[n:], n)

example_list = list(range(50))
chunks = chunk_list(example_list, 5)

print(chunks)

Output:

[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], ..., [45, 46, 47, 48, 49]]

This one-liner uses a recursive lambda to split the list into its first n items and the remainder, concatenating chunks until the list is exhausted. While concise and clever, it makes one recursive call per chunk and copies the remaining list on every call (lst[n:]), so it is slow for long lists and raises RecursionError once the number of chunks approaches Python’s recursion limit (1000 by default).
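The recursion limit is easy to hit in practice. The sketch below assumes the interpreter's default settings and builds an input with slightly more chunks than sys.getrecursionlimit() allows:

```python
import sys

chunk_list = lambda lst, n: [] if not lst else [lst[:n]] + chunk_list(lst[n:], n)

# One recursive call per chunk, so this many chunks of size 1 exceeds the limit.
too_many = list(range(sys.getrecursionlimit() + 100))

try:
    chunk_list(too_many, 1)
except RecursionError:
    print("RecursionError: too many chunks for the recursive version")
```

For inputs of this size, any of the iterative methods above avoids the problem.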

Summary/Discussion

  • Method 1: For Loop. Simple, and memory-efficient when the generator is consumed lazily. Can be slightly slower than a list comprehension when every chunk is needed at once.
  • Method 2: List Comprehension. Clean and concise. Generally faster, but holds all chunks in memory at once.
  • Method 3: NumPy’s array_split. Fast and compact for large numeric data sets. Takes the number of chunks rather than the chunk size, and requires the NumPy library.
  • Method 4: itertools.islice. Memory-efficient and works on any iterable. Requires understanding of iterators and may be less intuitive.
  • Method 5: Recursion. Elegant and concise. Not suitable for large lists: the recursion depth limit caps the number of chunks, and each call copies the rest of the list.