Understanding Generators In Python

If one subject in Python reliably creates confusion, it is generators. Generators are functions, but they differ in several ways from the ordinary functions you and I use daily. Today, we’ll take a gentle dive into the world of generators to understand what they are, how they differ from ordinary functions, and why and when we might use them.


Normal Functions

This next point is essential to understanding the power of generators versus ordinary functions. A normal function does all of its work and builds its entire result in memory before returning it. We call it, it carries out a task or set of tasks, and then it returns its output. Once the 'return' statement is executed, the function terminates and its local variables are discarded.

def multiply(num):
    total = num * 52
    return total

print(multiply(6))

# Result
312

In the above code, the multiply() function is called, executes the equation, returns the result, and it’s all over. If I try to print the variable 'total' after executing this function, I’ll get an error message, because 'total' existed only inside the function. The function has done its work, returned the data, and there is nothing left to query.

def multiply(num):
    total = num * 52
    return total

print(multiply(6))

print(total)

# Result

312

Traceback (most recent call last):
  File "C:\Users\David\Desktop\Upwork Platform\Generators\OrdFunction.py", line 8, in <module>
    print(total)
NameError: name 'total' is not defined

Generator Definition

A generator, by contrast, is a function that returns an object when called, and we can then process that object by requesting one item at a time. To do this, we use a couple of specific tools. Let’s look at ‘yield’ and next().

The Yield and Next Statements

In Python, yield is a statement that returns data from a function without terminating the function and without forgetting the variables. Think of yield as a bit like a pause button. It pauses the function, passes the data, and then waits. When you ‘unpause’ the function, it will continue from where it left off.


So here is the first distinction between generator functions and standard functions. For a function to be a generator, it must contain at least one ‘yield’ statement. There may be more than one yield statement, and there may also be return statements. Yet, without at least one yield statement, it’s not a generator.
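If you’d like to check this rule for yourself, here is a small sketch. The names countdown_list and countdown_gen are made up for illustration; inspect.isgeneratorfunction() comes from the standard library and reports whether Python treats a function as a generator function.

```python
import inspect

def countdown_list(start):
    # Ordinary function: builds the whole list, then returns it.
    return list(range(start, 0, -1))

def countdown_gen(start):
    # Generator function: the yield below is what makes it a generator.
    while start > 0:
        yield start
        start -= 1
    return  # a bare return simply ends the generator

is_plain = inspect.isgeneratorfunction(countdown_list)
is_gen = inspect.isgeneratorfunction(countdown_gen)

gen_obj = countdown_gen(3)   # no code in the body has run yet
values = list(gen_obj)       # pull every value out, one at a time

print(is_plain, is_gen)
print(values)
```

Both functions count down from the same number, but only the one containing yield is reported as a generator function.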


So how do you unpause the function? That’s where we need to understand the next() function. The next() function is the unpause button we were speaking of previously. Here’s some code to show how next() and yield work.

def multiply(num):
    mult = num * 52
    yield mult

    add = mult + 185
    yield add

    subt = add - 76
    yield subt

test = multiply(6)

print(next(test))
print(next(test))
print(next(test))

# Result

312
497
421

In the previous code, we call the function multiply() and assign the generator object it returns to a variable ‘test’. We then call next() on test, which runs through the programme until it reaches the first yield, then it supplies us with the value 312, and then it waits. When we unpause the function with the second next(), it starts up where it left off, with all information still available to it, evaluates the next code and pauses at the second yield, where it supplies us with the value 497. The third and final next() will supply us with 421, the data held by subt.
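To see the pause button in action, the following sketch records the order in which things happen. The function steps() and the list log are hypothetical names used only for this illustration.

```python
log = []  # records the order in which things happen

def steps():
    log.append("running up to first yield")
    yield 1
    log.append("resumed, running up to second yield")
    yield 2

gen = steps()                    # creating the generator runs none of the body
log.append("generator created")

first = next(gen)                # runs the body as far as the first yield
log.append("got %d" % first)

second = next(gen)               # resumes from the pause, up to the second yield
log.append("got %d" % second)

for line in log:
    print(line)
```

Note that "generator created" is logged before anything inside steps(), which shows that the body only starts running on the first next().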


Now, what would happen if we call a fourth next() even when we know there are no other values to be returned?

...
print(next(test))
print(next(test))
print(next(test))
print(next(test))

# Result

  File "C:\Users\David\Desktop\Upwork Platform\Generators\GeneratorsEx1.py", line 17, in <module>
    print(next(test))
StopIteration
312
497
421


The process of returning the values is a one-way street; once you run out of values, you’ll get the ‘StopIteration’ exception, and the generator will return no other values.
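If you call next() by hand, there are two common ways to cope with a generator that has run dry. The sketch below uses a shortened, two-value version of multiply() purely for illustration, and the variable names are my own.

```python
def multiply(num):
    mult = num * 52
    yield mult
    yield mult + 185

test = multiply(6)
first = next(test)
second = next(test)

# Option 1: catch the exception explicitly.
try:
    next(test)
    exhausted = False
except StopIteration:
    exhausted = True

# Option 2: give next() a default to return instead of raising.
fallback = next(test, None)

# A generator cannot be rewound; calling the function again gives a fresh one.
fresh = multiply(6)
restart = next(fresh)

print(first, second, exhausted, fallback, restart)
```

The second argument to next() is a built-in feature: when the generator has nothing left, that default comes back instead of an exception.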


Let’s Learn Some Jargon – Iterable

When you start to learn about generators, the first hurdle you hit is a sentence like the one quoted below, which is an experienced coder’s response to a new coder wanting a simple explanation of the word ‘iterable’.

“An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.”

Yep. Clear as mud. Thanks for that. Glad I asked.

So to understand clearly, we will begin by learning four words: iterate, iteration, iterator, and iterable.

  • Iterate: To iterate something is to repeat something. So to iterate is to repeat a process, task, or instruction. Iterate is a verb.
  • Iteration: This is the process you carry out when repeating something over and over again. Iteration is what you are doing when you iterate. Iteration is a noun.
  • Iterator: In Python, an iterator is an object that walks over a collection of data and returns one element at a time during the iteration process.
  • Iterable: A collection of elements. By definition, it means something able to be iterated over; an object capable of returning its elements one at a time. A list in Python is an iterable.

So to summarise, an iterator iterates through an iterable in the process of iteration. Clear? Do I need to reiterate? No? Great! Moving on. 🙂
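The four words are easier to keep apart with something concrete in front of you. The following is a minimal sketch using only built-ins; the class Squares is a hypothetical example of the __getitem__ route mentioned in the quoted definition.

```python
letters = ["a", "b", "c"]        # a list is an iterable
letters_iter = iter(letters)     # iter() hands back an iterator over it

first = next(letters_iter)       # the iterator returns one element at a time
second = next(letters_iter)

list_is_iterator = hasattr(letters, "__next__")       # lists have no __next__
iter_is_iterator = hasattr(letters_iter, "__next__")  # iterators do

class Squares:
    # No __iter__ here: Python falls back to calling __getitem__
    # with 0, 1, 2, ... until IndexError is raised.
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index

squares = list(Squares())

print(first, second, list_is_iterator, iter_is_iterator)
print(squares)
```

So the list is the iterable, the object returned by iter() is the iterator, and each next() call is one step of the iteration.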

Full Article: Iterators, Iterables, and Itertools

What Is The Point Of Generators?

Now you understand that we can pause a function using yield while retaining all the details within the function, we can discuss why we use generators. The power of a generator is that it allows us to evaluate a dataset and call a value only when we need it, making generators extremely useful when iterating, or looping, through an iterable.

A generator is a lazy iterator, which means that when faced with an extensive data collection, rather than loading the entire data set into memory, a generator allows each element in the dataset to be evaluated and returned one by one, and only when called for. With the size of some datasets we come across, loading the entire thing would, in the worst case, exceed available memory; even in the best case, it would slow processing dramatically.

Compared with a normal function that builds and returns a complete collection, a generator uses considerably less memory, given that it evaluates and produces only one item at a time.
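A rough way to see the difference is to compare the size of a list with the size of a generator that produces the same values. This is only a sketch: the count of 100,000 is an arbitrary choice, and sys.getsizeof() measures the container object itself rather than every value it refers to, so it understates the list’s true footprint.

```python
import sys

count = 100_000
as_list = [n * 2 for n in range(count)]   # every value exists in memory now
as_gen = (n * 2 for n in range(count))    # values are produced on demand

list_bytes = sys.getsizeof(as_list)   # size of the list object itself
gen_bytes = sys.getsizeof(as_gen)     # size of the generator object

total_list = sum(as_list)
total_gen = sum(as_gen)               # consumes the generator

print(list_bytes > gen_bytes)
print(total_list == total_gen)
```

The generator stays the same small size however large count becomes, while the list grows with it, yet both produce the same total.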

Used In Loops

We can easily use generators in for-loops. That is because a for-loop abstracts iteration by calling next() in the background, and it catches the StopIteration exception for us, ending the loop cleanly when the generator runs out of values. In this code block, we’ll run the previous code with a for-loop.

def multiply(num):
    mult = num * 52
    yield mult

    add = mult + 185
    yield add

    subt = add - 76
    yield subt

for item in multiply(6):
    print(item)

# Result

312
497
421
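For the curious, a for-loop over a generator behaves roughly like the hand-written loop below. This is a simplified sketch of the idea rather than what the interpreter literally runs, and the names collected and iterator are my own.

```python
def multiply(num):
    mult = num * 52
    yield mult

    add = mult + 185
    yield add

    subt = add - 76
    yield subt

collected = []
iterator = iter(multiply(6))   # a generator is its own iterator

while True:
    try:
        item = next(iterator)
    except StopIteration:
        break                  # a for-loop does this for us silently
    collected.append(item)

print(collected)
```

This is why the for-loop version never shows you a StopIteration traceback: the exception is the loop’s signal to stop.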

Anonymous Generators

Rather than formally define generators, we can create them using an in-line expression if we need to use the generator once and forget it. Rather like lambda expressions, which are anonymous functions, we can create anonymous generators. The process is similar to writing a one-line list comprehension, except that rather than square brackets, we use parentheses.

We’ll create a generator object in the following code, then call it using the next() command.

numbers = [1, 3, 5, 7, 9, 2, 4, 6, 8]

result = ((x*6)//2 for x in numbers) 

print(result, '\n')

print(next(result))
print(next(result))
print(next(result))
print(next(result))

# Result

<generator object <genexpr> at 0x000001F6C9E7B9E0> 

3
9
15
21

Note that you can also pass anonymous generators to functions.

numbers = [1, 3, 5, 7, 9, 2, 4, 6, 8]

print(max((x*6)//2 for x in numbers))

# Result

27

In Summary

Generators are functions that return their values one at a time. We achieve this step-by-step iteration using the yield statement, which ‘pauses’ the generator function until the next() function calls for the subsequent value.


Generators will only iterate once and in one direction; you cannot back up in the process to access earlier values. Once a generator has concluded, you need to create a new generator object should you wish to iterate over it again.


Unlike normal functions that build their whole result up front, generators are highly memory-efficient, particularly when used with large data sets, as they only evaluate individual values when called for.


We often use generators in for-loops, which stop automatically when the generator is exhausted and so handle the StopIteration exception for us.


We can create anonymous generators in-line, using round brackets, where a one-off use does not warrant a full definition.


Generators are an easy and concise method of creating an iterator, rather than creating a class and using the __iter__() and __next__() methods.
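To make that last point concrete, here is a sketch of the same counter written both ways. The names CountUp and count_up are hypothetical and exist only for this comparison.

```python
class CountUp:
    """Iterator written by hand with __iter__() and __next__()."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration   # we must signal the end ourselves
        self.current += 1
        return self.current


def count_up(limit):
    """The same behaviour written as a generator function."""
    current = 0
    while current < limit:
        current += 1
        yield current             # state is kept for us between calls


from_class = list(CountUp(3))
from_generator = list(count_up(3))

print(from_class, from_generator)
```

Both produce the same values, but the generator version leaves the bookkeeping and the StopIteration signal to Python.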

I trust this article was helpful in understanding what generators are, where we’d use them, and the value they provide. Thanks for reading.