If there is one subject in Python that creates confusion, it is that of generators. Generators are functions, but they have several differences from the ordinary functions you and I use daily. Today, we’ll be taking a gentle immersion into the world of generators to understand what they are, how they differ from ordinary functions, and why or when we might use them.
This next point is essential to understanding the power of generators versus ordinary functions. A normal function runs its entire body before returning a result. We call it, it carries out a task or set of tasks, and then it returns the function's output. Once the 'return' statement is executed, the function terminates, its local memory is cleared, and the variables used inside it are forgotten.
    def multiply(num):
        total = num * 52
        return total

    print(multiply(6))

    # Result
    312
In the above code, the multiply() function is called, executes the equation, returns the result, and it's all over. If I try to print the variable 'total' after executing this function, I'll get an error message. The function has done its work, returned the data, and there is nothing left to query.
    def multiply(num):
        total = num * 52
        return total

    print(multiply(6))
    print(total)

    # Result
    312
    Traceback (most recent call last):
      File "C:\Users\David\Desktop\Upwork Platform\Generators\OrdFunction.py", line 8, in <module>
        print(total)
    NameError: name 'total' is not defined
A generator, by contrast, is a function that returns a generator object when called, which we can then process one item at a time. To do this, we use a couple of specific commands. Let's look at them.
The Yield and Next Statements
In Python, yield is a statement that returns data from a function without terminating the function and without forgetting the variables. Think of yield as a bit like a pause button. It pauses the function, passes the data, and then waits. When you ‘unpause’ the function, it will continue from where it left off.
So here is the first distinction between generator functions and standard functions. For a function to be a generator, there must be at least one 'yield' statement. There may be more than one yield statement, and there may also be return statements. Yet, without at least one yield statement, it's not a generator.
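To make the distinction concrete, here is a minimal sketch. The names `plain` and `countdown` are hypothetical, chosen purely for illustration; `inspect.isgeneratorfunction` comes from the standard library.

```python
import inspect

def plain(num):
    # Ordinary function: runs to completion and hands back one value.
    return num * 52

def countdown(start):
    # The presence of 'yield' makes this a generator function.
    while start > 0:
        yield start
        start -= 1
    return  # a bare return simply ends the generator

print(inspect.isgeneratorfunction(plain))      # False
print(inspect.isgeneratorfunction(countdown))  # True

gen = countdown(3)          # nothing in the body has run yet
print(type(gen).__name__)   # generator
print(list(gen))            # [3, 2, 1]
```

Note that calling `countdown(3)` does not run any of the function body; it only hands back a generator object that is ready to be stepped through.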
So how do you unpause the function? That's where we need to understand the next() function. The next() function is the unpause button we were speaking of previously. Here's some code to show how it works.
    def multiply(num):
        mult = num * 52
        yield mult

        add = mult + 185
        yield add

        subt = add - 76
        yield subt

    test = multiply(6)

    print(next(test))
    print(next(test))
    print(next(test))

    # Result
    312
    497
    421
In the previous code, we call the function multiply() and assign the resulting generator object to a variable, 'test'. We then call next() on test, which runs the function until it reaches the first yield, supplies us with the value 312, and then waits. When we unpause the function with the second next(), it starts up where it left off, with all information still available to it, evaluates the next code, and pauses at the second yield, where it supplies us with the value 497. The third and final next() will supply us with 421, the data held by subt.
Now, what would happen if we call a fourth next() even when we know there are no other values to be returned?
    ...
    print(next(test))
    print(next(test))
    print(next(test))
    print(next(test))

    # Result
      File "C:\Users\David\Desktop\Upwork Platform\Generators\GeneratorsEx1.py", line 17, in <module>
        print(next(test))
    StopIteration
    312
    497
    421
The process of returning the values is a one-way street; once you run out of values, you'll get the 'StopIteration' exception, and Python will return no other values.
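If running past the end is a possibility, there are a couple of common ways to cope. The sketch below reuses the multiply() generator from above and shows two options: catching StopIteration, or passing a default value as the second argument to next(). The "nothing left" string is an arbitrary placeholder chosen for illustration.

```python
def multiply(num):
    mult = num * 52
    yield mult
    add = mult + 185
    yield add
    subt = add - 76
    yield subt

test = multiply(6)
values = [next(test), next(test), next(test)]
print(values)  # [312, 497, 421]

# Option 1: catch the exception explicitly.
try:
    next(test)
except StopIteration:
    print("The generator is exhausted")

# Option 2: give next() a default to return instead of raising.
print(next(test, "nothing left"))  # nothing left

# A fresh call creates a new generator that starts from the top.
print(next(multiply(6)))  # 312
```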
Let’s Learn Some Jargon – Iterable
When you start to learn about generators, the first hurdle you hit is a sentence like the one quoted below, which is an experienced coder's response to a new coder wanting a simple explanation for the word 'iterable'.
“An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.”
Yep. Clear as mud. Thanks for that. Glad I asked.
So to understand clearly, we will begin by learning four words: Iterate, Iteration, Iterator, and Iterable.
- Iterate: To iterate something is to repeat something. So to iterate is to repeat a process, task, or instruction. Iterate is a verb.
- Iteration: This is the process you carry out when repeating something over and over again. Iteration is what you are doing when you iterate. Iteration is a noun.
- Iterator: In Python, an iterator is an object that is applied to a collection of data and returns one element at a time during the iteration process.
- Iterable: An iterable is a collection of elements. By definition, it is something able to be iterated over: an object capable of returning its elements one at a time. A list in Python is iterable.
So to summarise: an iterator iterates through an iterable in the process of iteration. Clear? Do I need to reiterate? No? Great! Moving on. 🙂
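In code, the four words line up roughly as follows. This is a minimal sketch using a plain list; the variable names are just for illustration.

```python
letters = ["a", "b", "c"]      # an iterable: a collection we can loop over

iterator = iter(letters)       # ask the iterable for an iterator

first = next(iterator)         # each next() call is one step of iteration
second = next(iterator)
third = next(iterator)
print(first, second, third)    # a b c

# The list itself is not an iterator; it only knows how to produce one.
print(hasattr(letters, "__next__"))   # False
print(hasattr(iterator, "__next__"))  # True
```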
Full Article: Iterators, Iterables, and Itertools
What Is The Point Of Generators?
Now that you understand we can pause a function using yield while retaining all the details within the function, we can discuss why we use generators. The power of a generator is that it allows us to evaluate a dataset and call a value only when we need it, making generators extremely useful when iterating, or looping, through an iterable.
A generator is a lazy iterator, which means that when faced with an extensive data collection, rather than load the entire data set into memory, a generator allows each element in the dataset to be evaluated and returned one by one, and only when called. With the size of some datasets we come across, in the worst case we would exceed available memory if we attempted to load the entire thing; in the best case, processing would slow dramatically.
Unlike a function that builds and returns an entire collection, a generator uses considerably less memory, given that it evaluates and produces only one item at a time.
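As a rough illustration of the memory difference, the sketch below compares a list of one million squares with a generator expression producing the same values. Note that sys.getsizeof reports only the size of the container object itself (not the integers it refers to), and the exact figures vary by Python version and platform, so treat the numbers as indicative only.

```python
import sys

n = 1_000_000

squares_list = [x * x for x in range(n)]   # builds every value up front
squares_gen = (x * x for x in range(n))    # builds nothing until asked

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print(f"list:      {list_size:,} bytes")
print(f"generator: {gen_size:,} bytes")

# Both produce the same total when consumed.
print(sum(squares_gen) == sum(squares_list))  # True
```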
Used In Loops
We can easily use generators in for-loops. That is because for-loops abstract iteration by calling next() in the background, and they handle the StopIteration exception for us, ending the loop cleanly when the generator runs out of values. In this code block, we'll run the previous code with a for-loop.
    def multiply(num):
        mult = num * 52
        yield mult

        add = mult + 185
        yield add

        subt = add - 76
        yield subt

    for item in multiply(6):
        print(item)

    # Result
    312
    497
    421
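Behind the scenes, a for-loop does roughly the following. This is a simplified sketch of the idea rather than the interpreter's actual implementation, and it reuses multiply() from above; the `collected` list is added only so the results can be inspected afterwards.

```python
def multiply(num):
    mult = num * 52
    yield mult
    add = mult + 185
    yield add
    subt = add - 76
    yield subt

collected = []
iterator = iter(multiply(6))   # a generator is its own iterator
while True:
    try:
        item = next(iterator)  # ask for the next value
    except StopIteration:
        break                  # no more values: leave the loop quietly
    collected.append(item)
    print(item)
```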
Rather than formally define generators, we can create them using an in-line expression if we need to use the function once and forget it. Rather like lambda expressions, which are anonymous functions, we can create anonymous generators. The process is similar to using a one-liner list comprehension, except that rather than use square bracket notation, we use round brackets.
We'll create a generator object in the following code, then call it using the next() function.
    numbers = [1, 3, 5, 7, 9, 2, 4, 6, 8]

    result = ((x*6)//2 for x in numbers)

    print(result, '\n')

    print(next(result))
    print(next(result))
    print(next(result))
    print(next(result))

    # Result
    <generator object <genexpr> at 0x000001F6C9E7B9E0>

    3
    9
    15
    21
Note that you can also pass anonymous generators to functions.
    numbers = [1, 3, 5, 7, 9, 2, 4, 6, 8]

    print(max((x*6)//2 for x in numbers))

    # Result
    27
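The same pattern works with other built-ins that accept an iterable. Here is a small sketch using the same numbers list; when the generator expression is the only argument, the extra pair of brackets can be dropped.

```python
numbers = [1, 3, 5, 7, 9, 2, 4, 6, 8]

total = sum((x * 6) // 2 for x in numbers)
print(total)  # 135

# any() stops pulling values as soon as one is true.
has_large = any((x * 6) // 2 > 20 for x in numbers)
print(has_large)  # True

smallest_even = min(x for x in numbers if x % 2 == 0)
print(smallest_even)  # 2
```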
We achieve this step-by-step iteration using the yield statement, which 'pauses' the generator function until the next() function calls for the subsequent data.
Generators will only iterate once and in one direction; you cannot back up in the process to access earlier values. Once a generator has concluded, you need to create a new generator object should you wish to iterate over the values again.
Unlike normal functions that build an entire result in memory, generators are highly memory-efficient, particularly when used with large data sets, as they only load and evaluate individual values once called.
We often use generators in loops where specific conditions terminate the calls, avoiding the StopIteration exception.
We can create anonymous generators in-line, using round brackets, where a one-off use precludes full definition.
Generators are an easy and concise method of creating an iterator, rather than creating a class and writing the iterator methods by hand.
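For comparison, here is a hedged sketch of the same iterator written both ways. The class CountUpTo and the function count_up_to are hypothetical names invented for this example; the class relies on Python's standard iterator protocol methods, `__iter__` and `__next__`.

```python
class CountUpTo:
    """Iterator class: state and stopping logic are managed by hand."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current


def count_up_to(limit):
    """Generator: Python keeps track of the state between yields."""
    current = 0
    while current < limit:
        current += 1
        yield current


print(list(CountUpTo(3)))    # [1, 2, 3]
print(list(count_up_to(3)))  # [1, 2, 3]
```

Both versions behave the same in a loop; the generator simply lets Python handle the bookkeeping that the class spells out.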
I trust this article was helpful in understanding what generators are, where we'd use them, and the value they provide. Thanks for reading.
David is a Python programmer and a technical writer creating in-depth articles for readers wanting uncomplicated explanations for topics made difficult by industry jargon. Also a woodworker, metalworker, landscape photographer, and pilot, he is freelance after 42 years in the corporate world. He has an MBA in Technology.