Python LangChain Course 🐍🦜🔗 Summarizing Long Texts Using LangChain (1/6)

Welcome to this tutorial series on LangChain. My name is Dirk van Meerveld, and it is my pleasure to be your host and guide for this tutorial series!

We’re going to be using LangChain to further improve our ChatGPT superpowers and explore more cool ways in which we can put these powerful technologies to practical use.

This tutorial series assumes a basic knowledge of ChatGPT and how to use it. It also helps to be familiar with ChatGPT function calls and embeddings, so that you really understand what is going on under the hood. While we will cover these at a basic level as they come up, things may seem a bit magical and abstract if you don’t fully understand them.

💡 Note: For basic ChatGPT, you can check out my ‘Giggle Search’ tutorial series (which also teaches Django), and for function calls and embeddings, you can check out my ‘ChatGPT function calls and embeddings’ tutorial series, both available here on the Finxter Academy.

So, What is LangChain?

LangChain is a library that makes it easier to work with large language models (LLMs).

Anything we can do with LangChain, we can technically also do without it. It’s just that it makes our lives a lot easier. LangChain helps us set up

  • our prompts,
  • conversation history,
  • reusable chains of LLM calls, and
  • even agents that use the LLM to reason and make decisions.

Another advantage is that our code will not depend on a specific LLM, so in most cases we can switch to a different LLM with few or no changes to the code.

LangChain also provides a large number of tools to load data (such as document loaders), transform data (such as text splitters), and store data (for example, in a vector database).

It also contains loads of tools that we can give to our LLM agent for the LLM to use, kind of like function calls. An example would be to give our LLM a Wikipedia tool and then our LLM agent will be able to search Wikipedia for more information on any subject if it needs such information to adequately answer our questions.

💡 Note: You can watch the full course video right here on the blog. I’ll embed the video below each of the other parts as well. If you want the step-by-step course with code and a downloadable PDF course certificate to show your employer or freelancing clients, follow this link to learn more.

Let’s Get Started with a Simple Example

We can talk a lot about what LangChain is in theory, but it’s much better to get a feel for it by actually using it! 🧑‍💻

So let’s get into the practical examples and as you go through each tutorial section you will get a better feel for what LangChain is and how to use it.

Before we get into the challenging examples, let’s do a very basic LangChain LLM call just to get our feet wet and get familiar with LangChain. Create a new project folder and open it in your favorite code editor (I’ll be using VS Code throughout this series). Then create the folder for part 1 inside.

I’m going to name it ‘1_Summarizing_long_texts’ as we’re going to be focusing on summarizing in a little bit.

Inside this folder create a new Python file named ‘1_basic_call.py’. My folder structure will look as follows:

📁Finx_LangChain (root project folder)
    📁1_Summarizing_long_texts
        📄1_basic_call.py

As we will be covering different subjects I will not combine our modules as a single project but keep them in separate folders, so you can easily refer back to any parts later if you need. Naturally, this is not a good practice for general software development, so keep that in mind.

Now first open a new terminal window and run the following command to install the needed libraries/dependencies:

pip install openai tiktoken langchain python-decouple

You’ll probably have openai already installed for working with the openai APIs. If not, follow the Finxter blog here:

🧑‍💻 Recommended: How to Install OpenAI API?

Tiktoken will help us calculate the number of tokens later, and the langchain library is the focus of this whole tutorial series!

Python-decouple will allow us to load our API keys from a .env file.

A ChatGPT call in LangChain

Now open your 1_basic_call.py file and let’s get started by adding the following two imports up top:

from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI

First, we import the ChatPromptTemplate class from the LangChain library, which will allow us to create the prompt we want to feed to ChatGPT (or another LLM). Secondly, we import ChatOpenAI from the LangChain library, as we will be using ChatGPT for our calls in this course.

To use this class we need to have the openai library installed (it was in the install command earlier).

Back to our 1_basic_call.py file. Extend the code as follows:

from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Please tell me which foods are famous in {place}"
)
model = ChatOpenAI()
chain = prompt | model

First, we define the prompt we will give to LangChain.

For this, we will use the ChatPromptTemplate.from_template method, and feed it a simple string for now. Notice the {place} in curly braces. This is a placeholder where we can insert variable input to dynamically change our prompt on each call.
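
To make the placeholder idea concrete, here is a minimal sketch that uses plain Python string formatting instead of LangChain. It only illustrates the substitution step; ChatPromptTemplate additionally wraps the result in a chat message. The fill_template helper is a made-up name for this illustration.

```python
# Plain-Python illustration of a prompt template (not LangChain's implementation).
template = "Please tell me which foods are famous in {place}"


def fill_template(template: str, **variables: str) -> str:
    """Substitute each {name} placeholder with the matching keyword argument."""
    return template.format(**variables)


# The same template produces a different prompt on each call.
prompt_for_france = fill_template(template, place="France")
prompt_for_japan = fill_template(template, place="Japan")
print(prompt_for_france)
print(prompt_for_japan)
```

The template stays the same while only the variable changes, which is exactly what lets us reuse one prompt for many inputs.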

We set the model to ChatOpenAI. LangChain can work with many different LLMs, so we need to specify which one we want to use.

Then we create our first simple ‘chain‘ by using the | pipe operator to connect our prompt to our model. Think of it as piping the output of the first variable as input into the second, like Bash scripting, connecting them together in a chain.

These chains are a major concept in LangChain and will keep coming up in the future.
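
If you are wondering how a | between two Python objects can build a chain at all, here is a toy sketch. The Step class and the fake model are hypothetical stand-ins written for this illustration, not LangChain's actual classes, but they show the general idea of overloading the | operator.

```python
# Toy illustration of chaining with "|" (hypothetical classes, not LangChain's).
class Step:
    """Wraps a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


fill_prompt = Step(lambda inputs: f"Which foods are famous in {inputs['place']}?")
fake_model = Step(lambda prompt: f"[model reply to: {prompt}]")

toy_chain = fill_prompt | fake_model
toy_result = toy_chain.invoke({"place": "France"})
print(toy_result)
```

Python calls __or__ whenever it sees |, so any class can decide what "piping" means for its objects.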

Setting Our API Key

Before we start running any code we must provide our openai API key, as LangChain will call the openai library under the hood, which will, of course, need our API key to make requests to ChatGPT on our behalf.

Feel free to use an environment variable named OPENAI_API_KEY if you’re familiar with this, and the openai library will pick up on it automatically in the background.

As environment variables can be a bit tricky with differences in setup for different systems and OS combinations, and I don’t want anybody to get stuck in endless system-specific debugging, I’ll take a simple approach using the python-decouple package that will work universally and still keep our API key out of our source code. We already installed it in the install command earlier.

Now create a .env file in the base directory. This is simply a file named '.env', with no name before the dot. Add the following line to it:

OPENAI_API_KEY=yoursuperdupersecretapikeygoeshere

Make sure to insert your own API key from Openai.com, or sign up for an account if you do not have one yet. Also, do not use any spaces as it will not work for .env files.

💡 Tip: If you're using Git on this project, make sure to add '.env' to your .gitignore file so you don't commit your API key to your repository. If you're not yet familiar with Git/version control, don't worry about this message.

Make sure the .env file is saved in the base directory with your folder structure as follows:

📁Finx_LangChain
    📁1_Summarizing_long_texts
        📄1_basic_call.py
    📄.env

Making our first call

Now we can finish up our 1_basic_call.py file as follows:

from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from decouple import config


prompt = ChatPromptTemplate.from_template(
    "Please tell me which foods are famous in {place}"
)
model = ChatOpenAI(openai_api_key=config("OPENAI_API_KEY"))
chain = prompt | model

All we added was an import of config from the decouple package. When defining the model variable, we pass in our API key as an argument by calling the config function with the name of our variable in the .env file as a string.

Our chain variable is now a ‘RunnableSequence’ object returned to us by LangChain. We can run this sequence by calling its .invoke method, adding the following lines below our code:

result = chain.invoke({"place": "France"})
print(result.content)

Notice that we pass in a dictionary that has the keys that correspond with the {place} we had in our template. Now, if you go ahead and run the file in your terminal, you will get something along the following lines:

France is known for its rich culinary traditions and has several famous foods. Some of the most renowned dishes and food items in France include:

1. Baguette: The iconic French bread, characterized by its long, thin shape and crispy crust.
2. Croissant: A buttery, flaky pastry often enjoyed for breakfast or as a snack.
3. Escargots de Bourgogne: Snails cooked in garlic butter and typically served as an appetizer.
4. Coq au Vin: A classic French dish made with chicken, red wine, mushrooms, onions, and bacon.
5. Ratatouille: A flavorful vegetable stew consisting of eggplant, zucchini, bell peppers, tomatoes, onions, and herbs.
6. Bouillabaisse: A traditional Provençal fish stew made with various types of fish, shellfish, vegetables, and aromatic herbs.
7. Foie Gras: A luxury food product made from the liver of a specially fattened duck or goose.
8. Quiche Lorraine: A savory pie filled with a mixture of eggs, cream, cheese, and bacon or ham.
9. Crêpes: Thin pancakes that can be filled with sweet or savory ingredients, such as Nutella, fruits, cheese, or ham.
10. Macarons: Delicate, colorful, and almond-based cookies with a sweet filling, typically made in various flavors.

These are just a few examples, and France has an extensive culinary repertoire with numerous regional specialties.

Summarizing a Large Piece of Text – the Challenge

So now that we made a simple call, let’s look at a real use case for LangChain. We’re going to be using ChatGPT for summarizing a large piece of text.

So, the problem with ChatGPT here is that it has a context limit. This context is the maximum total number of tokens (each token is basically a couple of text characters) that a single API call can handle, counting both the prompt we send and the reply we get back, and it differs between models. This limits the amount of data we can send to ChatGPT in a single request. We can’t just send a whole book and request a summary, for instance.

For normal ChatGPT this limit is 4096 tokens, but there are special GPT-4 8k and 32k context versions out there, and also a 3.5 turbo 16k model.

The problem is still the same, though: if we need to summarize a huge piece of text, eventually, we will run out of context ‘space’ to send all these tokens. For this tutorial series, we will be using GPT-3.5 Turbo, so everyone can follow along without needing access to GPT-4.

We’ll also stick to the 4k context model for the purposes of explanation, as exceeding its limit takes far fewer tokens than exceeding the limit of the 16k model. To demonstrate the problem, we could send 80,000 tokens to a model with a 16,000-token limit, or send 6,000 tokens to a model with a 4,000-token limit.

The principle is precisely the same, so we’ll be doing the latter in this tutorial to save you some tokens and make this tutorial extremely cheap to follow in terms of API costs.
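
As a rough illustration of the arithmetic involved, the sketch below checks whether a prompt fits and how many chunks a text needs at minimum. The helper names, the 500-token reply reserve, and the chunk sizes are assumptions made for this example; only the 4,000/6,000 and 16,000/80,000 figures come from the text above.

```python
import math


def fits_in_context(prompt_tokens: int, context_limit: int, reply_reserve: int = 0) -> bool:
    """True if the prompt plus room reserved for the reply stays within the limit."""
    return prompt_tokens + reply_reserve <= context_limit


def min_chunks(total_tokens: int, chunk_size: int) -> int:
    """Smallest number of chunks needed if each holds at most chunk_size tokens."""
    return math.ceil(total_tokens / chunk_size)


# 6,000 tokens do not fit a 4,000-token limit, but two 3,000-token chunks do.
too_big = fits_in_context(6000, 4000)
chunk_ok = fits_in_context(3000, 4000, reply_reserve=500)
chunks_small_model = min_chunks(6000, 3000)
chunks_large_model = min_chunks(80000, 12000)
print(too_big, chunk_ok, chunks_small_model, chunks_large_model)
```

Both scenarios lead to the same conclusion: the text has to be cut up before it can be sent.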

So, what do we do if we want a summary of a document larger than our context limit? We cannot just send the whole document and ask for a summary, as we’ll get a token limit error:

⚠️openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4175 tokens. Please reduce the length of the messages.

So what we need to do is split up the document into smaller chunks and send these chunks to ChatGPT one by one. Then, we can take all the summaries and combine them into one big summary.

👉 Example: Say we make 5 calls with 4000 tokens each, splitting the text into 5 parts and asking for a summary each time. We save all 5 small summaries in memory, and then we make a final ChatGPT call number 6, where we send all these summaries back to ChatGPT again and ask for a final summary of all the summaries it has made for us!

Now, while we can program all of this manually in a chain of ChatGPT calls that will handle the sub-summaries, save the results, and send the texts for the final summary, this is a lot of work.

We’d also have to account for edge cases where all the summaries combined are still larger than the context limit, and things get pretty complex.

LangChain is going to make our lives a lot easier here. Defining these kinds of ‘chains’ of LLM calls is exactly what LangChain is good at!

💡 LangChain names this strategy Map-reduce as it will first ‘map’ over all our smaller text chunks individually and then ‘reduce’ the resulting summaries into a single final summary result.
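
Before we build the real thing, here is the map-reduce idea in miniature, in plain Python. The fake_summarize function is a stand-in for an LLM call (an assumption of this sketch); it simply keeps the first sentence so the example runs without any API key.

```python
# Map-reduce summarization in miniature. fake_summarize is a stand-in for an
# LLM call (an assumption for this sketch); it just keeps the first sentence.
def fake_summarize(text: str) -> str:
    return text.split(".")[0].strip() + "."


def map_reduce_summary(chunks: list[str]) -> dict:
    # Map: summarize every chunk on its own.
    partial_summaries = [fake_summarize(chunk) for chunk in chunks]
    # Reduce: combine the partial summaries and summarize them once more.
    combined = "\n".join(partial_summaries)
    return {"partial": partial_summaries, "final": fake_summarize(combined)}


chunks = [
    "Part one opens the speech. It then thanks the audience.",
    "Part two states the main demand. It gives three reasons.",
]
outcome = map_reduce_summary(chunks)
print(outcome["partial"])
print(outcome["final"])
```

The structure is the point here, not the quality of the fake summaries: one call per chunk, then one more call over the collected results.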

Let’s Do It!

Create a new file in your '1_Summarizing_long_texts' folder and name it '2_summarize_txt.py'.

We’ll get to the code in here later, but before we do, to keep things organized, let’s create two sub-folders inside the ‘1_Summarizing_long_texts‘ folder. One will be named 'prompt_templates' and one will be named 'utils'.

This will just keep our main code more readable and easier to explain by not cramming everything into one file.

📁Finx_LangChain
    📁1_Summarizing_long_texts
        📁prompt_templates
        📁utils
        📄1_basic_call.py
        📄2_summarize_txt.py
    📄.env

Creating our Prompt Templates

Let’s first get our prompt templates out of the way. Inside the prompt_templates folder, make a file called prompts.py. Inside this new prompts.py file, we’ll set up the two prompts we need.

map_prompt = """
The following is a part of a speech
{text_chunk}
Based on this particular part of the speech, please write a concise summary
Helpful Answer:
"""

This prompt will be used for each part of the cut-up text. I’ve changed the prompt to say 'speech' instead of text as I’ll use a long speech for this example. If you want a more generic version, you could use text instead.

So this is the prompt we’ll feed ChatGPT every time we give it a part of the text to summarize. Notice the {text_chunk} in the middle, which is again where we insert our variable, a different piece of text for each call.

Now, let’s add the reduce template below:

reduce_prompt = """The following is a set of summaries:
{text_summaries}
Take these and combine them into a final, consolidated summary of the full speech.
Helpful Answer:
"""

This is the prompt we’ll use for the final call to ChatGPT, where we’ll feed it all the summaries it has made for us and ask it to combine them into a single summary.

👉 Notice the {text_summaries}, which is where we insert our variable containing the summaries.

Go ahead and save and close this file. Then, still inside the prompt_templates folder, make a file called __init__.py. Just keep it as an empty file and save it for now. This will make sure Python recognizes this folder as a package.

The prompt_templates folder looks as follows:

📁prompt_templates
    📄__init__.py
    📄prompts.py

Creating a Token Counter Utility Function

Now go to the utils folder and create a file called utils.py. This is where we’ll put our utility functions that will help us with our summarization to keep the main code clean and uncluttered.

Inside your utils.py file add the following import and setup:

import tiktoken

ℹ️ Info: Tiktoken is an open-source tokenizer developed by OpenAI that is used for tokenizing text in their LLMs. Without going into too much detail, it basically just creates small groups of characters for the model to understand.

Write the following simple function:

def get_tokens(text: str) -> int:
    """Get the number of tokens in a text by running a tiktoken encoder on it."""
    encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
    tokens = encoding.encode(text)
    return len(tokens)

The function uses the tiktoken library to load the tokenizer for the "gpt-3.5-turbo" model. The tokenizer can be different for each model, so we need to make sure we use the correct one. The tiktoken.encoding_for_model function is called to retrieve the encoding for this specific model.

Then, the encode() method is used on the obtained encoding object to tokenize the input text. The tokens represent small groups of characters that the model can understand.

Finally, the length of the token list is calculated using the len() function and returned as the result of the get_tokens function.

As you can see, we won’t actually use this to tokenize our text but only to return the number of tokens. OpenAI will take care of tokenizing for us when we make a request. We created this function to get more insight into how many tokens a piece of text contains before we potentially try to send it.

Go ahead and save the utils.py file and close it. Then, just like before, create an empty file called __init__.py in the utils folder to make it a package. The utils folder looks as follows:

📁utils
    📄__init__.py
    📄utils.py

There is one last piece of preparation we need: a large text to summarize.

I’m going to be using the last speech by Dr. Martin Luther King Jr. You can find it here: https://www.neil.blog/full-speech-transcript/ive-been-to-the-mountaintop-by-dr-martin-luther-king-jr.

You can also use something else if it’s no longer available or you have another piece of text you want to use.

Copy the speech transcript and place it inside a .txt file in a folder named data like so:

📁Finx_LangChain
    📁1_Summarizing_long_texts
        📁data
            📄speech.txt
        📁prompt_templates
        📁utils
        📄1_basic_call.py
        📄2_summarize_txt.py
    📄.env

Creating our Main File

Let’s get back to our '2_summarize_txt.py' file inside the '1_Summarizing_long_texts' folder. Inside our 2_summarize_txt.py file, first, we’ll import the needed libraries and classes:

from decouple import config
from langchain.chains import (
    LLMChain,
    MapReduceDocumentsChain,
    ReduceDocumentsChain,
    StuffDocumentsChain,
)
from langchain.chat_models import ChatOpenAI
from langchain.docstore.document import Document
from langchain.document_loaders import TextLoader
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter

This may look overwhelming at first but it’s actually pretty simple.

  • Decouple/config is just the .env API key reader we used before.
  • The langchain.chains imports are just LLM API calls with a bit of functionality added on to them. You’ll see more details later.
  • ChatOpenAI is just a ChatGPT chat call.
  • Document is just a data type that LangChain uses, which holds a piece of text and some metadata.
  • TextLoader is a tool to load text from a file.
  • PromptTemplate lets us load the templates we defined earlier.
  • CharacterTextSplitter is a tool to split our text into smaller chunks.

Most of these tools do exactly as the name suggests!

Now, let’s add the imports for our prompt templates and utils:

from prompt_templates.prompts import map_prompt, reduce_prompt
from utils.utils import get_tokens

Notice how the prompt_templates.prompts and utils.utils imports read very awkwardly. Let’s fix that!

Go to your prompt_templates folder and open the "__init__.py" file. Add the following lines to it:

from .prompts import map_prompt, reduce_prompt

Save and close and now in the utils folder open the "__init__.py" file and add the following lines:

from .utils import get_tokens

Save and close. Now go back to your '2_summarize_txt.py' file and change the last two imports to the following:

from prompt_templates import map_prompt, reduce_prompt
from utils import get_tokens

This is much nicer to read!

It works because importing from a package runs the __init__.py file in that folder. Inside this file, we imported the prompts and the utility function, so those names are available directly on the package, allowing us to use a simpler and more sensible import path.
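
If you would like to see this re-export trick in isolation, the sketch below builds a throwaway package in a temporary folder and imports from it both ways. The demo_pkg name and its contents are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk to show the re-export trick.
# "demo_pkg" and its contents are hypothetical names used only for this demo.
root = Path(tempfile.mkdtemp())
package = root / "demo_pkg"
package.mkdir()
(package / "prompts.py").write_text('map_prompt = "Summarize: {text_chunk}"\n')
# The __init__.py pulls the name up to package level.
(package / "__init__.py").write_text("from .prompts import map_prompt\n")

# Make the temporary folder importable, then import both ways.
sys.path.insert(0, str(root))
short_import = importlib.import_module("demo_pkg").map_prompt
long_import = importlib.import_module("demo_pkg.prompts").map_prompt

# Both spellings resolve to the very same object.
print(short_import is long_import)
```

The short path is purely a convenience; the object still lives in prompts.py.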

Our imports now look like this:

from decouple import config
from langchain.chains import (
    LLMChain,
    MapReduceDocumentsChain,
    ReduceDocumentsChain,
    StuffDocumentsChain,
)
from langchain.chat_models import ChatOpenAI
from langchain.docstore.document import Document
from langchain.document_loaders import TextLoader
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from prompt_templates import map_prompt, reduce_prompt
from utils import get_tokens

First, let’s prepare our model:

model = ChatOpenAI(
    temperature=0,
    model_name="gpt-3.5-turbo",
    openai_api_key=config("OPENAI_API_KEY"),
)

We set the temperature to 0 to make the results as consistent as possible between runs, and we set the model to the GPT-3.5 Turbo model. We also set the openai_api_key to our API key from the .env file using the config function.

Loading and Splitting Text into LangChain Documents

Next, we’ll create a quick function to load text in our '2_summarize_txt.py' file:

def load_text(path: str) -> list[Document]:
    text = TextLoader(path).load()
    print(f"Loaded text contains {get_tokens(text[0].page_content)} tokens")
    return text

We create a function called load_text that takes a path as a string and returns a Python list containing Document objects (we’ll look at Document objects in a second).

We use our imported TextLoader, giving it the path to load.

We then use an f-string to print the number of tokens in the text we just loaded, using the get_tokens function we defined in utils. The ‘text’ variable is a list, so we select index 0, which holds all the text in a single Document object, and pass its page_content to get_tokens.

Finally, we simply return the list of Document objects.

💡 If your file is in a particular encoding you might need to add an encoding argument to the TextLoader call, for example: TextLoader(path, encoding="utf8").load()

So what exactly do we get so far? Let’s have a look at this object. Add the following line below the load_text function:

speech: list[Document] = load_text("data/speech.txt")
print(speech)

We run our load_text function, passing in the path to the data/speech.txt we prepared.

We catch this in a variable named speech, which is a list of Document objects, which we then print.

Now go ahead and run this file.

ℹ️ Note: Make sure you cd into your …/Finxter_Work/Finx_LangChain_series1/1_Summarizing_long_texts directory before you run it, so Python can find your text file!

As you can see, our speech is a list with a single LangChain Document object inside of it, holding the entire speech as one huge string.

The Document data structure printed to your console is fairly simple and looks roughly like this:

[Document(page_content='a very huge text here...', metadata={'source': 'data/speech.txt'})]
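
To get a feel for this shape without LangChain, here is a tiny stand-in. SimpleDocument and load_text_sketch are hypothetical names for this illustration; the real Document class has more to it than these two fields.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for LangChain's Document, reduced to the two fields
# visible in the printed output.
@dataclass
class SimpleDocument:
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text_sketch(content: str, source: str) -> list[SimpleDocument]:
    """Wrap raw text in a single-item list, mirroring the loader's return shape."""
    return [SimpleDocument(page_content=content, metadata={"source": source})]


docs = load_text_sketch("a very huge text here...", "data/speech.txt")
print(len(docs), docs[0].metadata["source"])
```

This is why we index with [0] before reading page_content: the text lives inside the first (and only) item of the list.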

We can also see the speech is a total of 5351 tokens, courtesy of the get_tokens utility we called. Again, we could just switch to the 16k context version of GPT-3.5 Turbo, but then we’d have to use a much larger text to demonstrate the same point.

In order to save you tokens, we’ll be using the 4k context model with a 5.3k token speech, but the same applies to summarizing a 30k token speech using a 16k context model.

The string in our Document object is too long for ChatGPT to handle in one API call, so we need a way to split up our list with a single Document object into a list with multiple Document objects.

Let’s define a second function that takes our list with a Document object as input and returns a split-up list with multiple Document objects as output:

def split_text(text: list[Document]) -> list[Document]:
    text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=3000)
    return text_splitter.split_documents(text)

We define split_text, a function that takes a list of Documents as input and returns a list of Documents as output. Inside, we create a text_splitter variable holding a LangChain CharacterTextSplitter, built with the from_tiktoken_encoder method, which takes a chunk_size as an argument.

The chunk_size is the target size, measured in tokens, of each chunk we want to split our text into. It should be below the context limit of the model we’re using, leaving room for the prompt and the reply.

We then return the result of the text_splitter.split_documents method, passing in our text variable.
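
To show the general idea behind this kind of splitting, here is a simplified sketch in plain Python. It packs paragraphs greedily into chunks and counts whitespace-separated words instead of tokens, which is an assumption made to keep the example dependency-free; the real splitter has more options than this.

```python
# Simplified chunking sketch: whitespace-separated words stand in for tokens
# (an assumption; the splitter in our code counts tiktoken tokens).
def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Greedily pack paragraphs into chunks of at most chunk_size words."""
    chunks, current, current_len = [], [], 0
    for paragraph in text.split("\n\n"):
        length = len(paragraph.split())
        # Start a new chunk when adding this paragraph would overflow.
        # A single paragraph longer than chunk_size is kept whole in this sketch.
        if current and current_len + length > chunk_size:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(paragraph)
        current_len += length
    if current:
        chunks.append("\n\n".join(current))
    return chunks


text = "one two three\n\nfour five\n\nsix seven eight nine\n\nten"
pieces = split_into_chunks(text, chunk_size=5)
print(pieces)
```

Splitting on paragraph boundaries rather than at an exact count keeps each chunk readable, at the cost of chunks that are not all the same size.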

Below this, add the following to call the function we just created:

speech_chunks: list[Document] = split_text(speech)

We now have speech_chunks, which is still a list of Documents, but this time we have multiple Documents, each holding a portion of the speech small enough to send to ChatGPT.

It looks roughly like this (you can add a print statement and run the file if you’re curious):

[
    Document(page_content="text chunk 1...", metadata={"source": "data/speech.txt"}),
    Document(page_content="text chunk 2...", metadata={"source": "data/speech.txt"}),
    Document(page_content="text chunk 3...", metadata={"source": "data/speech.txt"}),
]

Setting Up Our LangChain Chains

Now let’s set up our prompt template for the mapping step:

map_prompt = PromptTemplate.from_template(map_prompt)
single_map_call = LLMChain(llm=model, prompt=map_prompt)

LangChain uses the PromptTemplate class to load/define prompts.

We simply feed it the map_prompt we defined earlier, which asks for a summary of a single text chunk.

We then create a chain using the LLMChain class, passing in the ChatGPT model and the prompt we just defined. This single_map_call now is a LangChain chain that, when invoked, will take a single piece of text, feed it into the template, and then make a ChatGPT call to summarize it.

That’s our map step all set up!

Now for the trickier part, the reduce chain, which has a couple more layers to it. In the reduce chain, we will have a whole bunch of summaries, each of a part of the text, and we’ll have to combine all these summaries, reducing them to a single summary.

reduce_prompt = PromptTemplate.from_template(reduce_prompt)
single_reduce_call = LLMChain(llm=model, prompt=reduce_prompt)

We start out exactly the same, defining our PromptTemplate and creating a chain using the LLMChain class, that when invoked will do a single call using our prompt template which asks for a single summary of the summaries provided.

The problem is that we’ll have a whole bunch of partial summary documents we need to ‘stuff’ into the prompt template at once.

This is where the StuffDocumentsChain comes in. The StuffDocumentsChain is a chain that combines documents by stuffing into context. It will take smaller documents and combine them back into one bigger document.

We sort of wrap our single_reduce_call inside a stuff chain by continuing our code below like this:

single_stuff_and_reduce_call = StuffDocumentsChain(
    llm_chain=single_reduce_call, document_variable_name="text_summaries"
)

This is still just our single_reduce_call, but LangChain will stuff the small separate summaries into one prompt for us. Note the document_variable_name argument.

If you check the reduce_prompt template we wrote in the prompt_templates folder, you’ll see we have {text_summaries} in there. This is the variable name we tell StuffDocumentsChain to use, as it marks the gap in the prompt template into which the documents must be stuffed.

We now have a chain that will stuff our documents together and then make a call using the single_reduce_call to ask for a summary of all the summaries provided.
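
Conceptually, ‘stuffing’ boils down to joining texts and dropping them into one placeholder. The sketch below shows that in plain Python; the stuff_summaries helper and the blank-line separator are assumptions of this illustration rather than a description of LangChain's internals.

```python
# "Stuffing" in miniature: join the partial summaries and drop them into the
# {text_summaries} slot. The "\n\n" separator is an assumption of this sketch.
reduce_template = (
    "The following is a set of summaries:\n"
    "{text_summaries}\n"
    "Take these and combine them into a final, consolidated summary of the full speech.\n"
    "Helpful Answer:\n"
)


def stuff_summaries(summaries: list[str], template: str) -> str:
    """Merge all summaries into one string and fill the template with it."""
    joined = "\n\n".join(summaries)
    return template.format(text_summaries=joined)


stuffed_prompt = stuff_summaries(["Summary A.", "Summary B."], reduce_template)
print(stuffed_prompt)
```

The result is one single prompt, which is why the combined size of the summaries matters for the next step.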

But we’re not quite done yet… We need to account for a potential problem here.

What if our text is so large that it got cut into a very large number of chunks? All these summaries add up, and the summaries together may be too large to send to ChatGPT in a single call.

What we need is a chain that will send the summaries in groups that do not exceed the token limit. If the summaries can be sent in a single group, then fine, we’ll be done after a single call.

But if there are too many summaries they should be sent in groups, summarizing each group in turn, and then finally combining the summaries of the summaries with a final call.

Luckily LangChain can help us out here as well!

send_groups_to_single_stuff_and_reduce_call = ReduceDocumentsChain(
    combine_documents_chain=single_stuff_and_reduce_call,
    token_max=3500,
)

The ReduceDocumentsChain does pretty much exactly what we just discussed.

We feed it our single_stuff_and_reduce_call to call for each group of summaries, capping each group at a maximum of 3500 tokens, safely below the context token limit of the ChatGPT version we are using.

If there are multiple groups of summaries, ReduceDocumentsChain will take care of recursively calling our provided chain until all the summaries and summaries of summaries are combined into a single summary.
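
The grouping and repeated collapsing can be sketched in plain Python as follows. Word counts stand in for tokens and fake_reduce stands in for the LLM call; both are assumptions of this illustration, and the function names are made up for it.

```python
# Sketch of grouped, repeated reduction. Word counts stand in for tokens and
# fake_reduce stands in for the LLM call; both are assumptions of this sketch.
def count(text: str) -> int:
    return len(text.split())


def fake_reduce(summaries: list[str]) -> str:
    # Pretend the model condenses a group by keeping each summary's first word.
    return " ".join(s.split()[0] for s in summaries)


def group_by_budget(summaries: list[str], token_max: int) -> list[list[str]]:
    """Pack summaries into consecutive groups that stay within token_max."""
    groups, current, used = [], [], 0
    for summary in summaries:
        size = count(summary)
        if current and used + size > token_max:
            groups.append(current)
            current, used = [], 0
        current.append(summary)
        used += size
    if current:
        groups.append(current)
    return groups


def reduce_all(summaries: list[str], token_max: int) -> str:
    # Collapse group by group until everything fits into one final call.
    while len(summaries) > 1 and sum(count(s) for s in summaries) > token_max:
        groups = group_by_budget(summaries, token_max)
        collapsed = [fake_reduce(group) for group in groups]
        if collapsed == summaries:  # no progress, stop instead of looping forever
            break
        summaries = collapsed
    return fake_reduce(summaries)


summaries = ["alpha one two", "beta three four", "gamma five six", "delta seven eight"]
print(group_by_budget(summaries, token_max=6))
print(reduce_all(summaries, token_max=6))
```

With a budget of 6 ‘tokens’, the four summaries are first collapsed in two groups of two, and only then combined in a final call.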

Note the variable name send_groups_to_single_stuff_and_reduce_call is ridiculously long, but it describes exactly what the chain does, so it’s still better for understanding, especially during this tutorial.

Combining Our Chains into a Single Chain

So now we have a map chain, which will ask for a summary of a single chunk of the whole text, and a chain that will feed groups of summaries and reduce them until there is only a single summary left. All we need to do now is combine these two ‘branches’ into a single powerful entity.

You guessed it, LangChain to the rescue again! We can use the MapReduceDocumentsChain:

map_reduce_chain = MapReduceDocumentsChain(
    llm_chain=single_map_call,
    reduce_documents_chain=send_groups_to_single_stuff_and_reduce_call,
    document_variable_name="text_chunk",
    return_intermediate_steps=True,
)

We first pass our single_map_call as llm_chain, as the map function it can call over and over on each chunk of text to get a summary.

Second is the reduce_documents_chain that will be used once we have a whole list of summaries to reduce.

The document_variable_name is the {text_chunk} hole we left in our map_prompt template where our single_map_call will put the partial text on each summary request.

Finally, return_intermediate_steps will, as the name suggests, allow us to see the intermediate steps and summaries. We’ll set this to True as this will aid our understanding for learning purposes.

As the output will be pretty large, I’ll be printing it not to the console but to a file instead. At the bottom, add:

with open("data/test_output.py", "w") as f:
    print(map_reduce_chain.invoke(speech_chunks), file=f)

This will create a file called test_output.py in your data folder in write mode ("w"), and give it the alias 'f'.

We then print, but pass it the file reference 'f' as the second argument, to write to the file instead of the console. The call in the print statement uses the map_reduce_chain‘s .invoke method and we simply pass the speech_chunks, which is the list of Document objects we prepared beforehand.
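
As an optional variation, the standard library's pprint function can write a nested structure to a file with line breaks, which is easier to read than one very long line. The result dictionary below is a hypothetical stand-in shaped like the chain's output, and the temporary folder is used only so the sketch runs anywhere.

```python
import tempfile
from pathlib import Path
from pprint import pprint

# Hypothetical result dictionary shaped like the chain's output.
result = {
    "intermediate_steps": ["Summary of chunk 1...", "Summary of chunk 2..."],
    "output_text": "Final summary...",
}

output_path = Path(tempfile.mkdtemp()) / "test_output.py"
with open(output_path, "w", encoding="utf-8") as f:
    # pprint accepts a stream argument, so the nested structure is written
    # to the file across several lines instead of one very long line.
    pprint(result, stream=f)

written = output_path.read_text(encoding="utf-8")
print(written)
```

Either approach works; print with file=f is the shortest, while pprint trades one extra import for more readable output.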

Go ahead and run this file (it will take a moment based on the number of calls that need to be made and the length of your total text).

Then browse to your data/test_output.py file and because we set intermediate steps to True you should see something like this:

{
    "input_documents": [
        Document(
            page_content="Long text chunk 1...",
            metadata={"source": "data/speech.txt"},
        ),
        Document(
            page_content="Long text chunk 2...",
            metadata={"source": "data/speech.txt"},
        ),
        Document(
            page_content="Long text chunk 3...",
            metadata={"source": "data/speech.txt"},
        ),
    ],
    "intermediate_steps": [
        "Summary of text chunk 1...",
        "Summary of text chunk 2...",
        "Summary of text chunk 3...",
    ],
    "output_text": "Final summary of all the summaries...",
}

As the whole thing might be a bit confusing, I’ll go over what we just did step by step one more time. If you’ve already fully understood it, feel free to skip ahead a bit.

  1. We load a speech using load_text() and get a list with a single Document object inside of it.
  2. We split that Document object into a list of Document objects using split_text() and get a list with 3 Document objects inside of it.
  3. We feed this list of separate chunks to our map_reduce_chain MapReduceDocumentsChain, which will start with the mapping step, calling our single map call for every chunk of text.
  4. The single map call will simply ask for a summary for a single chunk of text each time, mapping what was a list of text chunks into a list of summaries of those chunks.
  5. After we have a whole bunch of small documents containing summaries of the chunks, our map_reduce_chain MapReduceDocumentsChain will feed into the next step, the reducing.
  6. This will first feed into our send_groups_to_single_stuff_and_reduce_call ReduceDocumentsChain, which will group the summaries into groups that do not exceed 3500 tokens, so we don’t exceed the token limit and blow up.
  7. For each of these groups, it will call our single_stuff_and_reduce_call StuffDocumentsChain, which will stuff this group of summaries into a single prompt and call our single_reduce_call LLMChain, which will reduce the summaries into a single summary.
  8. We keep going until there is only one summary left, which is our final summary of the entire speech.

Now, this run is fairly simple as the text wasn’t that long, only being cut into three parts, but I’ve tried this on a 150-page book, and while it took a while to run it worked without a hitch and returned a fantastic summary!

Note that the reason this is all so useful in the first place is that we could not call ChatGPT with this huge text as a single query at all. The request would immediately fail with the token limit error we saw earlier.

Using LangChain we pre-process our huge text and feed it in bite-sized pieces, keep track of the whole pie, and then combine it all back together.

That’s it for the first part. It was a bit of a tough one to get us warmed up, but we have loads of really, really cool stuff coming in the next tutorial parts, so get excited! In the next part, we’ll be looking at using ChatGPT and LangChain to chat with an entire book of our choice and ask it questions, which is mind-blowing!

This tutorial is part of our original course on Python LangChain. You can find the course URL here: 👇

🧑‍💻 Original Course Link: Becoming a Langchain Prompt Engineer with Python – and Build Cool Stuff 🦜🔗

Original article on the Finxter Academy