Python LangChain Course 🐍🦜🔗 Custom Tools (4/6)

Python LangChain Course 🐍🦜🔗

✅ Part 0/6: Overview
✅ Part 1/6: Summarizing Long Texts Using LangChain
✅ Part 2/6: Chatting with Large Documents
✅ Part 3/6: Agents and Tools
👉✅ Part 4/6: Custom Tools
✅ Part 5/6: Understanding Agents and Building Your Own
✅ Part 6/6: RCI and LangChain Expression Language

Welcome back to part 4. In this part, we’ll look at giving our agent access to the entire internet. Not just Google or DuckDuckGo or Wikipedia, but the ability to open and read any web address we feed it. Having looked at tools in the previous part, in this part, we’ll be looking at making our own custom tools. Once again, this is fairly similar to function calls, which makes perfect sense as the ability for an LLM to call some kind of function to obtain additional information is simply a very powerful prospect that will only increase in use in the future.

Creating a simple tool

Before we get our AI surfing the internet though let’s create a basic tool to warm up a little bit. We’ll create a simple tool that can get the price for a Stock Ticker Symbol. For example, if we wanted to get the price for Apple we would use the symbol AAPL. We’ll use the Yahoo Finance API to get the price for a given symbol.

Before we get started run the following in your terminal:

pip install yfinance

Now start a new folder named ‘4_Custom_tools‘ and inside add a new file named ‘1_stock_price_tool.py‘ like this:

📁Finx_LangChain
    📁1_Summarizing_long_texts
    📁2_Chat_with_large_documents
    📁3_Agents_and_tools
    📁4_Custom_tools
            📄1_stock_price_tool.py
    📄.env

Inside the ‘1_stock_price_tool.py‘ file we start with our imports:

import yfinance as yf
from decouple import config
from langchain.agents import initialize_agent
from langchain.agents.agent_types import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.tools import BaseTool

We import yfinance to get the stock prices, config for our API key, and initialize_agent and AgentType which we both used before. We use the ChatOpenAI endpoint and import the BaseTool class. This BaseTool class is the base for all LangChain tools to inherit from, including the tools we used in the previous part. This time we’ll build our own tools based on this BaseTool class.

If we look in the source code for the BaseTool class, we’ll see that there is an @abstractmethod named ._run(). This means that the child class is supposed to implement this method. So if we create a tool called 'Hammer' then we need to create a method Hammer._run() which will run the tool. Why do we have to give our Tools a ._run() method with an underscore? The ._run() method is implementation detail that we provide. The BaseTool class also has a .run() method which we should not overwrite, that will do some internal checks and then call our ._run() implementation.

If that all seems a bit confusing don’t worry about it. Just know that we implement Tool._run() to provide the functionality but we call Tool.run() to actually run the tool. You’ll see how this works below.

Let’s set up our API first (make sure to use the 0613 function-call-specific model):

chat_gpt_api = ChatOpenAI(
    model="gpt-3.5-turbo-0613",
    openai_api_key=config("OPENAI_API_KEY"),  # type: ignore
    temperature=0,
)

Now let’s get started on our StockPriceTool:

class StockPriceTool(BaseTool):
    name: str = "yahoo_finance"
    description: str = "useful when you need to answer questions about the current stock price of a stock ticker"

So we create a new class and have it inherit from the BaseTool class. We then set the name and description on the class level, so whoever uses this tool can use these defaults if they want. Make sure to stay indented inside the class for the following blocks.

    def _run(self, query: str) -> str:
        ticker = yf.Ticker(query)
        print(ticker)
        return f"{query} - ${ticker.info.get('currentPrice')}"

Now we implement the ._run() method. It takes a query as a string and the return is also a string. We call the .Ticker method on Yahoo finance passing in the query. We return a string that shows the ticker and the price by calling the .get() method on the .info attribute of the ticker.

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support asynchronous execution")

There is also room to implement an async version but we’re not going to be using this for now so we just raise a NotImplementedError.

Your StockPriceTool class now looks like this:

class StockPriceTool(BaseTool):
    name: str = "yahoo_finance"
    description: str = "useful when you need to answer questions about the current stock price of a stock ticker"

    def _run(self, query: str) -> str:
        ticker = yf.Ticker(query)
        print(ticker)
        return f"{query} - ${ticker.info.get('currentPrice')}"

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support asynchronous execution")

Creating an agent with our tool

This is the bare basics of what we need to create a tool. We can now use this tool in our agent. Let’s create a new agent and add our tool to it:

agent = initialize_agent(
    llm=chat_gpt_api,
    agent=AgentType.OPENAI_FUNCTIONS,
    tools=[StockPriceTool()],
    verbose=True,
)

We initialize our agent with the ChatOpenAI endpoint, the AgentType.OPENAI_FUNCTIONS and we pass in a new instance of our StockPriceTool by calling it. We also set verbose to True so we can see what’s going on.

Let’s test out our very first basic tool…

agent.run("What is the current price of AAPL?")

And we get the following output:

> Entering new AgentExecutor chain...

Invoking: `yahoo_finance` with `AAPL`

yfinance.Ticker object <AAPL>
AAPL - $171.21

The current price of AAPL (Apple Inc.) is $171.21.

> Finished chain.

Let’s just make sure our agent can also still answer normal questions:

agent.run("Please tell me what a pineapple is?")

Yep, works fine, you can see it just instantly answers as there is no need to call any function.

> Entering new AgentExecutor chain...
A pineapple is a tropical fruit that is native to South America. It has a tough, spiky outer skin and a sweet, juicy flesh inside. Pineapples are known for their distinctively sweet and tangy flavor. They are often used in cooking, baking, and as a topping for desserts. Pineapples are also a good source of vitamins and minerals, such as vitamin C and manganese.

> Finished chain.

Creating a more advanced tool

Okay, now that we have seen how creating and using a basic tool works, let’s create something that’s actually cool and useful. It’s time to give our agent the ability to browse any page on the internet. We can then ask questions about this information or use the information to answer other questions, get a summary, ask what the page is about, etc.

Go ahead and save and close the '1_stock_price_tool.py' file and create a new folder called tools. Now that we know how to build a tool, we’ll write our internet tool first and in a separate folder to keep things organized. In the tools folder create a file called internet_tool like so:

📁Finx_LangChain
    📁1_Summarizing_long_texts
    📁2_Chat_with_large_documents
    📁3_Agents_and_tools
    📁4_Custom_tools
            📄1_stock_price_tool.py
            📁tools
                📄internet_tool.py
    📄.env

Inside the 'internet_tool.py' we’ll write our tool, so start with our imports:

import requests
from bs4 import BeautifulSoup
from langchain.tools import BaseTool

And in a terminal window run the following install command:

pip install beautifulsoup4

The Python built-in requests library will allow us to send HTTP requests over the internet to, for instance, GET a webpage. Beautiful Soup is a Python library commonly used for web scraping purposes. It provides a convenient way to parse and navigate through HTML documents, making it easier to extract data from web pages. We’ll use the BeautifulSoup library to parse the HTML we get back from the webpage and extract the text from it. We import the BaseTool class again as this is our base class to build our tool on top of.

Let’s start building our tool:

class InternetTool(BaseTool):
    name: str = "internet_tool"
    description: str = (
        "useful when you want to read the text on any url on the internet."
    )

Again, we create our InternetTool class and inherit the functionality from BaseTool, then set the name and description which are both just simple strings.

Getting the text content of a webpage

Now let’s define the first method of our class (Make sure to stay indented inside the class for the following blocks):

    def get_text_content(self, url: str) -> str:
        """Get the text content of a webpage with HTML tags removed"""
        response = requests.get(url)
        html_content = response.text

        soup = BeautifulSoup(html_content, "html.parser")
        for tag in ["nav", "footer", "aside", "script", "style", "img"]:
            for match in soup.find_all(tag):
                match.decompose()

        text_content = soup.get_text()
        text_content = " ".join(text_content.split())
        return text_content

We define our method get_text_content, which takes self (the current instance of the class) and a URL as a string. We then use the requests library and call its .get() method passing in the URL, which will return the HTML of the webpage in our response variable. We then get the html_content from the response.text attribute.

After this we create a new BeautifulSoup object by calling BeautifulSoup and passing in the HTML content, specifying the HTML parser as the second argument because we want to parse HTML. The BeautifulSoup object stored in our ‘soup‘ variable is basically an object that represents the structure of our HTML page. More importantly, this is an object that we can interact with and easily change, filter, and manipulate.

We then call a loop on a list containing HTML tag names. For each tag in the list ["nav", "footer", "aside", "script", "style", "img"] this loop will run. Within each loop, we call BeautifulSoup's .find_all() method passing in the tag, let’s take the “nav” tag as an example for our explanation. So it will find all the “nav” tags in our HTML and return them. We then loop over these matches and call .decompose() on each match. As the name suggests, .decompose() will remove a tag. So we loop over the whole list of tags, and for each tag we then loop over all the matches in the inner loop, removing all those tags. We do this because we want to just keep the main text of the page and not have to send loads of header, footer, and HTML tags to ChatGPT later on.

Now we call soup.get_text() which is BeautifulSoup’s method to just get the text from the page. Wait a minute, so if there is a method to just get the text why didn’t we call this earlier and not do the whole loopy thing? Because by looping over the tags we removed not just the tags themselves but also all the text inside them. If we had just called the .get_text() method right away we’d still have all the header and nav and footer etc text in there polluting the output.

We then call text_content.split() which will turn the whole text into a list ["with", "each", "word", "as", "a", "separate", "entry."]. We do this because it gets rid of the extra Tab, Newline (Enter), and multiple space characters present in the text all in one go. We then call " ".join() on the list which will join all the words back together with a single space between them, basically returning the text back to a normal string but without all the extra characters. Finally, we return this string containing only the main text of the page without any extra tags, headers, footers, navigation, or extra tabs and enter characters.

Now let’s add another method still inside our InternetTool class, this one is very easy:

    def limit_chars(self, text: str) -> str:
        """limit number of output characters"""
        return text[:10_000]

It just takes a text string as input and returns the first 10.000 characters using a slice from index 0 to index 10.000. Note you are allowed to add underscores in numbers like 10_000 or 45_213_123 to make them more readable. Python will treat them as if the underscores are not there.

Implementing the _run() method

Now let’s define our ._run() method, or else we won’t have a valid tool!

    def _run(self, url: str) -> str:
        try:
            text_content = self.get_text_content(url)
            return self.limit_chars(text_content)
        except Exception as e:
            return f"The following error occurred while trying to fetch the {url}: {e}"

The ._run() method takes url as a string and returns a string. It will try to get the text content of the URL by calling the .get_text_content() method we defined earlier. It will then return the first 10,000 characters. If an error occurs it will return a string saying what the error was.

Now the ._run() method takes a URL as input argument, but where does it come from? We will be calling tool.run() and not tool._run() so we cannot pass this argument in directly! LangChain will simply pass any arguments we pass to tool.run() to tool._run() for us. So if we call tool.run("https://www.google.com") then LangChain will do it’s checks and call tool._run("https://www.google.com") for us. This is why we need to define the ._run() method with the url argument.

Let’s add the ._arun() method just to be completist:

    def _arun(self, url: str):
        raise NotImplementedError("This tool does not support asynchronous execution")

Your completed tool now looks like this:

class InternetTool(BaseTool):
    name: str = "internet_tool"
    description: str = (
        "useful when you want to read the text on any url on the internet."
    )

    def get_text_content(self, url: str) -> str:
        """Get the text content of a webpage with HTML tags removed"""
        response = requests.get(url)
        html_content = response.text

        soup = BeautifulSoup(html_content, "html.parser")
        for tag in ["nav", "footer", "aside", "script", "style", "img"]:
            for match in soup.find_all(tag):
                match.decompose()

        text_content = soup.get_text()
        text_content = " ".join(text_content.split())
        return text_content

    def limit_chars(self, text: str) -> str:
        """limit number of output characters"""
        return text[:10_000]

    def _run(self, url: str) -> str:
        try:
            text_content = self.get_text_content(url)
            return self.limit_chars(text_content)
        except Exception as e:
            return f"The following error occurred while trying to fetch the {url}: {e}"

    def _arun(self, url: str):
        raise NotImplementedError("This tool does not support asynchronous execution")

Testing our tool

Now below our class and outside of it we can add a little test to make sure our tool works:

if __name__ == "__main__":
    tool = InternetTool()
    print(
        tool.run("https://en.wikipedia.org/wiki/List_of_Italian_desserts_and_pastries")
    )

This __name__ argument will be set to "__main__" by Python if we run this file directly, but when we import it into another file it will be set to the name of the file. So if we import this file into another file and run it, the if statement will not run. This is useful because we can now just run this test by running the file directly, but if we import it into another file we can use the InternetTool class without worrying about the test running needlessly.

So go ahead and run this file, and you should see a large list of Italian desserts and pastries printed to your terminal window. Our internet tool is working! Before we leave this tools folder go ahead and create an __init_.py file inside the ‘tools’ folder to make our import statement cleaner like we did in the previous part. Inside the __init_.py file add the following:

from .internet_tool import InternetTool

Creating an agent with our internet tool

Go ahead and save and close this file and then create a new file called '2_internet_tool_agent.py' in the '4_Custom_tools' folder like so:

📁Finx_LangChain
    📁1_Summarizing_long_texts
    📁2_Chat_with_large_documents
    📁3_Agents_and_tools
    📁4_Custom_tools
            📄1_stock_price_tool.py
            📄2_internet_tool_agent.py
            📁tools
                📄__init__.py
                📄internet_tool.py
    📄.env

Inside our '2_internet_tool_agent.py' start with our imports as always:

from decouple import config
from langchain.agents import initialize_agent
from langchain.agents.agent_types import AgentType
from langchain.chat_models import ChatOpenAI
from tools import InternetTool

These are all imports you are familiar with plus our own InternetTool we just built. Note that because of the import in the .__init__.py file we can now just the InternetTool directly from the tools folder instead of having to type from tools.internet_tool import InternetTool.

Set up our chat_gpt_api and initialize our agent:

chat_gpt_api = ChatOpenAI(
    model="gpt-3.5-turbo-0613",
    openai_api_key=config("OPENAI_API_KEY"),
    temperature=0,
)

agent = initialize_agent(
    llm=chat_gpt_api,
    agent=AgentType.OPENAI_FUNCTIONS,
    tools=[InternetTool()],
    verbose=True,
)

This is all familiar terrain by now, we use the OpenAI functions agent type for max reliability and we pass in our list of tools, which in this case is just a new instance of InternetTool created by calling it. Remember to use the specific gpt-3.5-turbo-0613 model as it is the function calls version.

All this is really the same as we did in the “OpenAI function calls and embeddings” tutorial series. Under the hood we’re just doing OpenAI function calls except with a simplified interface. We don’t have to worry about the message history, the possibility of multiple calls and looping until we have a response, writing a custom function to get readable and colorized terminal output of the llm models reasoning steps that led to the output, the exact function description Json object, etc.

💡 Note: You can watch the full course video right here on the blog — I’ll embedd the video below each of the other parts as well. If you want the step-by-step course with code and downloadable PDF course certificate to show your employer or freelancing clients. follow this link to learn more.

Let’s give our agent a good test

Now for the test. Say I have this article and I need a small summary of it:

agent.run(
    "Please give me a summary of https://racingnews365.com/horner-verstappen-happy-despite-end-to-f1-win-streak"
)

And you should get a short summary of what the article is all about. Let’s try a page with a very large amount of text on it:

agent.run(
    "Please give me a summary of https://drewisdope.com/chatgpt-limits-words-characters-tokens/"
)

And while we still get a nice summary of the article, it didn’t fetch all the text as it was too long. If you’d want to optimize this for the specific purpose of summarizing you’d have to use a higher context model or combine this with the summarization concepts we learned in part 1 of this tutorial series.

Note that our agent is not limited to running our function and can still answer normal questions as well, without calling any functions:

agent.run("16 Pineapples -3 Pineapples = ?")

And we get the following output:

> Entering new AgentExecutor chain...
16 pineapples minus 3 pineapples equals 13 pineapples.

> Finished chain.

Now that we have an internet tool that can read any page on the internet and we have learned how to build our own tools, let’s take a deeper dive into agents and see what makes them tick. We’ll also be using our internet tool again in the next part so you didn’t build it for nothing!

This tutorial is part of our original course on Python LangChain. You can find the course URL here: 👇

🧑‍💻 Original Course Link: Becoming a Langchain Prompt Engineer with Python – and Build Cool Stuff 🦜🔗

Original article on the Finxter Academy