<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Dirk van Meerveld, Author at Be on the Right Side of Change</title>
	<atom:link href="https://blog.finxter.com/author/dirkvanmeerveld/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.finxter.com/author/dirkvanmeerveld/</link>
	<description></description>
	<lastBuildDate>Fri, 19 Apr 2024 10:27:53 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://blog.finxter.com/wp-content/uploads/2020/08/cropped-cropped-finxter_nobackground-32x32.png</url>
	<title>Dirk van Meerveld, Author at Be on the Right Side of Change</title>
	<link>https://blog.finxter.com/author/dirkvanmeerveld/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Building Complex Multi-Agent Teams and Setups with LangGraph</title>
		<link>https://blog.finxter.com/building-complex-multi-agent-teams-and-setups-with-langgraph/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Thu, 18 Apr 2024 20:28:42 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Langchain]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1670085</guid>

					<description><![CDATA[<p>💡 Info: This course is a complete text tutorial. It&#8217;s based on our academy course. If you&#8217;re interested in video explainers, check out the course here. Hi and welcome to this course on building complex multi-agent teams and setups using LangGraph, LangChain, and LangSmith. In this course we&#8217;ll start from the ground up using LangChain, ... <a title="Building Complex Multi-Agent Teams and Setups with LangGraph" class="read-more" href="https://blog.finxter.com/building-complex-multi-agent-teams-and-setups-with-langgraph/" aria-label="Read more about Building Complex Multi-Agent Teams and Setups with LangGraph">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/building-complex-multi-agent-teams-and-setups-with-langgraph/">Building Complex Multi-Agent Teams and Setups with LangGraph</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: This course is a complete text tutorial. It&#8217;s based on our academy course. If you&#8217;re interested in video explainers, <a href="https://academy.finxter.com/university/multi-agent-systems-langgraph/">check out the course here</a>.</p>



<p>Hi and welcome to this course on building complex multi-agent teams and setups using LangGraph, LangChain, and LangSmith. In this course we&#8217;ll start from the ground up using LangChain, and then build and build, adding more complexity and tools as we go along. We will learn how to build a graph with paths, conditional paths, teams, team managers, and more, all stringing our agents together in powerful ways.</p>



<ul class="wp-block-list">
<li>In <strong>part 1</strong>, we&#8217;ll get started with the basics of LangChain, learning how to create prompt templates and Chains, working with the LangChain syntax to easily string together our LLM calls.</li>



<li>In the <strong>next part</strong> we&#8217;ll learn how to write tools so that we can make our future agents powerful by giving them functions they can call. We will use the newest LangChain syntax for this and create both an image generation tool and a weather tool.</li>



<li><strong>Part 3</strong> is where we will learn the basics of LangGraph, covering the underlying concepts and exactly how it works. We will learn by setting up our first agent and graph which can return a visual representation of the current weather in any city you name.</li>



<li>In <strong>part 4</strong> we&#8217;ll look at how we can take this all to yet the next level, discussing how we can use all of this to create a whole team of agents working together for us. We&#8217;ll also write a tool that can output PDF files in preparation for our multi-agent setup.</li>



<li><strong>Part 5</strong> is where the rubber really hits the road and we will create a powerful multi-agent setup in LangGraph using a team, team manager, many agents, conditional paths, and more. We will create a team that can work together independently and create travel itineraries for us, providing them in PDF format with an inserted image and a full travel plan.</li>



<li>In the <strong>final part</strong> we&#8217;ll have a look at writing asynchronous tools for our agents and then create a web research and article writing graph that can visit many web pages at the same time and then write an article about our desired topic for us.</li>
</ul>



<p>I hope you&#8217;re as excited as I am to get started. Let&#8217;s dive in!</p>



<h2 class="wp-block-heading">LangChain, LangSmith and LangGraph</h2>



<p>Hi and welcome to this course on LangGraph, LangChain, and LangSmith. My name is Dirk van Meerveld and I will be your host and guide as we go on this exploration together.</p>



<p>So what is up with all these Lang-words? Well, in short:</p>



<ul class="wp-block-list">
<li><strong>LangChain</strong> is a basic framework that will allow us to work with LLMs.</li>



<li><strong>LangGraph</strong> will allow us to make more complex combinations using LangChain by introducing graph structures, where we can have multiple nodes or even teams of LLM agents working together.</li>



<li><strong>LangSmith</strong> is a tool that helps us see exactly what is going on while we work with the above two, to help us debug and improve our code in a more convenient way.</li>
</ul>



<h3 class="wp-block-heading">LangChain</h3>



<p>Let&#8217;s get started with LangChain <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f517.png" alt="🔗" class="wp-smiley" style="height: 1em; max-height: 1em;" /> first. LangChain is a framework designed to make it easier to build applications that use large language models (LLMs). Think of it as a set of tools that helps bridge the gap between LLMs and the applications you might want to build with them.</p>



<p>LangChain helps us:</p>



<ul class="wp-block-list">
<li><strong>Provide a unified interface:</strong> Any code you write can be used with different LLMs with little modification, and you can use the same code to write prompts or tools for different LLMs.</li>



<li><strong>Prebuilt tools for common tasks:</strong> LangChain includes tools for common tasks you might want to do with LLMs, such as building chatbots, summarizing documents, or analyzing code. Besides building our own tools and functions, we can also import pre-built tools from the community.</li>



<li><strong>Memory and Context:</strong> LangChain makes it easy to incorporate memory and context into our LLM applications. This means our application can remember past interactions and use that information to inform future responses.</li>
</ul>



<p>So let&#8217;s get started! First go ahead and create a new project folder and name it whatever you like. I&#8217;ll call mine <code>FINX_LANGGRAPH</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> FINX_LANGGRAPH</pre>



<h3 class="wp-block-heading">Create a venv in the root project folder</h3>



<p>We&#8217;ll be running this project inside a virtual environment. A virtual environment is a self-contained directory in which we can install specific versions of packages without affecting the global Python installation.</p>



<p>We will use this as I will be using specific versions for the libraries we install as we go along, and I want to make sure that you have the exact same experience as I do.</p>



<p>For example, when we use <code>pydantic</code> we&#8217;ll be using the older V1 for this project, as it plays nicely with LangChain. You&#8217;ll probably have V2 installed on your system-wide Python installation, and then your imports will be different from mine, causing confusion. We also don&#8217;t want to mess with your system-wide Python installation.</p>



<p>The virtual environment will make it easy for you to install my exact versions without worrying about affecting any of your other projects and is a good practice to follow in general.</p>



<p>To create a new virtual environment we&#8217;ll use a tool called <code>pipenv</code>. If you don&#8217;t have <code>pipenv</code> installed, you can install it using pip, which is Python&#8217;s package manager. Run the following command in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install pipenv</pre>



<p>Make sure the terminal is inside your root project folder, e.g. <code>/c/Coding_Vault/FINX_LANGGRAPH</code>, and then run the following command to create a new virtual environment:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv shell</pre>



<p>This will create a new virtual environment and also a <code>Pipfile</code> in your project directory. Any packages you install using <code>pipenv install</code> will be added to the <code>Pipfile</code>.</p>
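<p>If you want to confirm that Python is really running inside the new virtual environment, a small check like the one below can help. It is only a sketch that uses the standard library; the function name <code>in_virtual_environment</code> is made up for this illustration.</p>

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

<p>Run it from the terminal where you activated the environment; the interpreter path should point into the environment rather than your system-wide Python.</p>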



<p>To generate a <code>Pipfile.lock</code>, which is used to produce deterministic builds, run:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv lock</pre>



<p>This will create a <code>Pipfile.lock</code> in your project directory, which contains the exact version of each dependency to ensure that future installs are able to replicate the same environment.</p>



<p>We don&#8217;t need to install a library first to create a <code>Pipfile.lock</code>. From now on, when we install a library in this virtual environment with <code>pipenv install library_name</code>, it will be added to the <code>Pipfile</code> and <code>Pipfile.lock</code>, which are basically just text files keeping track of our exact project dependencies.</p>
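<p>Since the lock file is plain JSON, you can inspect the pinned versions yourself. The sketch below reads a heavily trimmed, made-up lock file from a string; the package name <code>some-package</code> and the helper <code>pinned_versions</code> are hypothetical, and a real <code>Pipfile.lock</code> contains more fields such as hashes.</p>

```python
import json

# Hypothetical, heavily trimmed Pipfile.lock content for illustration only.
SAMPLE_LOCK = """
{
    "_meta": {"requires": {"python_version": "3.10"}},
    "default": {
        "some-package": {"version": "==1.2.3"}
    },
    "develop": {}
}
"""


def pinned_versions(lock_text: str) -> dict[str, str]:
    """Map each package in the "default" section to its pinned version."""
    lock = json.loads(lock_text)
    return {
        name: details.get("version", "unpinned")
        for name, details in lock.get("default", {}).items()
    }


print(pinned_versions(SAMPLE_LOCK))
```

<p>To try this on your own project you could pass in the text of your real lock file instead of the sample string.</p>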



<p>For reference, I&#8217;m using Python 3.10 for this project, but you should be fine with any recent version. Consider upgrading if you&#8217;re using an older version.</p>



<h3 class="wp-block-heading">Basic project setup</h3>



<p>Before we get started, we need to make sure we have our OpenAI API key ready to load in a convenient way, because we should never hardcode it in our source code. Go to <a href="https://platform.openai.com/api-keys">https://platform.openai.com/api-keys</a> and copy your API key, or make a new one. You&#8217;ll only pay for what you use, which will be cents if you just play around with it casually. Then create a new file called <code>.env</code> in the root folder of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> FINX_LANGGRAPH
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> .env             <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New file
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile.lock</pre>



<p>And paste your API key in the <code>.env</code> file like this, making sure not to use any spaces or quotes:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">OPENAI_API_KEY=your_api_key_here</pre>



<p>Then go ahead and save and close this file. If you are using <code>Git</code>, make sure to add this file to your <code>.gitignore</code> file so you don&#8217;t accidentally commit your API key to your repository. If you&#8217;re not using <code>Git</code>, just make sure you exclude the <code>.env</code> file if you share your code with anyone.</p>



<p>We&#8217;ll be using several API keys and settings across our project, adding more as we go, so let&#8217;s create a simple and reusable way to load them to stop us from writing the same code over and over again.</p>



<p>Run the following command in your terminal to add the <code>python-decouple</code> package inside your <code>pipenv</code> environment:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install python-decouple==3.7</pre>



<p>We will use this package to read the <code>.env</code> file and get the API key from it. Now create a new file named <code>setup_environment.py</code> in the root folder of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> FINX_LANGGRAPH
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> .env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile.lock
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> setup_environment.py  <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New file</pre>



<p>Then inside this new <code>setup_environment.py</code> file, write the following code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os

from decouple import config


def set_environment_variables() -> None:
    os.environ["OPENAI_API_KEY"] = str(config("OPENAI_API_KEY"))</pre>



<p>We import the standard library <code>os</code> module and the <code>config</code> function from the <code>decouple</code> package we installed a minute ago. We then create a function we can import from our other code files.</p>



<p>The <code>config("OPENAI_API_KEY")</code> function reads the <code>.env</code> file and gets the value of the <code>OPENAI_API_KEY</code> variable we set in there, so make sure you have used the exact same name in there. The <code>str()</code> cast just makes sure it&#8217;s a string value. We then set this value to the <code>OPENAI_API_KEY</code> environment variable using <code>os.environ</code>.</p>



<p>This way we can just use LangChain freely without having to worry about our API key as both LangChain and OpenAI are set up to read our API keys from the environment variables automatically.</p>
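<p>As more keys get added later on, the same idea can be generalized. The following is a sketch, not the version used in this course: it takes the setting names plus a lookup function and fails early when a value is empty. The name <code>EXAMPLE_API_KEY</code>, the <code>fake_settings</code> dictionary, and the extra parameters are assumptions made for this illustration; in the real project you would pass <code>config</code> from <code>decouple</code> as the lookup.</p>

```python
import os
from typing import Callable, Iterable


def set_environment_variables(
    names: Iterable[str],
    lookup: Callable[[str], str],
) -> None:
    """Copy each named setting into os.environ, failing early if one is empty."""
    for name in names:
        value = str(lookup(name)).strip()
        if not value:
            raise ValueError(f"Setting {name!r} is empty; check your .env file")
        os.environ[name] = value


# Stand-in for decouple's config() so the sketch runs without a .env file.
fake_settings = {"EXAMPLE_API_KEY": "abc123", "EXAMPLE_OTHER_SETTING": "on"}
set_environment_variables(fake_settings.keys(), fake_settings.__getitem__)
print(os.environ["EXAMPLE_API_KEY"])
```

<p>Failing early with a clear message tends to be easier to debug than an authentication error that only shows up on the first API call.</p>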



<h3 class="wp-block-heading">LangChain basics</h3>



<p>Ok, time to get started with LangChain! Let&#8217;s cover the basics first so we understand the building blocks. We&#8217;ll start with some installs. Make sure you run all of these even if you have some of these libraries installed already as we&#8217;re not using the global Python installation but our virtual environment. Run the following command in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install openai==1.14.2 langchain==0.1.13 langchain-openai==0.1.0</pre>



<p>The <code>openai</code> library talks to the OpenAI API behind the scenes while we use <code>langchain</code>, and the <code>langchain-openai</code> library connects the two by providing LangChain classes for OpenAI models, such as the <code>ChatOpenAI</code> class we import below.</p>



<p>Now create a new file named <code>langchain_basics.py</code> in the root folder of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> FINX_LANGGRAPH
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> .env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> langchain_basics.py  <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New file
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile.lock
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> setup_environment.py</pre>



<p>Inside this new <code>langchain_basics.py</code> file, let&#8217;s get started with the following imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from setup_environment import set_environment_variables</pre>



<p>Before we explain the imports, I want to cover a potential problem you may run into here: the imports may not be recognized and show red squiggly lines under them, even though you just installed these libraries:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-100814.jpg" alt="" class="wp-image-4343"/></figure>
</div>


<p>So what is going on here? Well, the virtual environment we created comes with its own Python interpreter, and the Python interpreter in your code editor is probably set to the system-wide Python interpreter. This means that the code editor doesn&#8217;t know where to find the libraries we just installed in the virtual environment.</p>



<p>To fix this, press <code>Ctrl+Shift+P</code> in VS Code to open the command palette, then type <code>Python: Select Interpreter</code> and select the Python interpreter from the virtual environment you created. You can find the correct one easily by comparing your root project name with the interpreter name. My root folder is <code>FINX_LANGGRAPH</code>, so I can find mine in the list under this name:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-101438.jpg" alt="" class="wp-image-4344"/></figure>
</div>


<p>When you click this the red squiggly lines should go away and you&#8217;re now using the correct Python interpreter.</p>
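<p>If you are unsure which interpreter is actually running your code, a quick check along these lines can confirm it. This is a sketch using only the standard library; the helper name <code>missing_modules</code> is made up for this example.</p>

```python
import importlib.util
import sys


def missing_modules(module_names: list[str]) -> list[str]:
    """Return the top-level module names the current interpreter cannot find."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


print("Interpreter in use:", sys.executable)
print("Missing:", missing_modules(["langchain_core", "langchain_openai"]))
```

<p>An empty list next to <code>Missing:</code> suggests that the selected interpreter can see the libraries we installed in the virtual environment.</p>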



<p>With that out of the way, let&#8217;s look at the imports here:</p>



<ul class="wp-block-list">
<li><code>StrOutputParser</code> is a class that will help us parse the output from the LLMs into a string format. Normally, when we get a response back from the OpenAI API, we have to index into <code>response.choices[0].message.content</code> to get the text. Just think of this as a convenience class that does this for us.</li>



<li><code>ChatPromptTemplate</code> is a class that will help us create a template for our chat prompts. This will make it easier to create prompts for the LLMs.</li>



<li><code>ChatOpenAI</code> is a class that will basically just allow us to create an instance of an OpenAI chat model and use it with LangChain.</li>
</ul>



<p>The value here of these output parsers and prompt templates is that they are a unified interface that we can use in the same manner without changes even if we change the LLM we are using halfway through our project or in the future.</p>



<h3 class="wp-block-heading">Prompt templates</h3>



<p>We then import the <code>set_environment_variables</code> function from the <code>setup_environment</code> file we created earlier. Now let&#8217;s continue our code by creating a prompt template:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">set_environment_variables()


french_german_prompt = ChatPromptTemplate.from_template(
    "Please tell me the french and german words for {word} with an example sentence for each."
)</pre>



<p>First, we make sure to call our <code>set_environment_variables</code> function to set our API key. As a simple example, I&#8217;ll create a prompt that asks for the French and German words for a given word, along with an example sentence for each. This is just to show the parts of LangChain before we get into more complex examples.</p>



<p>The <code>{word}</code> part is the template variable that we can replace with any word we want to ask about. We then create a <code>ChatPromptTemplate</code> instance using the <code>from_template</code> method and pass in our prompt string. The <code>ChatPromptTemplate</code> class will help us create prompts for the LLMs in a more convenient way and basically deals with formatting message history like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">## Example of a ChatPromptTemplate
template = ChatPromptTemplate.from_messages([
            ("system", "You are a helpful AI bot. Your name is {name}."),
            ("human", "Hello, how are you doing?"),
            ("ai", "I'm doing well, thanks!"),
            ("human", "{user_input}"),
        ])</pre>



<p>We need only a single message here though, which is why we use the <code>from_template</code> method. In this case, LangChain will assume this to be a human message so this will result in:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">template = ChatPromptTemplate.from_messages([
            ("human", "Please tell me the french and german words for {word} with an example sentence for each.")
        ])</pre>
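<p>To build some intuition for what happens to the <code>{word}</code> placeholder, here is a rough stand-in written with plain Python string formatting. It is not how LangChain implements prompt templates; the helper <code>build_messages</code> is hypothetical and only mimics the end result of a single human message with the variable filled in.</p>

```python
TEMPLATE = (
    "Please tell me the french and german words for {word} "
    "with an example sentence for each."
)


def build_messages(template: str, **variables: str) -> list[tuple[str, str]]:
    """Fill the placeholders and wrap the text as a single human message."""
    return [("human", template.format(**variables))]


messages = build_messages(TEMPLATE, word="polar bear")
role, text = messages[0]
print(role, "->", text)
```

<p>The real template class does more than this, but the core idea of swapping named placeholders for values is the same.</p>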



<h3 class="wp-block-heading">Creating a chain</h3>



<p>Now that we have a prompt template to create our prompts, let&#8217;s continue:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
output_parser = StrOutputParser()

french_german_chain = french_german_prompt | llm | output_parser</pre>



<p>First, we define our LLM instance using the <code>ChatOpenAI</code> class and pass in the model we want to use. I&#8217;ll be using <code>gpt-3.5-turbo-0125</code> as it is more than enough for the simple test we&#8217;re doing here. If at any part in the course you want to use GPT-4-turbo instead then feel free to do so.</p>



<p>We&#8217;ve already set the API key to the environment variable so we don&#8217;t need to worry about it. We then create an instance of the <code>StrOutputParser</code> class to parse the output from the LLMs into a string response as discussed earlier.</p>



<p>Now that we have three building blocks, it is time for one of LangChain&#8217;s important concepts, &#8220;chains&#8221;. We can simply use the <code>|</code> operator to chain these building blocks together. This operator is taken from the <code>pipe</code> operator in Unix, which is used to chain commands together.</p>



<p>In this case, we take the <code>french_german_prompt</code> as the entry point of our chain, and we pipe the resulting prompt into our <code>llm</code>, making an LLM call. We then pipe the output into our <code>output_parser</code> to get the string response. Notice how easy and readable the chain is. We use chains to build stuff with large language models, hence the name LangChain. This piping style of syntax above is often referred to as <code>LCEL</code> or LangChain Expression Language.</p>
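<p>If you are curious how a <code>|</code> operator can chain objects together in Python at all, the toy example below shows the general mechanism using the <code>__or__</code> method. This is a simplified sketch and not LangChain&#8217;s actual implementation; the <code>Step</code> class and the fake prompt, model, and parser are all made up for illustration.</p>

```python
from typing import Any, Callable


class Step:
    """A toy building block that can be chained with the | operator."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b produces a new step that runs a first, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Hypothetical stand-ins for the prompt, the model, and the output parser.
prompt = Step(lambda variables: f"Translate {variables['word']} to French.")
fake_llm = Step(lambda text: {"content": f"(model reply to: {text})"})
parser = Step(lambda message: message["content"])

chain = prompt | fake_llm | parser
print(chain.invoke({"word": "polar bear"}))
```

<p>Each step only needs to accept the output of the step before it, which is why the pieces can be swapped so easily.</p>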



<h3 class="wp-block-heading">Running the chain</h3>



<p>Now let&#8217;s actually try and run this chain. To do this we can simply use the <code>invoke</code> method on our chain:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">result = french_german_chain.invoke({"word": "polar bear"})
print(result)</pre>



<p>We can technically also just pass in the string <code>"polar bear"</code> as we only have a single variable, but it&#8217;s better practice to use a dictionary like this as you may have multiple variables in your prompt. So go ahead and run this Python file and you should get something like the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">French: ours polaire
German: Eisbär

Example sentence in French: L'ours polaire est un animal emblématique de l'Arctique.
Example sentence in German: Der Eisbär ist das größte an Land lebende Raubtier der Welt.</pre>



<p>The order or structure may be slightly different as we didn&#8217;t specify any desired output structure, but that&#8217;s not the point here. It works! You&#8217;ll notice LangChain is very easy to read and understand, and this exact same code can be used with other LLMs with little modification.</p>



<p>We can also very easily stream the response instead. Edit your code like this, commenting out the previous <code>invoke</code> call and calling <code>stream</code> instead:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># result = french_german_chain.invoke({"word": "polar bear"})
# print(result)

for chunk in french_german_chain.stream({"word": "polar bear"}):
    print(chunk, end="", flush=True)</pre>



<p>So for every <code>chunk</code> in the stream that results from calling <code>french_german_chain.stream</code> with the word &#8220;polar bear&#8221;, we print the chunk to the console. The <code>end=""</code> and <code>flush=True</code> are just to make sure there are no line breaks in between print messages and that the output is printed immediately to the console.</p>



<p>Now if you run it again, you&#8217;ll see the tokens being streamed and written to your console in real time.</p>
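<p>To see the streaming pattern without making an API call, you can imitate it with a generator. The sketch below is only an illustration: <code>fake_stream</code> is a made-up function that yields fixed-size pieces of a string, whereas a real model streams tokens of varying length.</p>

```python
from typing import Iterator


def fake_stream(text: str, chunk_size: int = 4) -> Iterator[str]:
    """Yield the text in small pieces, like a model emitting tokens."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


collected = []
for chunk in fake_stream("French: ours polaire"):
    print(chunk, end="", flush=True)  # no newline between chunks
    collected.append(chunk)
print()  # finish the line once the stream ends
```

<p>Joining the collected chunks gives back the full response, which is handy if you want to both display the stream and keep the final text.</p>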



<p>Another useful method provided for us is <code>batch</code>, so let&#8217;s give that a spin as well:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># for chunk in french_german_chain.stream({"word": "polar bear"}):
#     print(chunk, end="", flush=True)

print(
    french_german_chain.batch(
        [{"word": "computer"}, {"word": "elephant"}, {"word": "carrot"}]
    )
)</pre>



<p>This time we pass in a list of dictionaries with one entry for each run in the batch. Running this will give the responses in a list, one for each entry in the batch:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">["French: \nComputer - Ordinateur \nExample sentence: J'utilise mon ordinateur pour travailler et regarder des films.\n\nGerman:\nComputer - Computer \nExample sentence: Mein Computer ist schon ein paar Jahre alt, aber er funktioniert immer noch einwandfrei.", "French: éléphant\nExample sentence: J'ai vu un éléphant au zoo.\n\nGerman: Elefant\nExample sentence: Der Elefant im Zoo war sehr groß.", "French: carotte\nExample sentence: J'ai acheté des carottes pour faire une soupe.\n\nGerman: Karotte\nExample sentence: Ich esse gerne Karotten als Snack."]</pre>



<p>Now go ahead and comment that one out as well and let&#8217;s check the properties of our chain:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># print(
#     french_german_chain.batch(
#         [{"word": "computer"}, {"word": "elephant"}, {"word": "carrot"}]
#     )
# )

print("input_schema", french_german_chain.input_schema.schema())
print("output_schema", french_german_chain.output_schema.schema())</pre>



<p>And if we run that we get a JSON schema that shows the inputs and outputs of our chain:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">input_schema {'title': 'PromptInput', 'type': 'object', 'properties': {'word': {'title': 'Word', 'type': 'string'}}}
output_schema {'title': 'StrOutputParserOutput', 'type': 'string'}</pre>



<p>We can see that the input takes a single object variable that needs to have a key <code>word</code> with a string value. If we add more variables to our prompt, we&#8217;ll see them in the schema as well. The output schema is a simple string because we used the <code>StrOutputParser</code> to parse the output into a string in the end.</p>
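
<p>To make the input schema a bit more concrete, here is a minimal plain-Python sketch that checks a dictionary against it. The <code>find_input_problems</code> helper and the <code>JSON_TYPES</code> mapping are hypothetical names made up for this illustration and are not part of LangChain. The sketch also assumes every listed property is required, which matches how our prompt behaves:</p>

```python
# The input schema printed above, copied from the output.
INPUT_SCHEMA = {
    "title": "PromptInput",
    "type": "object",
    "properties": {"word": {"title": "Word", "type": "string"}},
}

# Tiny mapping from JSON schema type names to Python types (illustration only).
JSON_TYPES = {"string": str, "object": dict}


def find_input_problems(payload: dict, schema: dict) -> list[str]:
    """Hypothetical helper: list keys that are missing or have the wrong type."""
    problems = []
    for key, spec in schema["properties"].items():
        if key not in payload:
            problems.append(f"missing key: {key}")
        elif not isinstance(payload[key], JSON_TYPES[spec["type"]]):
            problems.append(f"wrong type for {key}: expected {spec['type']}")
    return problems


print(find_input_problems({"word": "polar bear"}, INPUT_SCHEMA))  # []
print(find_input_problems({"term": "polar bear"}, INPUT_SCHEMA))  # ['missing key: word']
```

<p>A check like this is only a mental model for reading the schema. LangChain does its own handling of the input when the chain runs.</p>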



<h3 class="wp-block-heading">Adding complexity</h3>



<p>Those are the basics of an extremely simple chain in LangChain, so let&#8217;s make it a bit more complex. In this same file let&#8217;s declare a second chain and, for the sake of a simple demonstration, say that this second chain is supposed to check whether the output of the first chain is correct. (We&#8217;re just using simple examples here to save time and get to the good stuff faster.)</p>



<p>So down below the other stuff in the <code>langchain_basics.py</code> file, let&#8217;s define the prompt template for our second chain:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># print("input_schema", french_german_chain.input_schema.schema())
# print("output_schema", french_german_chain.output_schema.schema())


check_if_correct_prompt = ChatPromptTemplate.from_template(
    """
    You are a helpful assistant that looks at a question and its given answer. You will find out what is wrong with the answer and improve it. You will return the improved version of the answer.
    Question:\n{question}\nAnswer Given:\n{initial_answer}\nReview the answer and give me an improved version instead.
    Improved answer:
    """
)</pre>



<p>This time we have two variables in our prompt, <code>question</code> and <code>initial_answer</code>. We ask it to give an improved version of the first answer. The first answer is likely to be perfect already but again this is just for the sake of a quick demonstration.</p>



<p>We can reuse the <code>llm</code> and <code>output_parser</code> instances we created earlier, so let&#8217;s just create a new chain with the new prompt:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">check_answer_chain = check_if_correct_prompt | llm | output_parser</pre>



<p>Now we will need to run the input through the first chain, and then we need to keep both the original prompt from the first chain and the answer we get back from the first chain to pass them into the second one. So let&#8217;s do that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def run_chain(word: str) -> str:
    initial_answer = french_german_chain.invoke({"word": word})
    print("initial answer:", initial_answer, end="\n\n")
    answer = check_answer_chain.invoke(
        {
            "question": f"Please tell me the french and german words for {word} with an example sentence for each.",
            "initial_answer": initial_answer,
        }
    )
    print("improved answer:", answer)
    return answer</pre>



<p>So we define a function <code>run_chain</code> that takes a word as string input and will return a string. The initial answer is our return after we invoke the <code>french_german_chain</code> with the word.</p>



<p>We then print this answer and pass it into the <code>check_answer_chain</code> along with the original prompt, by passing both through a dictionary with the appropriate keys matching our prompt template. We print the improved answer and return it.</p>



<p>Now let&#8217;s run this function with a word:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">run_chain("strawberries")</pre>



<p>I apologize if I suddenly gave you a craving for strawberries! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f353.png" alt="🍓" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f353.png" alt="🍓" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f353.png" alt="🍓" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Run it and your output will be something like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">initial answer: French: fraises
Example sentence: J'adore manger des fraises en été.

German: Erdbeeren
Example sentence: Im Sommer esse ich gerne Erdbeeren mit Sahne.

improved answer: French: fraises
Example sentence: J'adore manger des fraises en été.

German: Erdbeeren
Example sentence: Im Sommer esse ich gerne Erdbeeren.</pre>



<p>Now of course both of them are fine and there wasn&#8217;t really anything to improve as the question is very simple, but we successfully ran a chain through another chain.</p>



<p>So that works fine, but you can see passing the values around to the second chain is a bit cumbersome. Now imagine we want to add a 3rd step to the chains above or even a 4th one. A conditional split path perhaps? If <code>x</code> then call chain <code>a</code> and else call chain <code>b</code>.</p>



<p>Using the above method would be a bit of a mess, so we&#8217;d have to create some kind of state object instead that has all the data in a single object so that we can pass this around between chains, with each chain adding or modifying the state object as needed.</p>
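
<p>As a rough sketch of that state object idea, here is a plain-Python version with no LLM calls at all. The functions <code>translate_step</code>, <code>review_step</code> and <code>run_steps</code> are hypothetical stand-ins invented for this illustration. This is not how LangGraph is implemented; it only shows a single state dictionary travelling through several steps:</p>

```python
from typing import Callable

State = dict[str, str]


def translate_step(state: State) -> State:
    # Stand-in for french_german_chain: a real step would call the LLM here.
    return {**state, "initial_answer": f"French and German for {state['word']}"}


def review_step(state: State) -> State:
    # Stand-in for check_answer_chain: reads the earlier answer from the state.
    return {**state, "improved_answer": state["initial_answer"] + " (reviewed)"}


def run_steps(state: State, steps: list[Callable[[State], State]]) -> State:
    # Each step receives the full state and returns an updated copy.
    for step in steps:
        state = step(state)
    return state


final_state = run_steps({"word": "strawberries"}, [translate_step, review_step])
print(final_state["improved_answer"])  # French and German for strawberries (reviewed)
```

<p>Adding a third or fourth step now only means adding another function to the list, because every step can find what it needs in the shared state.</p>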



<p>This is actually a pretty good solution to the problem and as it happens, this is pretty much what LangGraph will do for us. Before we get there though, we need to take a short detour to LangSmith and also learn how to write our own <code>tools</code> in LangChain so we can use the power of function calling and agents to fully leverage the power of LangGraph and create some really cool stuff. That&#8217;s it for part 1 of this course, I hope you enjoyed it and I&#8217;ll see you in the next one!</p>



<h2 class="wp-block-heading">LangSmith and Writing Tools</h2>



<p>Hi and welcome back to part 2 of the tutorial series. In this part we will have a look at LangSmith, which will help us debug our LLM creations, and we will also write tools that our powerful agents will be able to execute from part 3 onwards.</p>



<h3 class="wp-block-heading">LangSmith setup</h3>



<p>So what is LangSmith? LangSmith is another part of the LangChain ecosystem that will help us during the development and debugging of our LLM applications:</p>



<ul class="wp-block-list">
<li><strong>LLM Debugging and Testing</strong>: It will make it easier to identify and fix errors and test our applications to ensure they work as expected.</li>



<li><strong>Monitoring and Evaluation</strong>: It also provides tools to monitor performance and effectiveness, especially helpful if your project needs fast response times.</li>



<li><strong>Easy integration</strong>: LangSmith integrates seamlessly with LangChain and is very easy to set up as you will see.</li>
</ul>



<p>First we&#8217;ll need to get an API key for LangSmith, so it can keep track of our traces for us using our unique identifier. This is free for single-user accounts with up to 3000 traces per month, which is more than enough for general development and testing. You shouldn&#8217;t have to provide any payment details unless you want to switch to a heavier plan later on.</p>



<p>Go to https://smith.langchain.com/ and sign up using your GitHub, Google, or email address:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-160531.jpg" alt="" class="wp-image-4348"/></figure>
</div>


<p>After you have made your account and logged in at smith.langchain.com find the <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2699.png" alt="⚙" class="wp-smiley" style="height: 1em; max-height: 1em;" />gear icon in the bottom left corner and click it, then find the <code>Create Api Key</code> button to generate your API key:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-160745-1024x769.jpg" alt="" class="wp-image-4349"/></figure>
</div>


<p>Copy your API key and then let&#8217;s open our existing <code>.env</code> file in the root of our project and edit it by adding the LangSmith API key (no spaces or quotation marks):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">OPENAI_API_KEY=your_api_key_here
LANGCHAIN_API_KEY=your_api_key_here</pre>



<p>Save and close your <code>.env</code> file. We don&#8217;t need to install LangSmith as it is already included in the LangChain package. Let&#8217;s move on to our existing <code>setup_environment.py</code> file to add the LangSmith setup to our reusable setup script.</p>



<p>In order to enable LangSmith tracing, we need to do three things.</p>



<ul class="wp-block-list">
<li>Provide the LangSmith API key</li>



<li>Set the tracing environment variable to <code>true</code></li>



<li>Set the project name so we can distinguish between different projects in our LangSmith dashboard</li>
</ul>



<p>Replace all the code so far in the <code>setup_environment.py</code> file with the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os
from datetime import date

from decouple import config


def set_environment_variables(project_name: str = "") -> None:
    if not project_name:
        project_name = f"Test_{date.today()}"

    os.environ["OPENAI_API_KEY"] = str(config("OPENAI_API_KEY"))

    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = str(config("LANGCHAIN_API_KEY"))
    os.environ["LANGCHAIN_PROJECT"] = project_name

    print("API Keys loaded and tracing set with project name: ", project_name)</pre>



<p>We added the <code>date</code> from <code>datetime</code> import so we can use the date as the project name. Then we added an argument <code>project_name</code> to the function so we can set a custom project name for the LangSmith dashboard. If no project name is provided, it will default to <code>Test_{date.today()}</code> so we still have something to distinguish it by even if we forget to set the name.</p>



<p>The <code>OPENAI_API_KEY</code> environment variable was already there, but now we have added three more environment variables for LangSmith. <code>LANGCHAIN_TRACING_V2</code> enables LangSmith tracing when set to <code>true</code>, and then we have the <code>LANGCHAIN_API_KEY</code> and <code>LANGCHAIN_PROJECT</code> environment variables which LangSmith will read to know who we are and group the traces per project in our dashboard.</p>
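
<p>If tracing does not show up later, a quick check of these variables can help. The sketch below uses only the standard library. The helpers <code>default_project_name</code> and <code>missing_tracing_variables</code> are hypothetical names for this illustration and are not part of LangChain or LangSmith:</p>

```python
import os
from datetime import date

# The three LangSmith variables named above.
TRACING_VARIABLES = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")


def default_project_name(today: date) -> str:
    """Same fallback format as in set_environment_variables."""
    return f"Test_{today}"


def missing_tracing_variables(environ=None) -> list[str]:
    """Hypothetical helper: names of tracing variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in TRACING_VARIABLES if not environ.get(name)]


print(default_project_name(date(2024, 3, 24)))  # Test_2024-03-24

example = {"LANGCHAIN_TRACING_V2": "true", "LANGCHAIN_PROJECT": "Test_2024-03-24"}
print(missing_tracing_variables(example))  # ['LANGCHAIN_API_KEY']
```

<p>Calling <code>missing_tracing_variables()</code> without arguments after <code>set_environment_variables()</code> has run should give an empty list.</p>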



<p>Make sure you use the exact same names for the environment variables. Save and close the file. Now let&#8217;s see what LangSmith will do for us by giving it a test run. Open the <code>langchain_basics.py</code> file that we created in part 1 and change only the following line:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">set_environment_variables()</pre>



<p>to add a project name:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">set_environment_variables("Simple LangChain test")</pre>



<p>Now go ahead and run the <code>langchain_basics.py</code> file from part 1 again without changing anything about the code. LangSmith will now trace the execution of the code as we are using the updated <code>set_environment_variables</code> script.</p>



<p>After running the script, go to the LangSmith dashboard at https://smith.langchain.com/ and make sure you&#8217;re logged in. In your dashboard you will see the project name you set in the overview:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-164135-1024x770.jpg" alt="" class="wp-image-4350"/></figure>
</div>


<p>We can see that our <code>Simple LangChain test</code> project has been run a total of <code>2</code> times (1 run for each chain), with an error rate of <code>0%</code>. We can see how many of the responses were streamed and how many tokens have been used in total for this project name.</p>



<p>Scrolling to the right reveals additional details:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-164716-1024x770.jpg" alt="" class="wp-image-4351"/></figure>
</div>


<p>We can see that our total cost for all runs on this project so far is $0.000237 and we have a latency of around 3 seconds per run. We also have the most recent run for reference. Go ahead and click the project for more details:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-165222-1024x770.jpg" alt="" class="wp-image-4352"/></figure>
</div>


<p>We have two entries, one for the <code>french_german_chain</code> and one for the <code>check_answer_chain</code>. When we use graphs later these will no longer be separate but combined into a single trace. Go ahead and click the lower one with an input of <code>strawberries</code> to see the details:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-165545-1024x770.jpg" alt="" class="wp-image-4353"/></figure>



<p>We can see the <code>RunnableSequence</code> which is the overall chain, and then the three sub-elements that we had in our chain: the <code>ChatPromptTemplate</code>, the LLM, and the <code>StrOutputParser</code>. On this page we see the input and output for the entire chain, and if you click on any of the steps like <code>ChatOpenAI</code> you will see the input and output for that specific step:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-170143-1024x770.jpg" alt="" class="wp-image-4354"/></figure>



<p>Now our trace here is not that helpful as it is both very simple and broken up into two separate parts for each chain we ran, but this will be very helpful for easy feedback and debugging when we get to our graphs, which will combine complex systems into a single trace.</p>



<h3 class="wp-block-heading">Tools &#8211; Image generator</h3>



<p>Now let&#8217;s continue on and take a look at tools. If we want to have powerful multi AI-agent teams working away for us we need to be able to give them tools or functions to call. Naturally LangChain also comes with a handy integration for writing tools using a somewhat more pleasant syntax than the vanilla OpenAI tools.</p>



<p>We will be writing two tools, both of which we will use in our LangGraph graph in the next part. One of the tools will use Dall-e to generate an image (using our OpenAI key we already have) and download and save the image to disk. The other tool is going to get the current weather in a certain location. There are multiple ways in which tools can be defined in LangChain, but we will be using the latest convenient syntax here using the <code>@tool</code> decorator.</p>



<p>First let&#8217;s create a new folder called <code>images</code> and another one called <code>tools</code> in the root of our project, and then inside the <code>tools</code> folder create a new file named <code>image.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> FINX_LANGGRAPH
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> images          <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New empty folder
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> tools           <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New folder
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> image.py    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New file
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> .env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> langchain_basics.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile.lock
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> setup_environment.py</pre>



<p>In the <code>image.py</code> file we will define our first tool and see how this works. Let&#8217;s get started with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import uuid
from pathlib import Path

import requests
from decouple import config
from langchain.tools import tool
from openai import OpenAI
from pydantic import BaseModel, Field</pre>



<p>As we will also download the image, we import <code>uuid</code> to create a unique filename so we don&#8217;t get clashes. We will use <code>pathlib</code> to define the path where we will save the image and <code>requests</code> to send an HTTP request to download the generated image from the internet.</p>



<p>We also import <code>config</code> from <code>decouple</code> to read our <code>.env</code> file, <code>tool</code> from <code>langchain.tools</code> to define our tool, <code>OpenAI</code> from <code>openai</code> to make a request to Dall-e, and <code>BaseModel</code> and <code>Field</code> from <code>pydantic</code> to define the input of our tool.</p>



<p><code>requests</code> is already installed as a dependency of LangChain itself, and we already installed <code>openai</code>. Let&#8217;s make sure we install <code>pydantic</code> as well by running:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install pydantic==1.10.13</pre>



<p>Make sure you use this version as it plays nicely with the current LangChain versions. If you install V2 instead you will have to use different imports from mine.</p>



<p>As this is the only place where we will use the vanilla OpenAI client, we&#8217;ll just declare it here instead of integrating it into the <code>setup_environment.py</code> script. Add the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">IMAGE_DIRECTORY = Path(__file__).parent.parent / "images"
CLIENT = OpenAI(api_key=str(config("OPENAI_API_KEY")))</pre>



<p>To get a path to the <code>images</code> folder in the root of our project we first use <code>Path(__file__)</code> to get the path to the current file, then <code>parent</code> to go up one level to the <code>tools</code> folder, and then another <code>parent</code> to go up to the root of our project. We then add <code>/images</code> to get the path to the <code>images</code> folder.</p>
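
<p>To see the path logic step by step, here is a small standalone sketch. The starting path is a made-up example location for <code>image.py</code>, and it uses <code>PurePosixPath</code> so that the printed result looks the same on every operating system:</p>

```python
from pathlib import PurePosixPath

# Hypothetical location of image.py; on your machine this is whatever
# Path(__file__) resolves to.
current_file = PurePosixPath("/home/user/FINX_LANGGRAPH/tools/image.py")

tools_folder = current_file.parent         # /home/user/FINX_LANGGRAPH/tools
project_root = current_file.parent.parent  # /home/user/FINX_LANGGRAPH
image_directory = project_root / "images"

print(image_directory)  # /home/user/FINX_LANGGRAPH/images
```

<p>Building the path from <code>__file__</code> instead of hardcoding it means the tool keeps working no matter which folder you run your script from.</p>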



<p>We also create a <code>CLIENT</code> object using the <code>OpenAI</code> class and our API key from the <code>.env</code> file.</p>



<h3 class="wp-block-heading">Image downloader</h3>



<p>Let&#8217;s first create a helper function that takes an image URL and downloads and saves that image to our <code>/images</code> folder. This is not our tool but just a quick helper we can call from inside our tool later on. Continuing in <code>image.py</code>, add the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def image_downloader(image_url: str | None) -> str:
    if image_url is None:
        return "No image URL returned from API."
    response = requests.get(image_url)
    if response.status_code != 200:
        return "Could not download image from URL."
    unique_id: uuid.UUID = uuid.uuid4()
    image_path = IMAGE_DIRECTORY / f"{unique_id}.png"
    with open(image_path, "wb") as file:
        file.write(response.content)
    return str(image_path)</pre>



<p>We define a function <code>image_downloader</code> that takes an image URL as input and returns a string with the path to the downloaded image. If the image URL is <code>None</code> we return a message saying that no image URL was returned from the API. We then use <code>requests.get</code> to download the image from the URL and check if the status code is <code>200</code> which means the request was successful, again sending a message if it was not successful.</p>



<p>We then create a unique ID by instantiating a new <code>UUID</code> class object using <code>uuid.uuid4()</code>. We then create a path to the image using the <code>IMAGE_DIRECTORY</code> we defined earlier and the unique ID with a <code>.png</code> extension. Finally, we open the file in write binary mode (<code>wb</code>) and write the content of the response to the file, returning the path to the image as a string.</p>



<p>The reason we do not raise an error but send a string if the download fails is that an error will blow up our LLM application, but if we return a string instead the LLM agent will see that something went wrong and it can try to fix it or try calling the function again.</p>
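
<p>The same idea can be applied more generally. The sketch below is one possible way to do it and is not part of the tutorial code: <code>errors_as_text</code> is a hypothetical decorator that turns any exception into a message, and <code>read_image_bytes</code> is a made-up example function:</p>

```python
from functools import wraps


def errors_as_text(func):
    """Hypothetical decorator: turn any exception into a readable message."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as error:
            # Return the problem as text instead of letting it propagate.
            return f"{func.__name__} failed: {error}"

    return wrapper


@errors_as_text
def read_image_bytes(path: str) -> str:
    with open(path, "rb") as file:
        return f"Read {len(file.read())} bytes."


# The file does not exist, so we get a message back instead of a traceback.
print(read_image_bytes("/no/such/folder/image.png"))
```

<p>Catching every exception like this is a deliberate trade-off for tool functions. In regular application code you would usually want to catch narrower exception types.</p>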



<h3 class="wp-block-heading">Input interface</h3>



<p>Before defining our tool itself, we&#8217;re going to define the exact input interface that our tool will accept. Behind the scenes LangChain will use this to generate the JSON schema that the OpenAI API requires for function and tool calling. Add the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class GenerateImageInput(BaseModel):
    image_description: str = Field(
        description="A detailed description of the desired image."
    )</pre>



<p>We use <code>pydantic</code> to define a <code>GenerateImageInput</code> class which inherits from <code>BaseModel</code>. This will allow us to clearly define the input arguments our tool will need in order to run, as the LLM will need this information when calling a tool or deciding whether to call a tool or not.</p>



<p>We define a single field <code>image_description</code> which is a string and we use <code>Field</code> to add a description to the field. So we want an input argument of <code>image_description</code> which is a string that describes the image we want to generate. If you need multiple arguments you can define these here as well in the same fashion. For our uses, this one argument will do here.</p>



<h3 class="wp-block-heading">Tool definition</h3>



<p>Now it&#8217;s time to write our actual tool using the <code>@tool</code> decorator. Add the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">@tool("generate_image", args_schema=GenerateImageInput)
def generate_image(image_description: str) -> str:
    """Generate an image based on a detailed description."""
    response = CLIENT.images.generate(
        model="dall-e-3",
        prompt=image_description,
        size="1024x1024",
        quality="standard",  # standard or hd
        n=1,
    )
    image_url = response.data[0].url
    return image_downloader(image_url)</pre>



<p>We start with the <code>@tool</code> decorator which takes the name of the tool as the first argument and the schema of the input arguments as the second argument, passing in our <code>GenerateImageInput</code> class we defined earlier.</p>



<p>After that, we declare the function itself, which takes a string as input with the image description and will return an image path in string format. Note that we included a docstring that describes what the tool does: <code>"""Generate an image based on a detailed description."""</code>.</p>



<p>This docstring is required when defining tools using the <code>@tool</code> decorator and is the description that will be used for the OpenAI tool schema generated behind the scenes that helps the LLM agent choose which function(s) to call. For this reason you must make sure it is an adequate description of what the tool does and what its purpose is.</p>
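
<p>For reference, the tool schema that ends up being sent to the OpenAI API looks roughly like the dictionary below. This is a hand-written approximation that follows the OpenAI tool format, not captured LangChain output, so the exact result generated for you may differ in its details:</p>

```python
import json

# Hand-written approximation of the tool schema; not captured LangChain output.
generate_image_schema = {
    "type": "function",
    "function": {
        "name": "generate_image",
        # Taken from the docstring of the tool function.
        "description": "Generate an image based on a detailed description.",
        "parameters": {
            "type": "object",
            "properties": {
                # Taken from the GenerateImageInput field and its description.
                "image_description": {
                    "type": "string",
                    "description": "A detailed description of the desired image.",
                }
            },
            "required": ["image_description"],
        },
    },
}

print(json.dumps(generate_image_schema, indent=2))
```

<p>Notice how the tool name, the docstring, and the <code>Field</code> description all end up in places the LLM can read, which is why clear wording matters.</p>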



<p>After that we simply make a vanilla Dall-e image generation API request using <code>CLIENT.images.generate</code> with the model set to <code>dall-e-3</code>, the prompt set to the <code>image_description</code> we received as input, the size set to <code>1024x1024</code>, the quality set to <code>standard</code>, and the number of images to generate set to <code>1</code>. You can of course call on any image generation API you want, but as we already have an OpenAI key set we will use Dall-e here to keep things simple.</p>



<p>We then extract the URL by accessing <code>response.data[0].url</code> and return the result of calling the <code>image_downloader</code> function we defined earlier with the image URL as input. As the <code>image_downloader</code> function saves the image to file and returns a path to it in string form, this fulfills our promise of having the <code>generate_image</code> function return a string file path to the image requested.</p>



<h3 class="wp-block-heading">Test run</h3>



<p>Tools are just functions except we clearly defined the input arguments, name, and the purpose of the function using a docstring. Now let&#8217;s give our tool a test run by adding the following to the bottom of the file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">if __name__ == "__main__":
    print(generate_image.run("A picture of sharks eating pizza in space."))</pre>



<p>If this file is the main file being run, the <code>generate_image</code> function will be called for a quick test. If we import the tool from elsewhere this code block will not be triggered. Note that we call the <code>run</code> method on a tool in order to run it, as this is part of the defined interface for LangChain tools.</p>



<p>So go ahead and run this file and you should see an image appear in the <code>images</code> folder in the root of your project, indicating that it worked. Make sure you didn&#8217;t forget to create the empty <code>images</code> folder in the root of your project.</p>



<p>My image here is pretty epic, I must say <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f988.png" alt="🦈" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f355.png" alt="🍕" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" />:</p>



<figure class="wp-block-image size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/23f239d8-66df-4d58-98d8-98f9ae5f0530.png" alt="" class="wp-image-4355"/></figure>



<p>It is interesting to see that Dall-e chose pepperoni pizza as a default pizza. Sorry if I made you hungry yet again <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f605.png" alt="😅" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f355.png" alt="🍕" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f355.png" alt="🍕" class="wp-smiley" style="height: 1em; max-height: 1em;" />.</p>



<h3 class="wp-block-heading">Weather tool</h3>



<p>Ok with that settled, save and close up this file, and let&#8217;s move on to our second tool which will get the current weather in a certain location. We&#8217;ll go through this one quickly as the process is very similar to the first tool.</p>



<p>First, sign up for a free account at https://www.weatherapi.com/. They will give you the pro plan free for 14 days, after which it automatically switches back to the free plan. You don&#8217;t have to provide any payment or credit card information, so the sign-up is fast and totally free.</p>



<p>Sign up and then get yourself an API key:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/Screenshot-2024-03-24-193749-1024x691.jpg" alt="" class="wp-image-4356"/></figure>



<p>Now add your new API key to your <code>.env</code> file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">OPENAI_API_KEY=your_api_key_here
LANGCHAIN_API_KEY=your_api_key_here
WEATHER_API_KEY=your_api_key_here</pre>



<p>Save and close that, and now let&#8217;s create a new file in the <code>tools</code> folder called <code>weather.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> FINX_LANGGRAPH
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> images
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> tools
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> image.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> weather.py    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New file
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> .env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> langchain_basics.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile.lock
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> setup_environment.py</pre>



<p>In the <code>weather.py</code> file we will define our second tool. Let&#8217;s get started with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from json import dumps

import requests
from decouple import config
from langchain.tools import tool
from pydantic import BaseModel, Field</pre>



<p>We also import <code>dumps</code> from <code>json</code>, which lets us convert a dictionary to a string, since the tool has to hand its result back to the LLM as text. The rest of the imports are familiar from the <code>generate_image</code> tool we made. Let&#8217;s define the input interface for our weather tool using a pydantic model:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class WeatherInput(BaseModel):
    location: str = Field(description="Must be a valid location in city format.")</pre>



<p>This is the same as with the other tool. Again, make sure the description is a good one, as the LLM agent will make use of it. Let&#8217;s define our function that will call the weather API and return the response. Add the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">@tool("get_weather", args_schema=WeatherInput)
def get_weather(location: str) -> str:
    """Get the current weather for a specified location."""
    if not location:
        return (
            "Please provide a location and call the get_weather tool again."
        )
    API_params = {
        "key": config("WEATHER_API_KEY"),
        "q": location,
        "aqi": "no",
        "alerts": "no",
    }
    response: requests.models.Response = requests.get(
        "http://api.weatherapi.com/v1/current.json", params=API_params
    )
    str_response: str = dumps(response.json())
    return str_response</pre>



<p>We start with the <code>@tool</code> decorator with the name of the tool and the input schema as before. We then define the function itself which takes a string as input with the location and will return a string with the weather data. We include a docstring that describes what the tool does and is for so the LLM agent can make use of this.</p>



<p>If the location is not provided we return a message asking the LLM to provide a location and call the function again. We then define the API parameters as a dictionary with the API key which we read from the <code>.env</code> file using <code>config</code>, the location (<code>q</code>), and two optional parameters <code>aqi</code> (air quality index) and <code>alerts</code> set to <code>no</code>.</p>
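<p>To make the role of these parameters concrete, here is a minimal sketch of how such a dictionary ends up in the request URL as a query string. It uses only the standard library; <code>build_weather_url</code> and the <code>demo-key</code> value are hypothetical names for illustration, and <code>requests</code> performs an equivalent encoding for you when you pass <code>params</code>.</p>

```python
from urllib.parse import urlencode

BASE_URL = "http://api.weatherapi.com/v1/current.json"


def build_weather_url(location: str, api_key: str) -> str:
    """Return the request URL with the parameters encoded as a query string."""
    api_params = {
        "key": api_key,  # placeholder, the real key comes from the .env file
        "q": location,  # the location to look up
        "aqi": "no",  # skip air quality data
        "alerts": "no",  # skip weather alerts
    }
    return f"{BASE_URL}?{urlencode(api_params)}"


# Spaces in the location are encoded so the URL stays valid.
print(build_weather_url("New York", "demo-key"))
```

<p>Printing the URL like this can be handy for debugging, but be careful not to log your real API key.</p>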



<p>We then make a request to the weather API using <code>requests.get</code> with the URL <code>http://api.weatherapi.com/v1/current.json</code> and the API parameters. This will return a <code>Response</code> object from <code>requests.models</code> which we can convert to a dictionary using its <code>.json()</code> method. We then convert the dictionary to a string using the <code>dumps</code> (dump string) function we imported and return the string with the weather data.</p>



<p>Let&#8217;s add a quick test just like with the other tool:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">if __name__ == "__main__":
    print(get_weather.run("New York"))</pre>



<p>Now go ahead and give it a test run and you should see something like the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{"location": {"name": "New York", "region": "New York", "country": "United States of America", "lat": 40.71, "lon": -74.01, "tz_id": "America/New_York", "localtime_epoch": 1711278898, "localtime": "2024-03-24 7:14"}, "current": {"last_updated_epoch": 1711278000, "last_updated": "2024-03-24 07:00", "temp_c": -0.6, "temp_f": 30.9, "is_day": 1, "condition": {"text": "Sunny", "icon": "//cdn.weatherapi.com/weather/64x64/day/113.png", "code": 1000}, "wind_mph": 2.2, "wind_kph": 3.6, "wind_degree": 2, "wind_dir": "N", "pressure_mb": 1020.0, "pressure_in": 30.13, "precip_mm": 0.0, "precip_in": 0.0, "humidity": 49, "cloud": 0, "feelslike_c": -5.9, "feelslike_f": 21.5, "vis_km": 16.0, "vis_miles": 9.0, "uv": 2.0, "gust_mph": 15.8, "gust_kph": 25.4}}</pre>
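<p>The full response is fairly long. If you want to pass less text to the LLM, one option is to keep only a handful of fields before returning. The sketch below is an optional variation and not part of the course code; <code>summarize_weather</code> and the choice of fields are assumptions for illustration, and the sample data is a shortened copy of the output above.</p>

```python
from json import dumps, loads


def summarize_weather(raw_json: str) -> str:
    """Keep only a few fields from a current-weather response string."""
    data = loads(raw_json)
    summary = {
        "name": data["location"]["name"],
        "localtime": data["location"]["localtime"],
        "temp_c": data["current"]["temp_c"],
        "condition": data["current"]["condition"]["text"],
        "wind_kph": data["current"]["wind_kph"],
        "humidity": data["current"]["humidity"],
    }
    return dumps(summary)


# Shortened sample based on the response shown above.
sample_response = dumps(
    {
        "location": {"name": "New York", "localtime": "2024-03-24 7:14"},
        "current": {
            "temp_c": -0.6,
            "condition": {"text": "Sunny", "code": 1000},
            "wind_kph": 3.6,
            "humidity": 49,
            "uv": 2.0,
        },
    }
)

print(summarize_weather(sample_response))
```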



<p>Excellent! We now have some functions for our agents to play around with while we explore building more complex systems using graphs.</p>



<h3 class="wp-block-heading">Simplifying tool imports</h3>



<p>There is one quick thing left to do before we move on to the next part. The way our <code>tools</code> folder is set up right now we would have to import the tools from the <code>tools</code> folder in a kind of awkward way:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Example, no need to copy - we will not use this code
from tools import weather, image

weather.get_weather("Alabama")
image.generate_image(
    "A T-rex made from kentucky fried chicken is attacking the white house."
)</pre>



<p>This <code>weather.get_weather</code> is kind of awkward, so let&#8217;s create an <code>__init__.py</code> file in the <code>tools</code> folder to make it easier to import the tools. Create a new file called <code>__init__.py</code> in the <code>tools</code> folder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> FINX_LANGGRAPH
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> images
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> tools
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> __init__.py    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New file
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> image.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> weather.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> .env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> langchain_basics.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile.lock
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> setup_environment.py</pre>



<p>In the <code>__init__.py</code> file add the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from .image import generate_image
from .weather import get_weather</pre>



<p>This will import the <code>generate_image</code> and <code>get_weather</code> tools from their respective files and make them available when importing the <code>tools</code> folder. It has effectively made the <code>tools</code> folder a package that can be imported from as a single entity.</p>



<p>Now the above example can be changed to this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Example, no need to copy - we will not use this code
from tools import get_weather, generate_image

get_weather("Alabama")
generate_image("A T-rex made from kentucky fried chicken is attacking the white house.")</pre>



<p>This is a lot more sensible. Save and close the <code>__init__.py</code> file and we are done with this part. In the next part, it is time to dive into LangGraph and start building some more complex systems using agents and tool calls to interlink them into a graph that can do some cool stuff. See you there!</p>



<p>P.S. I know you are secretly curious what the T-rex made from KFC attacking the white house looks like <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f605.png" alt="😅" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f357.png" alt="🍗" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f996.png" alt="🦖" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f3db.png" alt="🏛" class="wp-smiley" style="height: 1em; max-height: 1em;" />. Here it is:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/0fc772bb-fc8e-4a86-b69e-7dd6ae8136d4.png" alt="" class="wp-image-4357"/></figure>
</div>





<p>Kentucky Fried T-rex, anyone?</p>



<h2 class="wp-block-heading">LangGraph Introduction</h2>



<p>Hello and welcome back to part 3 of this tutorial series. In this part, we&#8217;ll be getting started with LangGraph. Instead of having a lot of explanation before we start, we&#8217;ll see how stuff works as we go along. So without further ado, let&#8217;s just jump right in.</p>



<p>Let&#8217;s start by actually installing LangGraph, as it doesn&#8217;t get installed by default with LangChain. To install LangGraph, you can use the following command in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install langgraph==0.0.30 langchainhub==0.1.15</pre>



<p>Once you&#8217;ve installed LangGraph, let&#8217;s start by creating a new file called <code>simple_langgraph.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> FINX_LANGGRAPH
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> images
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c2.png" alt="📂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> tools
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> .env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> langchain_basics.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Pipfile.lock
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> setup_environment.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> simple_langgraph.py    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />New file</pre>



<p>Over the next three parts, we&#8217;ll be looking at different ways in which you can use LangGraph to chain LLMs and tools together. In this first part we&#8217;ll be looking at a simple classic <code>LLM</code> &#8211;&gt; goes to a <code>tool executor</code> &#8211;&gt; and then back to <code>LLM</code> type setup.</p>



<p>Open up <code>simple_langgraph.py</code> and let&#8217;s start by importing the necessary modules:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import operator
from typing import Annotated, TypedDict, Union

from colorama import Fore, Style
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain_core.agents import AgentAction, AgentActionMessageLog, AgentFinish
from langchain_core.messages import BaseMessage
from langchain_core.runnables.base import Runnable
from langchain_openai.chat_models import ChatOpenAI
from langgraph.graph import END, StateGraph
from langgraph.prebuilt.tool_executor import ToolExecutor

from setup_environment import set_environment_variables
from tools import generate_image, get_weather</pre>



<p>That is a lot of stuff! Don&#8217;t worry, most of it is actually not as complex as it seems. Usually, I&#8217;ll go over all the imports before we get started, but as there are quite a few to go through, I&#8217;ll cover each import when we get to the part where it&#8217;s used instead. For now, just have them copied.</p>



<p>Next, we&#8217;ll set the environment variables and define a couple of constants:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">set_environment_variables("LangGraph Basics")

LLM = ChatOpenAI(model="gpt-3.5-turbo-0125", streaming=True)
TOOLS = [get_weather, generate_image]
PROMPT = hub.pull("hwchase17/openai-functions-agent")</pre>



<p>We reused our <code>set_environment_variables</code> function from the previous part to set the environment variables and set the name for the LangSmith traces to <code>LangGraph Basics</code>. We then define our LLM just like we did in part 1, also setting the streaming parameter to <code>True</code>. We then define a list of tools which is literally just a list containing the two tools that we wrote.</p>



<h3 class="wp-block-heading">The LangChain Hub</h3>



<p>For the prompt template, we pull it from the LangChain Hub this time, mostly because I want to show you that it exists! The LangChain Hub is kind of like a mini-GitHub for storing LangChain <code>ChatPromptTemplates</code> just like the simple ones we wrote in part 1. You can push new commits to your templates and pull them like we just did here.</p>



<p>You can go to https://smith.langchain.com/ and scroll down to find the Hub button:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-7-1024x576.png" alt="" class="wp-image-4361"/></figure>



<p>Click it to visually browse the prompts available on the hub:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-1024x771.png" alt="" class="wp-image-4362"/></figure>



<p>You can use this as a convenient place to store your prompts. You can also set them to private if you don&#8217;t want to share them with the world and you can even fork other public prompts that you like to your own repositories. It&#8217;s a handy tool for development. For production or highly sensitive company data, you might want to store your prompts in a more secure location.</p>



<p>If we look up the prompt we just pulled, we can see that it is a fairly simple prompt:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-1-1024x772.png" alt="" class="wp-image-4363"/></figure>



<p>It has an extremely basic system message of <code>"You are a helpful assistant"</code> and we can see that it has placeholders for <code>chat_history</code>, human <code>input</code> and an <code>agent_scratchpad</code>. The <code>chat_history</code> and <code>input</code> are kind of self-explanatory in that they hold the chat history so far and the human input, but what about this <code>agent_scratchpad</code>?</p>



<p>The <code>agent_scratchpad</code> is a place where the agent can take notes while going through its reasoning process of what action should be taken next and what functions should be called. Think of it as a notepad where the LLM can jot down its thoughts, kind of like the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">user:
    "Can you recommend me a zombie game from the year 2022?"

    > Entering new AgentExecutor chain...
    Thought: Oh, I love zombie games! There are so many great ones out there. Let me think about the best zombie game from 2022.
    Action: use_search_engine
    Action Input: "best zombie game 2022"

    Observation:[{list of search result objects for query "best zombie game 2022"}]
    There are three great zombie games from 2022 that I found: Zombie Cure Lab, Zombie Survivors, and SurrounDead. Let me think about which one to recommend.
    Action: use_search_engine
    Action Input: "Zombie Cure Lab"

    Observation:[{list of search result objects for query "Zombie Cure Lab"}]
    Zombie Cure Lab is a game where you manage a lab and try to cure the zombie virus. (Bunch more info here yadayada...) I recommend Zombie Cure Lab as the best zombie game from 2022.

    Final Answer: The best zombie game from 2022 is Zombie Cure Lab.</pre>



<p>This is just a conceptual example here to describe the idea, but the agent takes reasoning steps and makes observations along the way, first deciding to call a search engine tool to better answer the user question, then deciding to call the search engine tool to get more information on one of the games in particular, and then finally deciding that it has enough information to answer the user question.</p>



<p>So the <code>agent_scratchpad</code> is used to store these intermediate observations on what action to take next, but also to decide when the agent is done, so that it doesn&#8217;t just keep looping indefinitely. We&#8217;ll get back to how we can see when the agent is done in a moment.</p>
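<p>The loop described above can be sketched in plain Python. This is a toy illustration only, not how LangChain implements it: <code>fake_llm</code>, <code>use_search_engine</code>, and <code>run_agent</code> are made-up stand-ins, and the &#8220;LLM&#8221; here simply asks for one search and then finishes.</p>

```python
def fake_llm(question: str, scratchpad: list[str]) -> dict:
    """Stand-in for a real LLM: request one search, then give a final answer."""
    if not scratchpad:
        return {"action": "use_search_engine", "action_input": "best zombie game 2022"}
    return {"final_answer": "Zombie Cure Lab"}


def use_search_engine(query: str) -> str:
    """Stand-in tool that pretends to search the web."""
    return f"[search results for {query!r}]"


def run_agent(question: str, max_steps: int = 5) -> tuple[str, list[str]]:
    """Loop until the stand-in LLM gives a final answer or the step limit is hit."""
    scratchpad: list[str] = []
    for _ in range(max_steps):
        decision = fake_llm(question, scratchpad)
        if "final_answer" in decision:
            # The agent considers itself done, so we stop looping.
            return decision["final_answer"], scratchpad
        observation = use_search_engine(decision["action_input"])
        # Note down what was done and what came back, for the next round.
        scratchpad.append(
            f"Action: {decision['action']} | Input: {decision['action_input']} "
            f"| Observation: {observation}"
        )
    return "Stopped: step limit reached.", scratchpad


answer, notes = run_agent("Can you recommend me a zombie game from the year 2022?")
print(answer)
print(notes)
```

<p>The <code>max_steps</code> limit is a simple safety net of my own choosing for this sketch, so a confused agent cannot loop forever.</p>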



<h3 class="wp-block-heading">The State Object</h3>



<p>Ok, we have an LLM, some tools, and a prompt template. The next thing we need is a state object to keep track of the state for each step along our graph. So a LangGraph is kind of like a state machine, and it is going to take this state object and pass it along each node of the graph. Let&#8217;s look at a simplified example:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Simplified example
StateObject():
    user_input = "please do a for me"
    chat_history = [list of previous chat messages for context...]
    am_i_done = False
    steps_taken = []</pre>



<p>So say we have this state object above. We have received the user input question, and whatever chat history has come before if we have decided to implement memory. We have a flag <code>am_i_done</code> which is obviously set to <code>False</code> at the start, and we have a list of <code>steps_taken</code> which is empty at the start. Now we hand this state object to <code>node A</code> in our graph -&gt;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Simplified example Node A
StateObject():
    user_input = "please do a for me"
    chat_history = [list of previous chat messages for context...]
    am_i_done = False
    steps_taken = ["action_a was taken"]</pre>



<p>It does some action we will just call action_a, which has taken it a step closer to answering the user question but it is not quite done yet so the <code>am_i_done</code> flag is still set to false. Now <code>node A</code> passes this state object to <code>node B</code> in our graph -&gt;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Simplified example Node B
StateObject():
    user_input = "please do a for me"
    chat_history = [list of previous chat messages for context...]
    am_i_done = True
    steps_taken = ["action_a was taken", "action_b was taken"]</pre>



<p>This node does some action_b stuff and now has the final answer it needs to give to the user. It sets the <code>am_i_done</code> flag to <code>True</code> because it is done. We can use this <code>am_i_done</code> flag to test if the graph is completed yet (i.e. the user question or request has been fully answered).</p>
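<p>The simplified walkthrough above can be turned into a small runnable sketch. It uses plain Python functions as nodes and no LangGraph at all; <code>SimpleState</code>, <code>node_a</code>, <code>node_b</code>, and <code>run_nodes</code> are hypothetical names used only to illustrate the idea of passing a state object along.</p>

```python
from typing import TypedDict


class SimpleState(TypedDict):
    user_input: str
    chat_history: list[str]
    am_i_done: bool
    steps_taken: list[str]


def node_a(state: SimpleState) -> SimpleState:
    """Take a first step; the request is not fully handled yet."""
    return {**state, "steps_taken": state["steps_taken"] + ["action_a was taken"]}


def node_b(state: SimpleState) -> SimpleState:
    """Take the final step and mark the state as done."""
    return {
        **state,
        "am_i_done": True,
        "steps_taken": state["steps_taken"] + ["action_b was taken"],
    }


def run_nodes(state: SimpleState) -> SimpleState:
    """Hand the state from node to node until one of them reports it is done."""
    for node in (node_a, node_b):
        state = node(state)
        if state["am_i_done"]:
            break
    return state


start: SimpleState = {
    "user_input": "please do a for me",
    "chat_history": [],
    "am_i_done": False,
    "steps_taken": [],
}
print(run_nodes(start))
```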



<p>So as the graph traverses over the nodes we define, each node will receive the state object, update it where needed, and then pass it along to the next node, or perhaps back to the previous one, or sideways to node D if a certain condition is met. So let&#8217;s define the real state object that we will be using:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class AgentState(TypedDict):
    input: str
    chat_history: list[BaseMessage]
    agent_outcome: Union[AgentAction, AgentFinish, None]
    intermediate_steps: Annotated[list[tuple[AgentAction, str]], operator.add]</pre>



<p>We use a <code>TypedDict</code> to define a specific dictionary structure, defining the keys that this dictionary will have and the types of values that will be stored for each of those keys. The first entry is simply the user input, which is a <code>str</code> string value.</p>



<p>The second entry is the chat history, which is a <code>list</code> of <code>BaseMessage</code> objects. A <code>BaseMessage</code> holds a message plus the role of its originator, like &#8220;system&#8221;, &#8220;human&#8221;, or &#8220;ai&#8221;. Each line below is a role and message pair that corresponds to one such message:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Example BaseMessages
("system", "You are a helpful AI bot. Your name is {name}."),
("human", "Hello, how are you doing?"),
("ai", "I'm doing well, thanks!"),
("human", "{user_input}"),</pre>



<p>The third item in the state object will be <code>agent_outcome</code>. The agent here will do its thing and then either return an <code>AgentAction</code> object or an <code>AgentFinish</code> object to us.</p>



<ul class="wp-block-list">
<li><strong>AgentAction:</strong> An <code>AgentAction</code> object simply contains the name of the tool the agent wants to call and the input arguments for that tool call, maybe like <code>get_weather</code> and <code>{"location": "New York"}</code>.</li>



<li><strong>AgentFinish:</strong> An <code>AgentFinish</code> object simply means that the agent considers its task finished and holds the final <code>return_values</code> inside.</li>
</ul>



<p>Using this <code>agent_outcome</code> object we can see what the next step is or if it is done.</p>
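<p>As a rough sketch of how such a check could look, the snippet below decides on the next step based on the type of the outcome. To keep it runnable without LangChain it uses two stand-in dataclasses, <code>FakeAgentAction</code> and <code>FakeAgentFinish</code>, which only mimic the fields described above; <code>next_step</code> and its return values are assumptions for illustration as well.</p>

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class FakeAgentAction:
    """Stand-in for AgentAction: which tool to call and with what input."""

    tool: str
    tool_input: dict


@dataclass
class FakeAgentFinish:
    """Stand-in for AgentFinish: the agent is done and holds the final values."""

    return_values: dict


def next_step(agent_outcome: Union[FakeAgentAction, FakeAgentFinish]) -> str:
    """Return 'end' when the agent is finished, otherwise 'call_tool'."""
    if isinstance(agent_outcome, FakeAgentFinish):
        return "end"
    return "call_tool"


print(next_step(FakeAgentAction("get_weather", {"location": "New York"})))
print(next_step(FakeAgentFinish({"output": "It is sunny in New York."})))
```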



<p>The fourth and last entry in the <code>AgentState</code> object is a bit easier to read from the inside out. We have a <code>list</code> of <code>tuples</code> where each tuple contains an <code>AgentAction</code> object and a <code>str</code> string. The <code>AgentAction</code> here is the same object that we described in the step above, containing a tool to be called and its input arguments. The difference here is that the step has already been taken, and the string which is the second item in the tuple is the tool output after it was called. So something like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">## Fictional example object
[
    (
        AgentAction(tool="get_weather", input={"location": "New York"}),
        "{API response JSON object...}",
    ),
    (
        AgentAction(tool="generate_image", input={"image_description": "cat"}),
        "Path/to/image.png",
    ),
]</pre>



<p>The <code>Annotated</code> type hint attaches metadata to a type. Here the metadata is the <code>operator.add</code> function, which LangGraph reads as a reducer for this key: when a node returns a new value for <code>intermediate_steps</code>, it is added to the existing list instead of overwriting it. So the <code>AgentState</code> object&#8217;s <code>intermediate_steps</code> list keeps growing as steps are taken, like the example above.</p>
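<p>To picture what a framework can do with an <code>Annotated</code> hint like this at runtime, here is a simplified sketch using only the standard library. It is not LangGraph&#8217;s actual implementation; <code>DemoState</code> and <code>apply_update</code> are hypothetical names. The idea is that keys annotated with a function get their updates combined with the old value, while all other keys are simply overwritten.</p>

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class DemoState(TypedDict):
    input: str
    intermediate_steps: Annotated[list[str], operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Merge an update into the state, using annotated functions where present."""
    hints = get_type_hints(DemoState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated types carry their extra arguments in __metadata__.
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and callable(metadata[0]):
            # e.g. operator.add(old_list, new_list) concatenates the lists.
            merged[key] = metadata[0](state[key], value)
        else:
            # No function attached, so the new value replaces the old one.
            merged[key] = value
    return merged


state = {"input": "weather please", "intermediate_steps": ["step 1"]}
state = apply_update(state, {"intermediate_steps": ["step 2"]})
print(state["intermediate_steps"])
```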



<h3 class="wp-block-heading">The Agent</h3>



<p>Now that we have our state object defined, we will define our agent that will have access to both the <code>generate_image</code> and <code>get_weather</code> tools:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">runnable_agent: Runnable = create_openai_functions_agent(LLM, TOOLS, PROMPT)</pre>



<p>We use the <code>create_openai_functions_agent</code> function we imported from LangChain to create an agent that has access to the LLM, the tools, and the prompt we defined so far. LangChain will make this into an OpenAI compatible agent by combining them for us into a <code>Runnable</code> type object. We have seen this <code>Runnable</code> object before in part 1 in the form of our chains. All <code>Runnable</code> type objects have the <code>invoke</code>, <code>stream</code>, and <code>batch</code> methods just like the chains we used in part 1.</p>



<p>Before we move on with the nodes and graph let&#8217;s test the agent we have so far. We&#8217;ll manually create a quick input here (as we haven&#8217;t built our graph yet) and then call <code>invoke</code> on the agent:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">inputs = {
    "input": "give me the weather for New York please.",
    "chat_history": [],
    "intermediate_steps": [],
}

agent_outcome = runnable_agent.invoke(inputs)
print(agent_outcome)</pre>



<p>Now go ahead and run this to test the agent so far and you should see something like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">API Keys loaded and tracing set with project name:  LangGraph Basics
tool='get_weather' tool_input={'location': 'New York'} log="\nInvoking: `get_weather` with `{'location': 'New York'}`\n\n\n" message_log=[AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"location":"New York"}', 'name': 'get_weather'}}, response_metadata={'finish_reason': 'function_call'})]</pre>



<p>We can see the agent wants to call the <code>get_weather</code> tool with the input <code>{"location": "New York"}</code>, so it&#8217;s asking us to call this function with these input arguments. Of course, it stopped running there as we haven&#8217;t linked up any other nodes yet, but we know that the agent is working so far.</p>



<p>Go ahead and remove the test <code>inputs</code> and <code>agent_outcome</code> code. Just for clarity, here is what you should have so far:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import operator
from typing import Annotated, TypedDict, Union

from colorama import Fore, Style
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain_core.agents import AgentAction, AgentActionMessageLog, AgentFinish
from langchain_core.messages import BaseMessage
from langchain_core.runnables.base import Runnable
from langchain_openai.chat_models import ChatOpenAI
from langgraph.graph import END, StateGraph
from langgraph.prebuilt.tool_executor import ToolExecutor

from setup_environment import set_environment_variables
from tools import generate_image, get_weather


set_environment_variables("LangGraph Basics")

LLM = ChatOpenAI(model="gpt-3.5-turbo-0125", streaming=True)
TOOLS = [get_weather, generate_image]
PROMPT = hub.pull("hwchase17/openai-functions-agent")


class AgentState(TypedDict):
    input: str
    chat_history: list[BaseMessage]
    agent_outcome: Union[AgentAction, AgentFinish, None]
    intermediate_steps: Annotated[list[tuple[AgentAction, str]], operator.add]


runnable_agent: Runnable = create_openai_functions_agent(LLM, TOOLS, PROMPT)</pre>



<h3 class="wp-block-heading">The Nodes</h3>



<p>So now the first thing we need to do is to create some nodes here so we can string them together into a graph. Let&#8217;s start with the Agent Node:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def agent_node(input: AgentState):
    agent_outcome: AgentActionMessageLog = runnable_agent.invoke(input)
    return {"agent_outcome": agent_outcome}</pre>



<p>We define the node as a simple function that takes <code>input</code>, which will be the <code>AgentState</code> object for all nodes. It then calls the <code>invoke</code> method on the agent with the input and stores the return value in a variable named <code>agent_outcome</code>, which we type-hint as <code>AgentActionMessageLog</code>. This <code>agent_outcome</code> will hold either the <code>AgentAction</code> object or the <code>AgentFinish</code> object that we talked about earlier, indicating what the next step is or that the agent is done. Whatever is in the <code>agent_outcome</code>, this function simply returns it in a dictionary.</p>



<p>Now that we have an agent node we need another node to execute the tools that the agent wants to call. Let&#8217;s define the Tool Executor Node:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tool_executor = ToolExecutor(TOOLS)

def tool_executor_node(input: AgentState):
    agent_action = input["agent_outcome"]
    output = tool_executor.invoke(agent_action)
    print(f"Executed {agent_action} with output: {output}")
    return {"intermediate_steps": [(agent_action, output)]}</pre>



<p>First, we create a new instance of the <code>ToolExecutor</code> class that we imported from LangGraph. This <code>ToolExecutor</code> is initialized by giving it our list of tools which includes two tools in this case. The <code>ToolExecutor</code> provides a prebuilt interface that will extract the function and arguments the agent wants to call from the <code>AgentAction</code> object and then call the function with the arguments so we don&#8217;t have to do this manually.</p>



<p>Then we define the <code>tool_executor_node</code> function which again is just a simple function with <code>input</code> (which will be the state object). We extract the <code>agent_action</code> from the input dictionary and then call the <code>invoke</code> method on the <code>tool_executor</code> object which will run whatever tool the agent wants to call for us.</p>
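<p>To make the idea concrete, here is a minimal pure-Python sketch of what dispatching a tool call conceptually involves. It is not the actual <code>ToolExecutor</code> implementation; <code>FakeAction</code>, <code>fake_get_weather</code>, <code>TOOL_REGISTRY</code>, and <code>run_tool</code> are hypothetical stand-ins so the snippet runs without any API keys:</p>

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FakeAction:
    """Hypothetical stand-in for AgentAction: a tool name plus its arguments."""

    tool: str
    tool_input: dict[str, Any]


def fake_get_weather(location: str) -> str:
    # Placeholder tool; the real get_weather tool calls a weather API.
    return f"Sunny in {location}"


# Map tool names to callables, similar in spirit to handing a list of tools
# to the executor (assumption: this is only an illustration of the concept).
TOOL_REGISTRY: dict[str, Callable[..., str]] = {"get_weather": fake_get_weather}


def run_tool(action: FakeAction) -> str:
    """Look up the requested tool by name and call it with the given arguments."""
    if action.tool not in TOOL_REGISTRY:
        return f"{action.tool} is not a valid tool"
    return TOOL_REGISTRY[action.tool](**action.tool_input)


print(run_tool(FakeAction("get_weather", {"location": "New York"})))
```

<p>The point is only that the agent never runs a function itself: it names a tool and its arguments, and something else has to look the tool up and call it.</p>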



<p>We have a print statement just for our own visual feedback here, and then we return the <code>intermediate_steps</code> list with the <code>agent_action</code> and the output of the tool call. Notice that this is the intermediate steps list that we defined in the <code>AgentState</code> object and talked about earlier and will be added to whatever steps were already there.</p>
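<p>If the &#8220;added to whatever was already there&#8221; part feels abstract, the following pure-Python sketch imitates the idea. It is an illustration under my own assumptions, not LangGraph&#8217;s internal code; <code>DemoState</code> and <code>apply_update</code> are hypothetical names:</p>

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class DemoState(TypedDict):
    agent_outcome: str
    intermediate_steps: Annotated[list[str], operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state (illustrative only)."""
    hints = get_type_hints(DemoState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] hints expose their extra arguments via __metadata__.
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata and key in merged:
            reducer = metadata[0]
            merged[key] = reducer(merged[key], value)  # here: list + list
        else:
            merged[key] = value  # keys without a reducer are overwritten
    return merged


state = {"agent_outcome": "call get_weather", "intermediate_steps": ["step 1"]}
state = apply_update(state, {"intermediate_steps": ["step 2"]})
state = apply_update(state, {"agent_outcome": "done"})
print(state)
```

<p>Keys annotated with <code>operator.add</code> grow over time, while plain keys like <code>agent_outcome</code> are simply replaced by the latest value.</p>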



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-2-1024x717.png" alt="" class="wp-image-4364"/></figure>



<p>Now that we have these two functions for the nodes, we need a way to test if we want to finish the graph because the Agent Node has arrived at the final answer or if we need to continue on to the Executor node because it needs to execute a tool call. We can do this by defining a function that will check if the agent is done:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def continue_or_end_test(data: AgentState):
    if isinstance(data["agent_outcome"], AgentFinish):
        return "END"
    else:
        return "continue"</pre>



<p>This function takes the <code>AgentState</code> object as input. Then it simply indexes into the <code>agent_outcome</code>. We said earlier that the <code>agent_outcome</code> will either be an <code>AgentAction</code> object (if still working) or an <code>AgentFinish</code> object if the agent is done. So if the <code>agent_outcome</code> is an instance of <code>AgentFinish</code> we return <code>"END"</code> to signal that the graph is done, otherwise, we return <code>"continue"</code> to signal that the graph should continue.</p>



<h3 class="wp-block-heading">Creating our Graph</h3>



<p>Now that we have two nodes and a test to see if we need to continue (this is just a very simple first example to explain the concepts), we can define our graph. The main type of graph in LangGraph is called a <code>StateGraph</code>, which passes a <code>state</code> object around as we discussed. Each node then returns some kind of update to that state, either setting specific attributes or adding to an existing attribute like the <code>intermediate_steps</code> list.</p>



<p>Setting up our graph is easy:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">workflow = StateGraph(AgentState)

workflow.add_node("agent", agent_node)
workflow.add_node("tool_executor", tool_executor_node)

workflow.set_entry_point("agent")</pre>



<p>First, we instantiate a new <code>StateGraph</code>, passing in the <code>AgentState</code> class that we defined. We then simply add our two nodes, giving each a string name as the first argument and passing in the function we wrote as the second. Lastly, we set the entry point to the <code>agent</code> node, which is the first node that will be called when we start the graph.</p>



<figure class="wp-block-image size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-3.png" alt="" class="wp-image-4365"/></figure>



<p>Now we have a graph with an entry point. The next step is to define the connections called <code>edges</code> between the nodes. This is also very easy:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">workflow.add_edge("tool_executor", "agent")

workflow.add_conditional_edges(
    "agent", continue_or_end_test, {"continue": "tool_executor", "END": END}
)</pre>



<p>First, we add an edge from the <code>tool_executor</code> node back to the <code>agent</code> node. After we execute a tool call, we always want to feed the result back into the agent node.</p>



<p>Then we add a conditional edge from the <code>agent</code> node. We pass in our <code>continue_or_end_test</code> function that will determine where this edge will lead. If the function returns <code>"continue"</code> we will go to the <code>tool_executor</code> node, and if it returns <code>"END"</code> we will go to the <code>END</code> node. The <code>END</code> node is a special pre-built node that was part of our imports when we started this file.</p>
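<p>To see how a fixed edge and a conditional edge work together, here is a toy pure-Python loop that mimics the wiring described above. It is only a sketch of the control flow, not how LangGraph is implemented; the <code>END</code> sentinel, the node functions, and <code>run</code> are hypothetical stand-ins:</p>

```python
from typing import Callable

END = "__end__"  # hypothetical sentinel used only in this sketch


def agent(state: dict) -> dict:
    # Pretend the agent asks for one tool call and then finishes.
    if state["intermediate_steps"]:
        return {"agent_outcome": "finish"}
    return {"agent_outcome": "call_tool"}


def tool_executor(state: dict) -> dict:
    return {"intermediate_steps": state["intermediate_steps"] + ["tool result"]}


def route(state: dict) -> str:
    return "END" if state["agent_outcome"] == "finish" else "continue"


NODES: dict[str, Callable[[dict], dict]] = {"agent": agent, "tool_executor": tool_executor}
EDGES = {"tool_executor": "agent"}  # fixed edge: always go back to the agent
CONDITIONAL = {"agent": (route, {"continue": "tool_executor", "END": END})}


def run(entry: str, state: dict) -> tuple[dict, list[str]]:
    """Walk the graph from the entry node until the END sentinel is reached."""
    visited, current = [], entry
    while current != END:
        visited.append(current)
        state = {**state, **NODES[current](state)}
        if current in CONDITIONAL:
            test, mapping = CONDITIONAL[current]
            current = mapping[test(state)]  # the test's return value picks the edge
        else:
            current = EDGES[current]
    return state, visited


final_state, path = run("agent", {"agent_outcome": None, "intermediate_steps": []})
print(path)
```

<p>The string returned by the test function is never a node name itself; it is just a key into the mapping, which is why we can freely choose labels like <code>"continue"</code> and <code>"END"</code>.</p>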



<p>Our simple graph in visual form now looks like this:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-4-1024x416.png" alt="" class="wp-image-4366"/></figure>



<p>Now that we have our graph defined, we need to take the final step which is to <code>compile</code> the graph before we can use it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">weather_app = workflow.compile()</pre>



<h3 class="wp-block-heading">Testing our Graph</h3>



<p>Now let&#8217;s whip up a quick function to test our graph:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def call_weather_app(query: str):
    inputs = {"input": query, "chat_history": []}
    output = weather_app.invoke(inputs)
    result = output.get("agent_outcome").return_values["output"]  # type: ignore
    steps = output.get("intermediate_steps")

    print(f"{Fore.BLUE}Result: {result}{Style.RESET_ALL}")
    print(f"{Fore.YELLOW}Steps: {steps}{Style.RESET_ALL}")

    return result</pre>



<p>The function will take a string query. As input, we need to define the <code>input</code> key with the query and an empty <code>chat_history</code> list as we don&#8217;t have a previous history for now. We then call <code>invoke</code> on the <code>weather_app</code> graph object and catch the output in a variable named <code>output</code>. The <code>agent_outcome</code> will have an <code>AgentFinish</code> which has the <code>return_values</code> attribute that holds the final answer as we discussed.</p>



<p><code># type: ignore</code> is just for the type checker here as it doesn&#8217;t know that <code>agent_outcome</code> will always be an <code>AgentFinish</code> object and I don&#8217;t want to go too far into type hinting in this tutorial. If you don&#8217;t use type checking you won&#8217;t need the comment. We also extract the <code>intermediate_steps</code> list from the output into a variable named <code>steps</code>.</p>
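<p>If you would rather avoid the ignore comment, one possible alternative is an explicit <code>isinstance</code> check, which lets a type checker narrow the type for you. The sketch below uses a hypothetical <code>FakeFinish</code> stand-in instead of the real <code>AgentFinish</code> so that it runs on its own; <code>extract_result</code> is likewise an illustrative name:</p>

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FakeFinish:
    """Hypothetical stand-in for AgentFinish with a return_values dict."""

    return_values: dict[str, Any] = field(default_factory=dict)


def extract_result(output: dict[str, Any]) -> Optional[str]:
    """Return the final answer, or None if the outcome is not a finish object."""
    outcome = output.get("agent_outcome")
    # After this check a type checker knows outcome is a FakeFinish,
    # so accessing return_values needs no ignore comment.
    if isinstance(outcome, FakeFinish):
        return outcome.return_values.get("output")
    return None


print(extract_result({"agent_outcome": FakeFinish({"output": "Sunny"})}))
```

<p>In the tutorial code you would check against <code>AgentFinish</code> instead. The trade-off is a few extra lines in exchange for a graceful <code>None</code> if the graph ever ends in an unexpected state.</p>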



<p>When we started the file we imported <code>Fore</code> and <code>Style</code> from the <code>colorama</code> library. This library has already been installed as a dependency of something else, so we didn&#8217;t have to install it. The <code>Fore.BLUE</code> sets the text foreground color to blue and the <code>Style.RESET_ALL</code> resets the color back to the default, repeating the pattern on the next line with yellow for easy readability.</p>
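<p>Under the hood these constants are just ANSI escape sequences. The sketch below reproduces the same effect with plain strings, in case you are curious or do not have <code>colorama</code> available; the <code>colorize</code> helper is a hypothetical name of my own:</p>

```python
BLUE = "\x1b[34m"  # the string behind colorama's Fore.BLUE
YELLOW = "\x1b[33m"  # the string behind colorama's Fore.YELLOW
RESET_ALL = "\x1b[0m"  # the string behind colorama's Style.RESET_ALL


def colorize(text: str, color: str) -> str:
    """Wrap text in a color code and reset afterwards so later output is unaffected."""
    return f"{color}{text}{RESET_ALL}"


print(colorize("Result: sunny", BLUE))
print(colorize("Steps: []", YELLOW))
```

<p>Forgetting the reset at the end is the classic mistake here, as every line printed afterwards would keep the last color.</p>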



<p>Now we can test our graph by calling the function with a query:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">call_weather_app("What is the weather in New York?")</pre>



<p>Go ahead and run this and you should see the final answer in blue:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Result: The current weather in New York is sunny with a temperature of 35.1°F (1.7°C). The wind is coming from the north at 11.2 km/h. The humidity is at 52%, and
the visibility is 16.0 km.
Steps: All the steps here in yellow...</pre>



<p>Good! That worked. The steps are a bit hard to read, but that is what we have LangSmith for. Head over to https://smith.langchain.com/ and check out your trace under the project name of <code>LangGraph Basics</code>. Open the trace named <code>LangGraph</code>; the one named <code>RunnableSequence</code> is from the partial test we ran before we built our graph:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-5-1024x772.png" alt="" class="wp-image-4367"/></figure>



<p>We can see that the graph started with our <code>agent</code>, then went to the <code>tool_executor</code>, back to the agent, and then ended. Click on any of the steps to see more detail. Nice and readable right?</p>



<h3 class="wp-block-heading">Something a bit cooler!</h3>



<p>So let&#8217;s give our simple graph test here a bit of a bigger challenge! Comment out the old query and let&#8217;s ask something a bit harder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># call_weather_app("What is the weather in New York?")

call_weather_app("Give me a visual image displaying the current weather in Seoul, South Korea.")</pre>



<p>Let&#8217;s run this and see what we get (it should auto-save an image in the project&#8217;s <code>images</code> folder):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Result: Here is the visual image displaying the current weather in Seoul, South Korea:

![Seoul, South Korea Weather](c:\Coding_Vault\FINX_LANGGRAPH\images\152cf0e0-c50e-483b-be63-50ef40ea3255.png)</pre>



<figure class="wp-block-image size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/152cf0e0-c50e-483b-be63-50ef40ea3255.png" alt="" class="wp-image-4368"/></figure>



<p>That&#8217;s pretty good! It has the temperature and the rain. I can confirm that it is currently dark and rainy over here and this also corresponds to the weather data the API sent back. Pretty dang cool right!?</p>



<p>If we look at the LangSmith trace we&#8217;ll see exactly what we expect:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-6-1024x772.png" alt="" class="wp-image-4369"/></figure>



<p>The agent calls the weather function, the result comes back to the agent, which then calls the image function, and the graph ends by giving us the image. I&#8217;ll leave you to click on any of the steps if you want to see the inputs and outputs at each step.</p>



<p>Of course, we could put the request for a visual image into the prompt so the user doesn&#8217;t have to type it, and we could improve on this in many other ways, like directly displaying the image to the end user. That is not the point here, though; this is just a simple demonstration of how the edges and nodes come together to create a simple graph.</p>



<p>In the next part we&#8217;ll take this up a step. Where we basically have a single agent now, we&#8217;ll look at having a whole team of agents working together! I&#8217;ll see you in the next part!</p>



<p>P.S. I generated another one just for fun and it&#8217;s pretty good:</p>



<figure class="wp-block-image size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/8bd5c7bc-a13f-4ae0-b1eb-9ee853c1b2c7.png" alt="" class="wp-image-4370"/></figure>



<h2 class="wp-block-heading">Multi-Agent LangGraph Teams Preparation</h2>



<p>Hi and welcome back to part 4 of this tutorial series where we&#8217;ll once again be taking it up a step. We&#8217;ll basically compress the Agent and the Executor into a single node and then have multiple of these &#8216;agent and executor&#8217; nodes inside of a team working together. First, we&#8217;ll cover the basic idea and do some short work to prepare the extra functions we will need, and then we&#8217;ll continue into the next part where we&#8217;ll put it all together into a multi-agent team that does the work for us while we sit back and relax!</p>



<h3 class="wp-block-heading">Advantages of multi-agent teams</h3>



<p>So why is this multi-agent thing useful in the first place? We can simply give one agent multiple tools right? Well, up to a point. If you give a single agent a prompt to first do thing <code>A</code> by calling <code>function_a</code> and then do thing <code>B</code> by calling <code>function_b</code> followed by either <code>function_c</code> or <code>function_d</code> depending on the output of <code>function_b</code> then the prompt of this agent is going to become a mess and it will also be fairly unreliable. The main advantages of multi-agent teams for more complex setups are:</p>



<ul class="wp-block-list">
<li><strong>Grouping responsibilities</strong> gives better results as agents will tend to perform better when they have a more focused task rather than a dozen tools and responsibilities to choose from.</li>



<li><strong>Separate prompts</strong> will give better results as each prompt can have its own examples of exactly what we want it to do and how. We can even have a specific agent run on a fine-tuned version of ChatGPT that is specifically trained and optimized for that node&#8217;s task.</li>



<li><strong>Easier development</strong> as you can work on, test, and evaluate each agent in isolation without it being connected to and breaking stuff elsewhere in the chain when you make improvements. It&#8217;s also easier to conceptually wrap your brain around the system as a whole.</li>
</ul>



<p>There are many possible slight variations for how this could be implemented. You could have a shared <code>scratchpad</code> for example so that all of the agents can see what thought processes and work the other agents have done. The downside is that this is very verbose though and the amount of information exchanged may be pointlessly large.</p>



<p>Alternatively, you could have them be isolated as single LLM calls without a strong interconnection that basically operate independently but they are merely strung together in a chain. This may be a bit too isolated though.</p>



<p>The example we&#8217;ll be looking at here lies somewhere in the middle where we will have independent fully-fledged agents that have their own scratchpad and ability to call tools if needed but the result of each agent doing its independent work gets stored in a shared state object like we had in the previous part.</p>



<p>This will be supervised by a sort of &#8216;team supervisor&#8217; node we&#8217;ll call an &#8216;agent supervisor&#8217; that will use this overall state object with the work done so far to decide what happens next and who to call. The basic idea looks like this:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-8-1024x702.png" alt="" class="wp-image-4374"/></figure>



<p>The user sends a query to the Team Supervisor. The Team Supervisor has a team of agents and decides who it should call on next to complete some work; it can choose any of the agents at any point. Every agent points back to the Team Supervisor so that the Team Supervisor gets to decide again after each step which agent is next or if the work has been completed, in which case it will return to the end user.</p>



<p>Ours will look slightly different but we&#8217;ll build a diagram for it as we go along.</p>
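<p>Before we build the real thing, the supervisor pattern can be sketched in a few lines of plain Python. Everything here is a hypothetical illustration: the <code>researcher</code> and <code>writer</code> agents, the <code>FINISH</code> label, and the rule-based <code>supervisor</code> are my own placeholders, and a real supervisor would ask an LLM to make the decision:</p>

```python
from typing import Callable

FINISH = "FINISH"  # hypothetical label meaning "the work is complete"


def researcher(state: dict) -> dict:
    return {"notes": state["notes"] + ["research done"]}


def writer(state: dict) -> dict:
    return {"notes": state["notes"] + ["draft written"]}


AGENTS: dict[str, Callable[[dict], dict]] = {"researcher": researcher, "writer": writer}


def supervisor(state: dict) -> str:
    """Pick the next agent based on the shared state (rule-based placeholder)."""
    if "research done" not in state["notes"]:
        return "researcher"
    if "draft written" not in state["notes"]:
        return "writer"
    return FINISH


def run_team(query: str, max_steps: int = 10) -> dict:
    state = {"query": query, "notes": []}
    for _ in range(max_steps):  # guard against endless loops
        next_agent = supervisor(state)
        if next_agent == FINISH:
            break
        # Every agent reports back into the shared state, then the supervisor decides again.
        state = {**state, **AGENTS[next_agent](state)}
    return state


print(run_team("Plan a trip to Seoul")["notes"])
```

<p>Note the <code>max_steps</code> guard: since the supervisor is free to call any agent at any point, some upper bound on the number of steps is a sensible safety net.</p>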



<h3 class="wp-block-heading">Tavily API</h3>



<p>Before we jump in we&#8217;ll need to add another API key to our <code>.env</code> and <code>setup_environment.py</code> files. We will be using the Tavily API lightly during this part and again in the next part of the series. Go to https://app.tavily.com/ and sign up for a free API key.</p>



<p>Tavily is a search engine optimized for AI agents and we can use it to have an agent search the internet. One of the reasons I chose Tavily here is that LangChain comes with pre-built tools for Tavily that we can just import and use as is, allowing us to focus more on learning about LangGraph as we have one less tool to write. You can just use your Google account for quick and easy sign up and it will cost you nothing for the first 1000 or so queries which is way more than we&#8217;ll use. Get your API key and copy it to the clipboard. Then open your <code>.env</code> file and add it like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">OPENAI_API_KEY=your_api_key_here
LANGCHAIN_API_KEY=your_api_key_here
WEATHER_API_KEY=your_api_key_here
TAVILY_API_KEY=your_api_key_here</pre>



<p>Make sure not to use any spaces or quotation marks, as usual. Then go ahead and save and close the <code>.env</code> file. Now open the <code>setup_environment.py</code> file and add a single line to load the <code>TAVILY_API_KEY</code> into an environment variable like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os
from datetime import date

from decouple import config


def set_environment_variables(project_name: str = "") -> None:
    if not project_name:
        project_name = f"Test_{date.today()}"

    os.environ["OPENAI_API_KEY"] = str(config("OPENAI_API_KEY"))

    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = str(config("LANGCHAIN_API_KEY"))
    os.environ["LANGCHAIN_PROJECT"] = project_name

    ##### Add only this line #####
    os.environ["TAVILY_API_KEY"] = str(config("TAVILY_API_KEY"))
    ##############################

    print("API Keys loaded and tracing set with project name: ", project_name)</pre>



<p>Now save and close the <code>setup_environment.py</code> file.</p>
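<p>As an optional sanity check, you could verify that the keys actually ended up in the environment before running anything expensive. This is a small sketch of my own and not part of the course files; <code>missing_keys</code> and <code>REQUIRED_KEYS</code> are hypothetical names:</p>

```python
import os
from typing import Mapping, Optional

REQUIRED_KEYS = ["OPENAI_API_KEY", "LANGCHAIN_API_KEY", "TAVILY_API_KEY"]


def missing_keys(
    required: list[str], environ: Optional[Mapping[str, str]] = None
) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [key for key in required if not env.get(key)]


missing = missing_keys(REQUIRED_KEYS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

<p>If you call this after <code>set_environment_variables</code> and it prints nothing, all three keys are in place.</p>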



<h3 class="wp-block-heading">Prep for our multi-agent team</h3>



<p>For this example over the next two parts, we will be creating a multi-agent team that will generate travel itineraries for us in PDF format, with us simply inputting a query and getting a fully formed PDF travel itinerary out the other end including an image. We will have three different tools that we will need for the overall setup:</p>



<ol class="wp-block-list">
<li><strong>An image generator</strong>: We already made one in the last part, so we can just import and reuse it, which is one of the nice things about LangChain tools.</li>



<li><strong>An internet search tool</strong>: In case the agent wants to search for more information. LangChain comes with some pre-built tools one of which is for Tavily Search, which is why we got the API key. We can just use this prebuilt here to save some time.</li>



<li><strong>A PDF generator</strong>: We will need a tool for our agents to be able to write PDF files and save them to disk. We will have to write this one ourselves before we can get started on our travel itinerary multi-agent team setup.</li>
</ol>



<h3 class="wp-block-heading">PDF writing tool</h3>



<p>So let&#8217;s write up a quick PDF writing tool for our agents before we move on. Inside your <code>tools</code> folder make a new file named <code>pdf.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📂 FINX_LANGGRAPH
    📂 images
    📂 tools
        📄 __init__.py
        📄 image.py
        📄 pdf.py    ✨New file
        📄 weather.py
    📄 .env
    📄 langchain_basics.py
    📄 Pipfile
    📄 Pipfile.lock
    📄 setup_environment.py
    📄 simple_langgraph.py</pre>



<p>Inside this new <code>pdf.py</code> file get started with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os
import uuid
from pathlib import Path

import pdfkit
from langchain.tools import tool
from markdown import markdown
from pydantic import BaseModel, Field</pre>



<p>We import <code>os</code> to work with the operating system, <code>uuid</code> to generate unique filenames again, and <code>Path</code> to create a path towards an output folder to save our PDF files. The <code>tool</code> decorator from LangChain is the same one that we used last time, and the <code>BaseModel</code> and <code>Field</code> imports from <code>pydantic</code> are for defining the input arguments interface for our function, just like we did before.</p>



<p>The <code>pdfkit</code> library is going to let us save HTML to real PDF files, but the downside is that it needs HTML as input to do the conversion. HTML is more complex for our LLM agents to write, which introduces more variables, and I want to keep this example simple, so we will be using the <code>markdown</code> library to convert markdown to HTML for us. That way we can just tell our agents to write in markdown formatting (which is very simple) and our function will do <code>markdown</code> -&gt; <code>HTML</code> -&gt; <code>PDF</code>.</p>



<p>Both <code>pdfkit</code> and <code>markdown</code> are not installed by default so we will have to install them in our virtual environment. Open your terminal and run:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install markdown==3.6 pdfkit==1.0.0</pre>



<p>That will take care of the basic Python library installs, but <code>pdfkit</code> needs an additional step, as it actually uses something called <code>wkhtmltopdf</code> under the hood to achieve the conversion. Head over to https://wkhtmltopdf.org/downloads.html and click the appropriate download for your platform. I am on Windows so I&#8217;ll select the Windows 64-bit download option:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-1-1-1024x726.png" alt="" class="wp-image-4375"/></figure>



<p>Run the installer and select an install location. I&#8217;ll simply use the default <code>C:\Program Files\wkhtmltopdf</code> myself. Whichever install location you choose, take note of it and copy it somewhere as you will need it in a moment:</p>



<figure class="wp-block-image size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-2-1.png" alt="" class="wp-image-4376"/></figure>



<p>Let that run the install and when it&#8217;s done we can get back to the code! Below our imports in <code>pdf.py</code> we&#8217;ll add some quick setup:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">PATH_WKHTMLTOPDF = r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe"
PDFKIT_CONFIG = pdfkit.configuration(wkhtmltopdf=PATH_WKHTMLTOPDF)

OUTPUT_DIRECTORY = Path(__file__).parent.parent / "output"</pre>



<p>First of all, we do some setup for <code>pdfkit</code> by pointing it to the location of the <code>wkhtmltopdf</code> executable. This is the path I used on my Windows machine. You have to adjust this path to wherever you installed <code>wkhtmltopdf</code> on your machine, so be sure that you use the correct path for you! After defining the path we can simply call <code>pdfkit.configuration</code> with the <code>wkhtmltopdf</code> argument set to the path we just defined. Later in the code, when we actually write the PDF files, we can pass in this <code>PDFKIT_CONFIG</code> as an argument to use this configuration.</p>
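<p>If you are not on Windows, or are unsure where the executable ended up, one possible approach is to look it up on your <code>PATH</code> first and only fall back to a hard-coded location. This is a sketch under my own assumptions and not part of the course code; <code>resolve_executable</code> is a hypothetical helper:</p>

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_executable(name: str, fallback: str) -> Optional[str]:
    """Prefer an executable found on PATH, else a known install location, else None."""
    found = shutil.which(name)
    if found:
        return found
    if Path(fallback).is_file():
        return fallback
    return None


# The fallback is the default Windows location mentioned above; adjust it as needed.
wkhtmltopdf_path = resolve_executable(
    "wkhtmltopdf", r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe"
)
print(wkhtmltopdf_path or "wkhtmltopdf not found; check your install location")
```

<p>The resulting path could then be used in place of the hard-coded <code>PATH_WKHTMLTOPDF</code> value, which keeps the same file working across different machines.</p>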



<p>We then use the same trick as last time to get a path to a folder named <code>output</code> in our project root. This is where we will save our PDF files, but the folder doesn&#8217;t exist yet. Make sure you create it right now or the code will fail when it tries to save the PDF files later and you&#8217;ll be stuck debugging why it doesn&#8217;t work:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📂 FINX_LANGGRAPH
    📂 images
    📂 output    ✨New empty folder
    📂 tools
        📄 __init__.py
        📄 image.py
        📄 pdf.py
        📄 weather.py
    📄 .env
    📄 langchain_basics.py
    📄 Pipfile
    📄 Pipfile.lock
    📄 setup_environment.py
    📄 simple_langgraph.py</pre>
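<p>As an alternative to creating the folder by hand, you could let the code create it. This is only a sketch: the helper name <code>ensure_directory</code> is an assumption of mine and not part of the tutorial code, while <code>Path.mkdir</code> with <code>parents</code> and <code>exist_ok</code> is standard library.</p>

```python
from pathlib import Path


def ensure_directory(directory: Path) -> Path:
    """Create `directory` (and any missing parents) if it does not exist yet."""
    # exist_ok=True makes this safe to call on every run.
    directory.mkdir(parents=True, exist_ok=True)
    return directory
```

<p>In <code>pdf.py</code> this could be called once as <code>ensure_directory(OUTPUT_DIRECTORY)</code> right after the constant is defined.</p>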



<p>Good! Now back to our <code>pdf.py</code> file. Below the setup we&#8217;ll define our input arguments interface just like we did with our other tools so far:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class MarkdownToPDFInput(BaseModel):
    markdown_text: str = Field(
        description="Markdown text to convert to PDF, provided in valid markdown format."
    )</pre>



<p>We simply define the input arguments as a single string that has to be in a valid markdown format. Once again, make sure your description is a good one: the LLM will use it, so it is not just for our own reference.</p>



<h3 class="wp-block-heading">HTML generation</h3>



<p>Let&#8217;s make the problem we need to solve smaller by first writing a separate function to generate the HTML from the markdown text so we can just feed HTML into <code>pdfkit</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def generate_html_text(markdown_text: str) -> str:
    """Convert markdown text to HTML text."""
    markdown_text = markdown_text.replace("file:///", "").replace("file://", "")
    html_text = markdown(markdown_text)
    html_text = f"""
    &lt;html>
    &lt;head>
        &lt;style>
            @import url('https://fonts.googleapis.com/css2?family=Roboto&amp;display=swap');
            body {{
                font-family: 'Roboto', sans-serif;
                line-height: 150%;
            }}
        &lt;/style>
    &lt;/head>
    &lt;body>
    {html_text}
    &lt;/body>
    &lt;/html>
    """
    return html_text</pre>



<p>This function takes a <code>markdown_text</code> as string input. First, we&#8217;ll search the markdown text for any <code>file:///</code> or <code>file://</code> protocol prefixes, which the model sometimes adds when it inserts our image path in markdown. We replace them with an empty string <code>""</code> because they would cause our image to not show up in the final generated PDF file. This kind of thing is something you just discover during your development work.</p>
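<p>To see why the longer prefix is replaced first, here is a small standalone sketch. The function names and the sample image path are made up for this illustration.</p>

```python
def strip_file_protocol(markdown_text: str) -> str:
    """Remove file:/// and file:// prefixes, longest prefix first."""
    return markdown_text.replace("file:///", "").replace("file://", "")


def strip_in_wrong_order(markdown_text: str) -> str:
    """The same replacements in the opposite order, shown only for contrast."""
    return markdown_text.replace("file://", "").replace("file:///", "")


sample = "![Paris](file:///C:/images/paris.png)"
print(strip_file_protocol(sample))   # ![Paris](C:/images/paris.png)
print(strip_in_wrong_order(sample))  # ![Paris](/C:/images/paris.png)
```

<p>With the order reversed, the shorter prefix matches first and leaves a stray leading slash in front of the drive letter.</p>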



<p>Now we can simply call the <code>markdown</code> function we imported on our markdown text to get valid HTML. As I felt like doing some light styling, I then wrapped the <code>html_text</code> in the basic HTML tags <code>html</code>, <code>head</code>, and <code>body</code>. In the <code>head</code> we include a <code>style</code> tag, which lets us load the <code>Roboto</code> font from Google using the CSS <code>@import</code> rule, set it as the font, and give our document some extra line height to make the text more readable. Because this is an f-string, the literal braces of the CSS rule are doubled as <code>{{</code> and <code>}}</code>. The final <code>html_text</code> that is returned has the converted markdown in the <code>body</code> portion. If you happen to be less familiar with HTML, just copy what I have; it&#8217;s not really important for the course.</p>
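<p>If mixing CSS with an f-string looks odd, this stripped-down sketch shows the brace handling in isolation. The function name <code>wrap_in_html</code> is hypothetical and the page is shortened to a single CSS rule.</p>

```python
def wrap_in_html(body: str) -> str:
    """Wrap an HTML fragment in a minimal page with one CSS rule."""
    # {{ and }} produce literal braces in an f-string; {body} is substituted.
    return (
        f"<html><head><style>body {{ line-height: 150%; }}</style></head>"
        f"<body>{body}</body></html>"
    )


print(wrap_in_html("<p>Hi</p>"))
```

<p>Single braces around the CSS rule would make Python try to evaluate <code>line-height: 150%;</code> as an expression, which fails.</p>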



<h3 class="wp-block-heading">Finishing up the tool</h3>



<p>Now it&#8217;s time to define the actual tool itself. Continue below:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">@tool("markdown_to_pdf_file", args_schema=MarkdownToPDFInput)
def markdown_to_pdf_file(markdown_text: str) -> str:
    """Convert markdown text to a PDF file. Takes valid markdown as a string as input and will return a string file-path to the generated PDF."""
    html_text = generate_html_text(markdown_text)
    unique_id: uuid.UUID = uuid.uuid4()
    pdf_path = OUTPUT_DIRECTORY / f"{unique_id}.pdf"

    options = {
        "no-stop-slow-scripts": True,
        "print-media-type": True,
        "encoding": "UTF-8",
        "enable-local-file-access": "",
    }

    pdfkit.from_string(
        html_text, str(pdf_path), configuration=PDFKIT_CONFIG, options=options
    )

    if os.path.exists(pdf_path):
        return str(pdf_path)
    else:
        return "Could not generate PDF, please check your input and try again."</pre>



<p>We start with the <code>@tool</code> decorator, once again providing a string name for our function and then the input argument interface we defined. The function itself takes a <code>markdown_text</code> as input and returns a string file path to the generated PDF file. We have a docstring that explains what the function does and what it expects as input as the LLM is going to use this.</p>



<p>We then call our <code>generate_html_text</code> function on the <code>markdown_text</code> to get the <code>html_text</code> we need and generate a unique ID for the PDF file name, creating a path to the PDF file in our <code>OUTPUT_DIRECTORY</code> folder. We then define some options for <code>pdfkit</code> to use when generating the PDF. These are just some basic options that I found to work OK for our example. We don&#8217;t want to get sidetracked by spending too much time on this, as it is not the focus of this tutorial.</p>



<p>Finally, we call <code>pdfkit.from_string</code> with the <code>html_text</code>, the path to the PDF file in <code>str</code> format instead of a <code>Path</code> object, the <code>configuration</code> we set up atop this file, and the <code>options</code> we just defined. If the PDF file is successfully generated, which we can check with the <code>os.path.exists</code> function to see if the file exists or not, we return the path to the PDF file. If it does not exist we return a message saying that the PDF could not be generated. We purposely do not raise an error but send a string response as the agent can receive this, try to find the error, fix it, and try again.</p>
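<p>The idea of handing the agent a readable message instead of raising can be sketched without <code>pdfkit</code> at all. Everything below is a stand-in: <code>save_with_feedback</code>, <code>plain_text_writer</code>, and <code>broken_writer</code> are hypothetical names, and the <code>try</code>/<code>except</code> is an extra safeguard that is not in the tutorial code above.</p>

```python
import tempfile
import uuid
from pathlib import Path
from typing import Callable


def save_with_feedback(
    content: str, output_dir: Path, writer: Callable[[str, Path], None]
) -> str:
    """Write content to a uniquely named file and report the outcome as a string."""
    target = output_dir / f"{uuid.uuid4()}.txt"
    try:
        writer(content, target)
    except OSError as error:
        # Hand the problem back as text so the caller (an agent) can react to it.
        return f"Could not write file: {error}"
    if target.exists():
        return str(target)
    return "Could not generate file, please check your input and try again."


def plain_text_writer(content: str, target: Path) -> None:
    """Hypothetical stand-in for pdfkit.from_string."""
    target.write_text(content, encoding="utf-8")


def broken_writer(content: str, target: Path) -> None:
    """Hypothetical writer that silently produces nothing."""


with tempfile.TemporaryDirectory() as tmp:
    print(save_with_feedback("hello", Path(tmp), plain_text_writer))
    print(save_with_feedback("hello", Path(tmp), broken_writer))
```

<p>The first call prints a file path and the second prints the fallback message, which is the kind of string the agent would receive and could act on.</p>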



<h3 class="wp-block-heading">PDF tool test run</h3>



<p>Now let&#8217;s add a quick test at the bottom of our file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">markdown_dummy_text = """
# Title
This is a test of the markdown to PDF function.
## Subtitle
This is a test of the markdown to PDF function.
### Sub-subtitle
This is a test of the markdown to PDF function. This is a paragraph with random text in it nunc nunc tincidunt nunc, nec.
S'il vous plaît.
"""

if __name__ == "__main__":
    print(markdown_to_pdf_file(markdown_dummy_text))</pre>



<p>There are a couple of headings here and some French with non-standard characters like in &#8220;plaît&#8221; to make sure it also works with special characters. Now go ahead and run your file (Reminder: make sure you created the <code>output</code> folder!). Close the printer message popup if you get one; we&#8217;ll just ignore it for now. You should see a new PDF file in your <code>output</code> folder. Go ahead and open it:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-3-1-1024x771.png" alt="" class="wp-image-4377"/></figure>



<p>It&#8217;s not perfect by any means, but it works well enough for our LangGraph example purposes. As LangGraph is the focus here we will not spend any more time perfecting the details of this particular tool.</p>



<p>One last step remains: fixing the imports. Open up the <code>tools/__init__.py</code> file and change the code to:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from .image import generate_image
from .weather import get_weather
from .pdf import markdown_to_pdf_file</pre>



<p>Save and close that so we can have the nicer imports in our main code. That&#8217;s it for the preparation. This part is slightly shorter by design, as the next one will be extra long. It is finally time to set up and run our multi-agent team, so let&#8217;s get to the fun stuff. I&#8217;ll see you there! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<h2 class="wp-block-heading">Multi-Agent LangGraph: Setting Up Our Multi-Agent Team</h2>



<p>Welcome back to part 5, where we&#8217;ll set up our multi-agent team. So buckle up and let&#8217;s jump right in. Create a new file named <code>multi_agent.py</code> in your project root:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📂 FINX_LANGGRAPH
    📂 images
    📂 output
    📂 tools
        📄 __init__.py
        📄 image.py
        📄 pdf.py
        📄 weather.py
    📄 .env
    📄 langchain_basics.py
    📄 multi_agent.py    ✨New file
    📄 Pipfile
    📄 Pipfile.lock
    📄 setup_environment.py
    📄 simple_langgraph.py</pre>



<p>Open up the <code>multi_agent.py</code> file and start with the imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import functools
import operator
from typing import Annotated, Sequence, TypedDict

from colorama import Fore, Style
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain.output_parsers.openai_functions import JsonOutputFunctionsParser
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.messages import BaseMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_openai import ChatOpenAI
from langgraph.graph import END, StateGraph

from setup_environment import set_environment_variables
from tools import generate_image, markdown_to_pdf_file</pre>



<p>We have a lot of imports again, many of which will be familiar. We import our own two tools from the <code>tools</code> folder and also the <code>TavilySearchResults</code> from the <code>langchain_community</code> tools. There are some new imports like <code>functools</code> and the <code>AgentExecutor</code> but we&#8217;ll cover each one and how they are used as we go along.</p>



<h3 class="wp-block-heading">Environment variables and constants</h3>



<p>Let&#8217;s load up our environment variables and create a bunch of constants we&#8217;ll need:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">set_environment_variables("Multi_Agent_Team")

TRAVEL_AGENT_NAME = "travel_agent"
LANGUAGE_ASSISTANT_NAME = "language_assistant"
VISUALIZER_NAME = "visualizer"
DESIGNER_NAME = "designer"

TEAM_SUPERVISOR_NAME = "team_supervisor"
MEMBERS = [TRAVEL_AGENT_NAME, LANGUAGE_ASSISTANT_NAME, VISUALIZER_NAME]
OPTIONS = ["FINISH"] + MEMBERS</pre>



<p>We load our environment variables and set the project name to <code>Multi_Agent_Team</code>. We then define a bunch of constants for the names of our agents and the team supervisor. These are just strings, but we&#8217;ll have to type each of them multiple times and it would be very annoying to change or mistype one, so storing them in a single place up top is the way to go.</p>



<p>Note that we have the <code>travel_agent</code>, <code>language_assistant</code>, and <code>visualizer</code> inside a list called <code>MEMBERS</code>, while the <code>designer</code> and <code>team_supervisor</code> are kept outside of it. We also imported the <code>END</code> node we used last time. That leaves us with a situation like this:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-5-1-1024x842.png" alt="" class="wp-image-4381"/></figure>



<p>The list named <code>OPTIONS</code> holds the options the <code>team_supervisor</code> can choose from at each step along the way, so it has all three members of the team plus the <code>"FINISH"</code> option to indicate this particular team has finished its work.</p>



<p>Add two final constants below:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">TAVILY_TOOL = TavilySearchResults()
LLM = ChatOpenAI(model="gpt-3.5-turbo-0125")</pre>



<p>We have the <code>TAVILY_TOOL</code> which is the Tavily search tool we imported from the <code>langchain_community</code> tools and the <code>LLM</code> which is <code>gpt-3.5-turbo-0125</code> here but feel free to use GPT-4-turbo instead if you want.</p>



<h3 class="wp-block-heading">Agent creator function</h3>



<p>We&#8217;re going to be creating a lot of agents here, so let&#8217;s create a function to handle the repetitive work of creating an agent:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def create_agent(llm: BaseChatModel, tools: list, system_prompt: str):
    prompt_template = ChatPromptTemplate.from_messages(
        [
            ("system", system_prompt),
            MessagesPlaceholder(variable_name="messages"),
            MessagesPlaceholder(variable_name="agent_scratchpad"),
        ]
    )
    agent = create_openai_tools_agent(llm, tools, prompt_template)
    agent_executor = AgentExecutor(agent=agent, tools=tools)  # type: ignore
    return agent_executor</pre>



<p>We define a function named <code>create_agent</code> which takes an <code>llm</code> of the type <code>BaseChatModel</code>. This is just a type hint but it was part of our imports for clarity. <code>BaseChatModel</code> is the base class for all chat models in LangChain, including the <code>ChatOpenAI</code> variation we use here. You can pass any LLM you want and have different nodes of the same graph run on completely different LLMs. The other arguments are a list of <code>tools</code> and a <code>system_prompt</code> string.</p>



<p>We then declare a <code>prompt_template</code> using the <code>ChatPromptTemplate.from_messages</code> method that we used all the way back in part 1, but this time we use multiple messages. We have a <code>"system"</code> message that is the system prompt string passed into the function and then we have two placeholders for the <code>messages</code> and <code>agent_scratchpad</code> variables that we have seen before. The <code>MessagesPlaceholder</code>, as the name suggests, is just a placeholder for both of these so we can insert them later using the names we have defined under <code>variable_name</code>.</p>



<p>We then use the <code>create_openai_tools_agent</code> just like we did in part 3, but this time we go one step further and create an <code>AgentExecutor</code> in the step below. This <code>AgentExecutor</code> comes with LangChain and will basically combine the agent and the executor nodes we had in the previous part into a single node, handling the function call logic we did in the previous part for us! It takes an <code>agent</code> and a list of <code>tools</code> for that agent to use as arguments.</p>



<p>The <code># type: ignore</code> comment is there in case you use a type checker, as it will complain here. This series is not about type checking and it&#8217;s no big deal here, so we won&#8217;t go any deeper into it. We then return the <code>agent_executor</code> we created.</p>



<h3 class="wp-block-heading">Agent state object</h3>



<p>Now let&#8217;s declare the state object that we will be passing around in this particular graph:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]
    next: str</pre>



<p>This time we need two entries. The first is <code>messages</code>, a sequence of <code>BaseMessage</code> objects, which again are just messages like <code>("human", "Hello, how are you doing?")</code> or <code>("ai", "I'm doing well, thanks!")</code>. We define it as a <code>Sequence</code>, so a list or a tuple of these messages, and the <code>operator.add</code> again indicates that we will add to this sequence of messages with each step. <code>Annotated</code> is only there because it allows us to attach the <code>operator.add</code> annotation to the type.</p>



<p>The second entry is <code>next</code>, a string holding the name of the next agent to call. The <code>team_supervisor</code> decides which agent this is based on the state object it receives, and we can then use this field to see which agent to route to next. This field can just be overwritten as we don&#8217;t need the history, so a single string without any fancy annotations will do fine here.</p>
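<p>To make the difference between the two entries concrete, here is a simplified, standard-library-only sketch of how an annotated reducer could be applied when a node returns an update. This is my own illustration of the idea and not LangGraph&#8217;s actual implementation; <code>SketchState</code> and <code>apply_update</code> are hypothetical names, and plain strings stand in for <code>BaseMessage</code> objects.</p>

```python
import operator
from typing import Annotated, Sequence, TypedDict, get_type_hints


class SketchState(TypedDict):
    messages: Annotated[Sequence[str], operator.add]  # appended to
    next: str  # overwritten


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's update into the state, honoring annotated reducers."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] types carry their extra arguments in __metadata__.
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["Plan a trip"], "next": "travel_agent"}
update = {"messages": ["Here is an itinerary"], "next": "visualizer"}
print(apply_update(state, update))
```

<p>The <code>messages</code> list grows because <code>operator.add</code> concatenates lists, while <code>next</code> is simply replaced by the new value.</p>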



<h3 class="wp-block-heading">Agent node function</h3>



<p>Now let&#8217;s define a function that represents one of these agent nodes in general:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def agent_node(state, agent, name):
    result = agent.invoke(state)
    return {"messages": [HumanMessage(content=result["output"], name=name)]}</pre>



<p>The function takes the <code>state</code> object, an agent, and the string name for the agent (one of the constants we defined up top). We simply <code>invoke</code> the agent with the <code>state</code>. Then, in keeping with the promise we made in the <code>AgentState</code> object above, the node returns a <code>messages</code> entry with a message in it. We will simply use a <code>HumanMessage</code>, as it doesn&#8217;t really matter who the message comes from, and take its content from <code>result["output"]</code>, which is the output of the agent&#8217;s call.</p>
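<p>Here is a runnable sketch of the same node shape without any LangChain dependency. <code>EchoAgent</code> is a hypothetical stub that only mimics the <code>.invoke()</code> call, a plain dict stands in for <code>HumanMessage</code>, and binding the extra arguments with <code>functools.partial</code> is my assumption about one plausible use of the <code>functools</code> import from earlier.</p>

```python
import functools


class EchoAgent:
    """Hypothetical stand-in for an AgentExecutor; only mimics .invoke()."""

    def invoke(self, state: dict) -> dict:
        last_message = state["messages"][-1]
        return {"output": f"Echo: {last_message}"}


def agent_node(state: dict, agent, name: str) -> dict:
    result = agent.invoke(state)
    # A plain dict stands in for HumanMessage(content=..., name=...).
    return {"messages": [{"content": result["output"], "name": name}]}


# Pre-fill agent and name so the node only needs the state argument.
travel_agent_node = functools.partial(
    agent_node, agent=EchoAgent(), name="travel_agent"
)
print(travel_agent_node({"messages": ["Plan a trip to Paris"]}))
```

<p>Pre-filling the arguments this way leaves a callable that takes only the state, which is the shape a graph node function needs.</p>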



<h3 class="wp-block-heading">Team supervisor&#8217;s next member choice</h3>



<p>Next, we&#8217;re going to need a way to have the <code>team_supervisor</code> choose which agent to invoke next. The easiest way to do this reliably is to pretend this is a function that the team supervisor has to call for us. The only possible input arguments are the names of our agents, and we tell the <code>team_supervisor</code> that it must call <code>nonexistent_function(agent_name)</code> to invoke the agent.</p>



<p>This is a bit of a hack, but it makes it very easy for us to extract the <code>agent_name</code> consistently and see which agent node needs to run next. We will also include one extra option of <code>"FINISH"</code> so the <code>team_supervisor</code> can tell us when it&#8217;s done and needs to break out of the team. Doing this will also let us use the <code>JsonOutputFunctionsParser</code> later on in our code, as the function call will be sent in a correct JSON format, making the parsing of the output easier.</p>



<p>For this function that doesn&#8217;t actually exist, we&#8217;re going to define an old-school vanilla OpenAI function description that describes how the function works to the LLM team supervisor. Add the following variable:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">router_function_def = {
    "name": "route",
    "description": "Select the next role.",
    "parameters": {
        "title": "routeSchema",
        "type": "object",
        "properties": {
            "next": {
                "title": "next",
                "anyOf": [
                    {"enum": OPTIONS},
                ],
            }
        },
        "required": ["next"],
    },
}</pre>



<p>This is JSON Schema vocabulary, but it is quite readable. We define the name of the function as <code>route</code> and give it a description of what the function does. We then define the <code>parameters</code> that the function takes, giving the parameter object a title of <code>routeSchema</code> and defining that it is an <code>object</code>. Then we define the <code>properties</code> of this object, which is just a single property named <code>next</code>. This property has a title of <code>next</code> and the allowed values are <code>anyOf</code> the <code>enum</code> (list of permitted values) built from the <code>OPTIONS</code> we defined up top. We then define that the <code>next</code> property is <code>required</code>.</p>



<p>This JSON Schema style is what the OpenAI API normally uses for function/tool calls, but LangChain has done this under the hood for the functions we have used so far. Again, this function will not actually exist, but that doesn&#8217;t stop us from feeding it to the LLM and extracting the <code>next</code> property from the arguments the LLM provides for us.</p>
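<p>To make the extraction step concrete, here is a small sketch of what pulling <code>next</code> out of a function call&#8217;s arguments amounts to. It uses only the standard library; <code>parse_route</code> and the sample argument string are hypothetical, and in the tutorial code the parsing itself is handled by <code>JsonOutputFunctionsParser</code>.</p>

```python
import json

OPTIONS = ["FINISH", "travel_agent", "language_assistant", "visualizer"]


def parse_route(arguments_json: str) -> str:
    """Extract and validate `next` from a function call's JSON arguments."""
    arguments = json.loads(arguments_json)
    choice = arguments["next"]
    if choice not in OPTIONS:
        # The schema's enum should prevent this, but models can still slip.
        raise ValueError(f"Unexpected route: {choice!r}")
    return choice


print(parse_route('{"next": "visualizer"}'))  # visualizer
```

<p>The explicit membership check is an optional extra guard for the case where the model returns a name that is not in the list.</p>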



<h3 class="wp-block-heading">Team supervisor system prompt</h3>



<p>Now let&#8217;s create a secondary file to store our system prompt messages, as we&#8217;re going to be using quite a lot of them here. Create a new file named <code>multi_agent_prompts.py</code> in your project root:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📂 FINX_LANGGRAPH
    📂 images
    📂 output
    📂 tools
        📄 __init__.py
        📄 image.py
        📄 pdf.py
        📄 weather.py
    📄 .env
    📄 langchain_basics.py
    📄 multi_agent.py
    📄 multi_agent_prompts.py    ✨New file
    📄 Pipfile
    📄 Pipfile.lock
    📄 setup_environment.py
    📄 simple_langgraph.py</pre>



<p>We&#8217;ll use this file to store the prompt string variables for the system messages our agents will use. If you&#8217;re watching the video tutorial version of this, please note that there is a written blog version of this tutorial where you can copy these prompts so you don&#8217;t have to type them all over again, as we have a lot more of them coming. Let&#8217;s start with the team supervisor. Inside the <code>multi_agent_prompts.py</code> file add:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">TEAM_SUPERVISOR_SYSTEM_PROMPT = """
You are a supervisor tasked with managing a conversation between the following workers: {members}. Given the following user request, respond with the worker to act next. Each worker will perform a task and respond with their results and status. The end goal is to provide a good travel itinerary for the user, with things to see and do, practical tips on how to deal with language difficulties, and a nice visualization that goes with the travel plan (in the form of an image path, the visualizer will save the image for you and you only need the path).

Make sure you call on each team member ({members}) at least once. Do not call the visualizer again if you've already received an image file path. Do not call any team member a second time unless they didn't provide enough details or a valid response and you need them to redo their work. When finished, respond with FINISH, but before you do, make sure you have a travel itinerary, language tips for the location, and an image file-path. If you don't have all of these, call the appropriate team member to get the missing information.
"""</pre>



<p>So we have some basic instructions for the team supervisor on how to manage the team here. We have the placeholder <code>{members}</code> in there twice which will be replaced with the actual list of members. We tell it we want a travel itinerary with things to do and sightseeing, language tips, and a visualization for the itinerary. The prompt here is far from perfect and you can tweak it further if you like.</p>



<p>Save the <code>multi_agent_prompts.py</code> file and let&#8217;s get back to the <code>multi_agent.py</code> file. First of all, add an extra import up top with the other imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">#... all the other imports ...

from multi_agent_prompts import TEAM_SUPERVISOR_SYSTEM_PROMPT</pre>



<p>Note that we could just use <code>from multi_agent_prompts import *</code> as the <code>*</code> will simply import everything from the file, even the variables we add later, but this is a bad practice as it makes it hard to see where the variables come from and leads to namespace pollution. It&#8217;s better to explicitly define and keep track of what you&#8217;re importing or sooner or later you&#8217;re going to have multiple variables with the same name and you won&#8217;t know where they come from.</p>



<h3 class="wp-block-heading">Team supervisor prompt template</h3>



<p>Now scroll all the way back down past the <code>router_function_def</code> and add the following code to define our team supervisor&#8217;s prompt template manually as it will be different from all the other agents:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">team_supervisor_prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", TEAM_SUPERVISOR_SYSTEM_PROMPT),
        MessagesPlaceholder(variable_name="messages"),
        (
            "system",
            "Given the conversation above, who should act next?"
            " Or should we FINISH? Select one of: {options}",
        ),
    ]
).partial(options=", ".join(OPTIONS), members=", ".join(MEMBERS))</pre>



<p>We use the same <code>ChatPromptTemplate.from_messages</code> method we used before, but this time we have three messages. The first is the <code>TEAM_SUPERVISOR_SYSTEM_PROMPT</code> we defined in the <code>multi_agent_prompts.py</code> file. The second is a <code>MessagesPlaceholder</code> for the <code>messages</code> variable and the third is a short system message that reminds the team supervisor what its task is and what options it has available to choose from.</p>



<p>This team supervisor prompt template will need 3 variables to be filled in and used properly.</p>



<ul class="wp-block-list">
<li>The first is inside the <code>TEAM_SUPERVISOR_SYSTEM_PROMPT</code> where we used the <code>members</code> placeholder twice.</li>



<li>The second one is the <code>messages</code> for the <code>MessagesPlaceholder</code> in the middle.</li>



<li>The third is the <code>options</code> for the <code>options</code> placeholder in the last message.</li>
</ul>



<p>We have two of these available, namely the <code>options</code> and the <code>members</code>, but we don&#8217;t have the <code>messages</code> yet. The chained <code>.partial</code> method lets us fill in the two parts we already have and leave the <code>messages</code> part to be added later, so we can pass our <code>OPTIONS</code> to the <code>options</code> placeholder and our <code>MEMBERS</code> to the <code>members</code> placeholder ahead of time.</p>



<p>Note that we use the <code>join</code> method on the <code>OPTIONS</code> and <code>MEMBERS</code> lists to turn each of them into a single string with the items separated by a comma and a space, as the prompt sent to the LLM has to be plain text rather than a Python list.</p>
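<p>If the idea of partially filling in a template feels abstract, the following plain Python sketch shows the same pattern with <code>functools.partial</code> and <code>str.format</code>. This is only an analogy and not how LangChain implements <code>.partial</code>, and the member names and the shortened template are assumed values for illustration:</p>

```python
import functools

# Assumed values for illustration; use the constants from your own file.
MEMBERS = ["travel_agent", "language_assistant", "visualizer"]
OPTIONS = ["FINISH"] + MEMBERS

template = "Team: {members}\n{messages}\nSelect one of: {options}"

# Fill in what we already know and leave {messages} for later.
fill_messages = functools.partial(
    template.format,
    options=", ".join(OPTIONS),
    members=", ".join(MEMBERS),
)

# Later, at "runtime", only the missing piece has to be supplied.
prompt = fill_messages(messages="Human: I want to go to Paris for three days")
print(prompt)
```

<p>The last line of the printed prompt reads <code>Select one of: FINISH, travel_agent, language_assistant, visualizer</code>, which is the kind of plain text list the supervisor gets to choose from.</p>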



<h3 class="wp-block-heading">Team supervisor node</h3>



<p>So the team supervisor is basically going to act like a router between our agents, deciding who is up next. Remember in part 1 where we used <code>LCEL</code> with the <code>|</code> pipe operator to create chains by piping a prompt into an LLM and then into an output parser? These simple vanilla LangChain chains can also be used as nodes in LangGraph. As the team supervisor node is going to be special we will use our part 1 vanilla LangChain knowledge to simply chain it together manually:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">team_supervisor_chain = (
    team_supervisor_prompt_template
    | LLM.bind_functions(functions=[router_function_def], function_call="route")
    | JsonOutputFunctionsParser()
)</pre>



<p>So we simply define the <code>team_supervisor_chain</code> as the prompt template we just made for it, then we pipe that into the <code>LLM</code>, and pipe that into a <code>JsonOutputFunctionsParser</code>. As we&#8217;re using a function here we can use the JSON output parser to extract the <code>next</code> property from the arguments the LLM provides for us.</p>



<p>The LLM here uses the <code>bind_functions</code> method to bind the <code>router_function_def</code> JSON Schema we defined as the available <code>functions</code> for this LLM call, and by passing in the second optional argument <code>function_call="route"</code> we tell the LLM that it MUST call the <code>route</code> function we defined earlier, meaning we are actually forcing it to call this function and not do anything else as this is its only purpose. Remember we added an entry in the <code>AgentState</code> to store the <code>next</code> parameter.</p>
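<p>As a rough picture of what the parser step does for us, here is a small sketch with a hand-written stand-in for a function call. The dictionary below is made up for illustration and is not real model output, and the parsing shown is a simplification rather than the actual <code>JsonOutputFunctionsParser</code> source:</p>

```python
import json

# Hand-written stand-in for the function call a model might return;
# a real response comes from the API and is not shown here.
fake_function_call = {
    "name": "route",
    "arguments": '{"next": "visualizer"}',
}

# The arguments arrive as a JSON string, so they need to be parsed
# before the "next" value can be read.
parsed_arguments = json.loads(fake_function_call["arguments"])

print(parsed_arguments)          # {'next': 'visualizer'}
print(parsed_arguments["next"])  # visualizer
```

<p>The resulting dictionary with its <code>next</code> key is what ends up being merged into the state, which is why the <code>next</code> entry in <code>AgentState</code> matters.</p>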



<h3 class="wp-block-heading">The system prompts for our other agents</h3>



<p>Ok, now we need to create the agents that will make up the rest of our graph. These are going to be a lot easier as we&#8217;ll be able to use the <code>create_agent</code> function we wrote earlier. But first, we need some system prompts, which are going to be unique for each agent. Let&#8217;s move back over to the <code>multi_agent_prompts.py</code> file and add the following below the existing <code>TEAM_SUPERVISOR_SYSTEM_PROMPT</code>, starting with the travel agent:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">TRAVEL_AGENT_SYSTEM_PROMPT = """
You are a helpful assistant that can suggest and review travel itinerary plans, providing critical feedback on how the trip can be enriched for enjoyment of the local culture. If the plan already includes local experiences, you can mention that the plan is satisfactory, with rationale.

Assume a general interest in popular tourist destinations and local culture, do not ask the user any follow-up questions.

You have access to a web search function for additional or up-to-date research if needed. You are not required to use this if you already have sufficient information to answer the question.
"""</pre>



<p>So we just have some basic instructions here. Notice how we say that if the plan already includes local experiences the agent can mention that the plan is already satisfactory, to make sure we&#8217;re not forcing it to do pointless work. The second paragraph is there to stop it from asking questions and expecting an answer from the user; it should just help us without asking for more input.</p>



<p>Finally, we tell it that it has access to a web search function to do more research if it needs to, but it won&#8217;t use this much as most travel info is already baked into the LLM. (We&#8217;ll use search functions more extensively in the last part.) I&#8217;ve taken some inspiration for these agents and prompts from the <code>Autogen</code> demo agents, but this is just a starting point, and these can be tweaked much further.</p>



<p>Now for the language assistant:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">LANGUAGE_ASSISTANT_SYSTEM_PROMPT = """
You are a helpful assistant that can review travel plans, providing feedback on important/critical tips about how best to address language or communication challenges for the given destination. If the plan already includes language tips, you can mention that the plan is satisfactory, with rationale.

You have access to a web search function for additional or up-to-date research if needed. You are not required to use this if you already have sufficient information to answer the question.
"""</pre>



<p>This is basically the same but with a focus on language tips instead of travel itinerary plans. Let&#8217;s move on to the visualizer:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">VISUALIZER_SYSTEM_PROMPT = """
You are a helpful assistant that can generate images based on a detailed description. You are part of a travel agent team and your job is to look at the location and travel itinerary and then generate an appropriate image to go with the travel plan. You have access to a function that will generate the image as long as you provide a good description including the location and visual characteristics of the image you want to generate. This function will download the image and return the path of the image file to you.

Make sure you provide the image, and then communicate back as your response only the path to the image file you generated. You do not need to give any other textual feedback, just the path to the image file.
"""</pre>



<p>This one is a bit different as it&#8217;s going to generate an image for us. We tell it that it should only provide the path to the image file and not any other feedback. This is of course because the image generation tool that we wrote ourselves will save the image to disk and return the path to the image file, so we don&#8217;t need any other feedback from the agent other than the path which means that the image generation was successful.</p>



<p>Now we have one last agent&#8217;s system prompt to define, the designer, which is going to exist outside of our team of three agents above. We will also need the path to the <code>images</code> folder in our project to insert into this prompt. First scroll all the way back up to the top of the <code>multi_agent_prompts.py</code> file, and add the following import:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from tools.image import IMAGE_DIRECTORY</pre>



<p>Now scroll all the way back down again and add the designer&#8217;s system prompt, this time using a multi-line <code>f</code> string:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">DESIGNER_SYSTEM_PROMPT = f"""
You are a helpful assistant that will receive a travel itinerary in parts. Some parts will be about the travel itinerary and some will be the language tips, and you will also be given the file path to an image. Your job is to call the markdown_to_pdf_file function you have been given, with the following argument:

markdown_text: A summary of the travel itinerary and language tips, with the image inserted, all in valid markdown format and without any duplicate information.

Make sure to use the following structure when inserting the image:
![Alt text]({str(IMAGE_DIRECTORY)}/image_name_here.png) using the correct file path. Make sure you don't add any stuff like 'file://'.

Start with the image and itinerary first and the language tips after, creating a neat and organized final travel itinerary with the appropriate markdown headings, bold words and other formatting.
"""</pre>



<p>We explain that its function is to call the <code>markdown_to_pdf_file</code> function we wrote, passing in a full markdown summary with the image inserted as well. We give it specific instructions on how to format the image link in the markdown so it will work with our converter, and finally give it some last instructions on the structure we want.</p>
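<p>One detail worth seeing in isolation is that an <code>f</code> string is evaluated once, at the moment the constant is defined, so the folder path is baked into the prompt text. The sketch below assumes <code>IMAGE_DIRECTORY</code> is a <code>pathlib.Path</code> and uses a hypothetical <code>images</code> folder and a shortened prompt:</p>

```python
from pathlib import Path

# Hypothetical value; the real IMAGE_DIRECTORY comes from tools/image.py.
IMAGE_DIRECTORY = Path("images")

# The f-string is evaluated right here, so the path becomes plain text.
EXAMPLE_PROMPT = f"""
Insert the image like this:
![Alt text]({str(IMAGE_DIRECTORY)}/image_name_here.png)
"""

print(EXAMPLE_PROMPT)
```

<p>The model therefore only ever sees the finished text, for example <code>![Alt text](images/image_name_here.png)</code>, and never the Python variable itself.</p>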



<p>Inside your <code>multi_agent_prompts.py</code> file you now have the following constants:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">TEAM_SUPERVISOR_SYSTEM_PROMPT = ...
TRAVEL_AGENT_SYSTEM_PROMPT = ...
LANGUAGE_ASSISTANT_SYSTEM_PROMPT = ...
VISUALIZER_SYSTEM_PROMPT = ...
DESIGNER_SYSTEM_PROMPT = ...</pre>



<h3 class="wp-block-heading">Creating our agents and nodes</h3>



<p>Go ahead and save and close the <code>multi_agent_prompts.py</code> file and let&#8217;s get back to the <code>multi_agent.py</code> file. First let&#8217;s update our import up top with the other imports, changing it like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">#... all the other imports ...

from multi_agent_prompts import (
    TEAM_SUPERVISOR_SYSTEM_PROMPT,
    TRAVEL_AGENT_SYSTEM_PROMPT,
    LANGUAGE_ASSISTANT_SYSTEM_PROMPT,
    VISUALIZER_SYSTEM_PROMPT,
    DESIGNER_SYSTEM_PROMPT,
)</pre>



<p>Then go ahead and scroll all the way back down to the bottom of the file and let&#8217;s start creating some agents and nodes! First up is the travel agent:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">travel_agent = create_agent(LLM, [TAVILY_TOOL], TRAVEL_AGENT_SYSTEM_PROMPT)
travel_agent_node = functools.partial(
    agent_node, agent=travel_agent, name=TRAVEL_AGENT_NAME
)</pre>



<p>First we create the <code>travel_agent</code> by calling our <code>create_agent</code> function and passing in the <code>LLM</code>, a list with the <code>TAVILY_TOOL</code> in it as our list of tools, as we promised it an internet tool if it needed one, and the <code>TRAVEL_AGENT_SYSTEM_PROMPT</code>. We now have our travel agent / executor.</p>



<p>To get the travel agent&#8217;s node we need to use the <code>agent_node</code> function we defined before, which needs three arguments, the <code>agent</code>, the <code>state</code> and the <code>name</code> of the agent in string format. We have the <code>agent</code> and the <code>name</code> already, but the state will only be available at runtime. To solve this problem we can use the <code>functools.partial</code> function to create a new function that has the <code>agent</code> and <code>name</code> already filled in, and then we can pass in the <code>state</code> at runtime.</p>



<p>If you&#8217;re unfamiliar with <code>functools.partial</code>, it basically works like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">########### Example, not part of the code ############
import functools

# Original function
def multiply(x, y):
    return x * y

# Create a new function that multiplies by 2 by pre-filling y,
# leaving x free to be passed in positionally later
multiply_by_two = functools.partial(multiply, y=2)

result = multiply_by_two(3)
print(result)  # Output: 6</pre>



<p>So it takes a function and creates a new function based on the original with a portion of the arguments already filled in, reducing the number of arguments the function takes in its new form. This is very useful as we now have our complete <code>travel_agent_node</code> that needs only the state object to be passed in for it to work.</p>
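<p>Applied to our situation, the pattern looks roughly like the sketch below. The <code>agent_node</code> and <code>fake_travel_agent</code> functions here are heavily simplified stand-ins written just for this illustration; the real ones we defined earlier do a lot more:</p>

```python
import functools

# Simplified stand-ins for illustration only; the real agent_node and
# agents were defined earlier and do much more than this.
def agent_node(state, agent, name):
    result = agent(state["messages"])
    return {"messages": [f"{name}: {result}"]}

def fake_travel_agent(messages):
    return f"itinerary based on {len(messages)} message(s)"

# Pre-fill agent and name now; the state is supplied later.
travel_agent_node = functools.partial(
    agent_node, agent=fake_travel_agent, name="travel_agent"
)

# At runtime only the state is passed in.
state = {"messages": ["I want to go to Paris for three days"]}
print(travel_agent_node(state))
```

<p>Because <code>agent</code> and <code>name</code> are filled in by keyword, the single remaining positional argument is the state, which is exactly what a graph node receives.</p>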



<p>Now in exactly the same manner we can create our <code>language_assistant</code>, <code>visualizer</code>, and <code>designer</code> agents and nodes:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">language_assistant = create_agent(LLM, [TAVILY_TOOL], LANGUAGE_ASSISTANT_SYSTEM_PROMPT)
language_assistant_node = functools.partial(
    agent_node, agent=language_assistant, name=LANGUAGE_ASSISTANT_NAME
)

visualizer = create_agent(LLM, [generate_image], VISUALIZER_SYSTEM_PROMPT)
visualizer_node = functools.partial(agent_node, agent=visualizer, name=VISUALIZER_NAME)

designer = create_agent(LLM, [markdown_to_pdf_file], DESIGNER_SYSTEM_PROMPT)
designer_node = functools.partial(agent_node, agent=designer, name=DESIGNER_NAME)</pre>



<p>The language assistant takes the <code>TAVILY_TOOL</code>, while our visualizer needs the <code>generate_image</code> and the designer the <code>markdown_to_pdf_file</code> tool. We then create the nodes for each of these agents in the same way we did for the travel agent above, passing in their respective names using the <code>...NAME</code> constants we defined up top.</p>



<h3 class="wp-block-heading">Creating the graph</h3>



<p>Time to create our graph and the nodes:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">workflow = StateGraph(AgentState)
workflow.add_node(TRAVEL_AGENT_NAME, travel_agent_node)
workflow.add_node(LANGUAGE_ASSISTANT_NAME, language_assistant_node)
workflow.add_node(VISUALIZER_NAME, visualizer_node)
workflow.add_node(DESIGNER_NAME, designer_node)
workflow.add_node(TEAM_SUPERVISOR_NAME, team_supervisor_chain)</pre>



<p>We initialize the <code>StateGraph</code> passing in our <code>AgentState</code> format we defined. Then we simply create a node for each agent passing in the name first, and the actual node second. Note that we&#8217;ve used these <code>...NAME</code> variables several times now, which is why we defined them up top as constants to give them only a single point of definition instead of repeating strings all over the place.</p>



<p>Now that we have the nodes let&#8217;s start building some connections:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for member in MEMBERS:
    workflow.add_edge(member, TEAM_SUPERVISOR_NAME)

workflow.add_edge(DESIGNER_NAME, END)</pre>



<p>For every member in the list of team <code>MEMBERS</code> we add an edge back to the team supervisor, as it will decide where to go next between each step. We also add an edge from the designer to the <code>END</code> node as the designer is the last step in our graph and will exist outside of the team.</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-4-1-1024x771.png" alt="" class="wp-image-4382"/></figure>



<p>So far we have this, and these are all hard edges with no conditions. Now it is time for us to add some conditional edges:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">conditional_map = {name: name for name in MEMBERS}
conditional_map["FINISH"] = DESIGNER_NAME
workflow.add_conditional_edges(
    TEAM_SUPERVISOR_NAME, lambda x: x["next"], conditional_map
)</pre>



<p>We create a <code>conditional_map</code> dictionary that maps each member to itself, and then we add a key <code>"FINISH"</code> that maps to the <code>DESIGNER_NAME</code>. So if the team supervisor calls on the <code>"visualizer"</code> this will simply map like <code>{"visualizer": "visualizer"}</code> but the one exception is the <code>{"FINISH": "designer"}</code> mapping.</p>



<p>We then call the <code>add_conditional_edges</code> method on the <code>workflow</code> object. This method takes the start point, so we pass in the <code>TEAM_SUPERVISOR_NAME</code>, a function that will return a value, and then a mapping that will map that value to the next desired node.</p>



<p>The function is a lambda that takes the state object as input and simply returns the state&#8217;s <code>next</code> key that the team supervisor has put in there. The <code>conditional_map</code> is the mapping we defined above, so if the team supervisor calls on a team member it will map to that team member&#8217;s node, but if it calls <code>"FINISH"</code> it will map to the <code>"designer"</code> node.</p>
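<p>Stripped of everything LangGraph-specific, the lookup that happens at the conditional edge can be sketched in a few lines of plain Python. The member names below are assumed for illustration, and <code>resolve_next_node</code> is a hypothetical helper that only mimics the idea:</p>

```python
# Assumed names for illustration; use the constants from your own file.
MEMBERS = ["travel_agent", "language_assistant", "visualizer"]
DESIGNER_NAME = "designer"

conditional_map = {name: name for name in MEMBERS}
conditional_map["FINISH"] = DESIGNER_NAME

def pick_next(state):
    # Same job as the lambda: read the supervisor's decision from the state.
    return state["next"]

def resolve_next_node(state):
    # Look up the node name that the decision maps to.
    return conditional_map[pick_next(state)]

print(resolve_next_node({"next": "visualizer"}))  # visualizer
print(resolve_next_node({"next": "FINISH"}))      # designer
```

<p>Any value the supervisor returns that is not a key in the map would raise a <code>KeyError</code> in this sketch, which is one more reason to force the supervisor to pick from a fixed list of options.</p>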



<p>Now set the entry point and compile the graph:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">workflow.set_entry_point(TEAM_SUPERVISOR_NAME)

travel_agent_graph = workflow.compile()</pre>
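<p>To get a feel for how a run moves through these edges, here is a plain Python walk-through of the routing with no LangGraph or LLM involved. All node names are assumed, and the supervisor&#8217;s decisions are scripted by hand purely for illustration:</p>

```python
# Plain-Python walk-through of the routing; no LangGraph involved.
# All names and the scripted decisions below are made up for illustration.
MEMBERS = ["travel_agent", "language_assistant", "visualizer"]
SUPERVISOR, DESIGNER, END = "team_supervisor", "designer", "__end__"

# Fixed edges: every member returns to the supervisor, designer ends the run.
fixed_edges = {member: SUPERVISOR for member in MEMBERS}
fixed_edges[DESIGNER] = END

conditional_map = {name: name for name in MEMBERS}
conditional_map["FINISH"] = DESIGNER

# Stand-in for the LLM: the supervisor's decisions in order.
scripted_decisions = iter(MEMBERS + ["FINISH"])

visited = []
current = SUPERVISOR  # the entry point
while current != END:
    visited.append(current)
    if current == SUPERVISOR:
        # Conditional edge: the decision picks the next node.
        current = conditional_map[next(scripted_decisions)]
    else:
        # Fixed edge: always the same destination.
        current = fixed_edges[current]

print(" -> ".join(visited))
```

<p>The printed path bounces between the supervisor and each team member and only reaches the designer once the supervisor answers <code>FINISH</code>.</p>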



<p>Our completed graph now looks like this:</p>



<figure class="wp-block-image size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-6-1.png" alt="" class="wp-image-4383"/></figure>



<p>In this diagram, the white lines represent the fixed edges and the dotted lines represent the conditional ones. Now let&#8217;s actually give this a test run and see what happens:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for chunk in travel_agent_graph.stream(
    {"messages": [HumanMessage(content="I want to go to Paris for three days")]}
):
    if "__end__" not in chunk:
        print(chunk)
        print(f"{Fore.GREEN}#############################{Style.RESET_ALL}")</pre>



<p>So we&#8217;re going to call <code>stream</code> on the <code>travel_agent_graph</code> and pass in a dictionary with the <code>messages</code> key and a list with a single <code>HumanMessage</code> object in it, saying that we want to visit Paris for three days. We then loop over the chunks and print them out, and then print a line of <code>#</code>s in green to visually separate the chunks.</p>
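<p>If you want to see the shape of that loop without spending any API calls, the sketch below swaps the graph for a hand-written generator. It assumes each chunk is a dictionary keyed by the name of the node that just ran, with a final <code>__end__</code> chunk; the node names and contents are made up:</p>

```python
# Stand-in generator that mimics the shape of the streamed chunks: one
# dict per step, keyed by the node that just ran. The contents are made up.
def fake_stream():
    yield {"team_supervisor": {"next": "travel_agent"}}
    yield {"travel_agent": {"messages": ["Day 1: visit the Louvre"]}}
    yield {"team_supervisor": {"next": "FINISH"}}
    yield {"__end__": {"messages": ["final state"]}}

printed_nodes = []
for chunk in fake_stream():
    # Skip the final chunk, just like in the real loop.
    if "__end__" not in chunk:
        node_name = next(iter(chunk))
        printed_nodes.append(node_name)
        print(f"{node_name}: {chunk[node_name]}")
        print("#" * 29)
```

<p>Reading the node name off each chunk like this is a handy way to follow the routing in the terminal while the real graph runs.</p>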



<p>Now go ahead and run this and let&#8217;s see what happens! You may see a printer dialog pop up again; if it does, just click <code>X</code> on it for now. When it&#8217;s done running, have a look in your <code>output</code> folder for the final result:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-9-769x1024.png" alt="" class="wp-image-4384"/></figure>



<p>That is pretty darn cool, right? Our whole team of AI agents is working together to do our bidding without any work on our part! The routing worked exactly as expected, which you can confirm in your LangSmith dashboard (https://smith.langchain.com/) by checking out the trace for the run:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-3-2-1024x922.png" alt="" class="wp-image-4385"/></figure>



<p>We can see that after each step the system returns to the team supervisor and at the end it breaks out of the team towards the designer. I&#8217;ve done a bunch more test runs to verify that it works well and here are some example runs for other destinations:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-1-2-808x1024.png" alt="" class="wp-image-4386"/></figure>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-2-2-808x1024.png" alt="" class="wp-image-4387"/></figure>



<p>Remember that I&#8217;ve been using the <code>gpt-3.5-turbo-0125</code> model all this time. You can easily swap out any of the models for <code>gpt-4-turbo</code> if you want more detail, or if you have some trouble with a specific node. Say the designer has trouble working consistently; you could swap out only that node for a higher-quality model and leave the rest as is.</p>



<p>You can literally create just about any combination of agents, nodes, edges, and conditional edges you want. The combination possibilities are mind-boggling. We decided to have one agent outside of the team here, no problem! We can also have 2 teams or even more if we want, each with their own manager. Your imagination is the limit here.</p>



<p>That&#8217;s it for part 5! In the next and last part, we&#8217;ll take a look at writing and integrating asynchronous tools into our systems. I&#8217;ll see you there!</p>



<h2 class="wp-block-heading">Multi-Agent LangGraph &#8211; Web Research and Asynchronous Tools</h2>



<p>Hello and welcome back to the last part of the LangGraph/LangChain/LangSmith course. In this part, we&#8217;ll learn how to deal with asynchronous tools by building a graph that will do some web research for us, where one of the tools is going to be visiting several websites at once to feed info back into the graph.</p>



<p>This type of asynchronous action is very helpful when there are multiple steps or actions that can be performed at the same time for optimization as it will save a lot of time and make the user experience much better. It is a bit different to set up and work with though, which is why we&#8217;ll be going through it in this part.</p>



<p>I will try to cover the bare basics of async Python programming here, as it can look quite confusing and I want all skill levels to be able to follow along. If you are already very familiar with async programming the level of explanation may be a bit excessive for you and you can probably skip over some of the explanations and just look at the code.</p>



<h3 class="wp-block-heading">Web research tool</h3>



<p>Let&#8217;s start by building our tool as usual. This tool is going to visit a bunch of web URLs at the same time (asynchronously) and return the HTML content of each page. We will need to install the <code>BeautifulSoup</code> library to parse the HTML content of the pages. Run the following command in the terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install beautifulsoup4==4.12.3</pre>



<p>Then go ahead and create a new file called <code>web.py</code> in the <code>tools</code> directory:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📂 FINX_LANGGRAPH
    📂 images
    📂 output
    📂 tools
        📄 __init__.py
        📄 image.py
        📄 pdf.py
        📄 weather.py
        📄 web.py    ✨New file
    📄 .env
    📄 langchain_basics.py
    📄 multi_agent.py
    📄 multi_agent_prompts.py
    📄 Pipfile
    📄 Pipfile.lock
    📄 setup_environment.py
    📄 simple_langgraph.py</pre>



<p>In the <code>web.py</code> file let&#8217;s start with our imports as usual:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import asyncio
import json
import sys

import aiohttp
from bs4 import BeautifulSoup
from langchain.tools import tool
from pydantic import BaseModel, Field</pre>



<p>We import <code>asyncio</code> to work with asynchronous code, <code>aiohttp</code> to make HTTP requests asynchronously, and <code>BeautifulSoup</code> to parse the HTML content of the pages. The <code>tool</code> decorator and <code>pydantic</code> imports are the same as for the other tools and <code>json</code> is to return the JSON responses in string format.</p>



<h3 class="wp-block-heading">Async and event loops</h3>



<p>First of all, we&#8217;ll use the <code>sys</code> import to set the type of event loop to use for the asynchronous code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">if sys.platform.startswith("win"):
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())</pre>



<p>Without going into too much detail, there is a known issue with the Python <code>asyncio</code> library on Windows specifically that happens when the Proactor event loop (the default on Windows) is closed while there are still outstanding tasks. It doesn&#8217;t affect the correct execution of the code, but something in the combination of Windows, asyncio, and LangChain/LangGraph triggers it. We&#8217;ll use the <code>selector</code> event loop policy to avoid this issue (it is only needed, and only triggers, if you&#8217;re on Windows).</p>



<p>While this tutorial part is way too short to really go in-depth on Python&#8217;s asynchronous programming, we&#8217;ll try to cover the basics as we go along. Basically, we get an event loop, and we can put tasks in there. Normally a task like fetching a webpage would block the code until it&#8217;s done, but with <code>asyncio</code> we can put it in the event loop and continue with other tasks while it&#8217;s being fetched. This allows us to run multiple operations at the same time.</p>
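<p>Here is a minimal sketch of that idea using only the standard library. The <code>fake_fetch</code> function is a made-up stand-in that just waits instead of doing a real request, and the URLs are placeholders that are never actually visited:</p>

```python
import asyncio
import time

async def fake_fetch(url: str, delay: float) -> str:
    # asyncio.sleep stands in for waiting on a network response.
    await asyncio.sleep(delay)
    return f"content of {url}"

async def main() -> list:
    # Made-up URLs; nothing is actually requested.
    urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
    # Put all three waits in the event loop at once and collect the results.
    return await asyncio.gather(*(fake_fetch(url, 0.2) for url in urls))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)
print(f"Took {elapsed:.2f}s for three 0.2s waits")
```

<p>Run one after another the three waits would add up to about 0.6 seconds, but because they overlap in the event loop the whole thing finishes in roughly the time of a single wait.</p>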



<p>This is not to be confused with multi-threading or multi-processing, which are quite different in nature:</p>



<ul class="wp-block-list">
<li>Multi-processing is about spreading tasks over a computer&#8217;s CPU cores, and is well suited for tasks that require lots of mathematical computations.</li>



<li>Multi-threading is about running multiple threads in the same process, and is well suited for tasks that are I/O bound (like fetching webpages).</li>



<li>Asynchronous programming is a single-process, single-threaded design that uses coroutines to handle multiple tasks concurrently. Async functions can pause and resume their execution, allowing other tasks to run during the pause.</li>
</ul>



<p>Async programming in Python is very similar to the JavaScript async/await pattern, and it&#8217;s a great way to handle I/O-bound tasks like fetching web pages. If you&#8217;re a bit new to this all, just keep going and you&#8217;ll get a feel for how it works.</p>
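<p>To make the idea of waiting concurrently a bit more concrete, here is a minimal sketch that uses only the standard library. The <code>fake_fetch</code> function is a hypothetical stand-in for a real network request, with <code>asyncio.sleep</code> simulating the wait. It compares awaiting three waits one after another with running them together on the event loop:</p>

```python
import asyncio
import time


async def fake_fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for waiting on a network response
    await asyncio.sleep(delay)
    return f"{name} fetched"


async def run_sequentially() -> float:
    start = time.perf_counter()
    for name in ("a", "b", "c"):
        # Each wait has to finish before the next one starts
        await fake_fetch(name, 0.1)
    return time.perf_counter() - start


async def run_concurrently() -> float:
    start = time.perf_counter()
    # All three coroutines wait at the same time on the single event loop
    await asyncio.gather(*(fake_fetch(name, 0.1) for name in ("a", "b", "c")))
    return time.perf_counter() - start


sequential_time = asyncio.run(run_sequentially())
concurrent_time = asyncio.run(run_concurrently())
print(f"Sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

<p>The exact timings will vary per machine, but the concurrent version should take roughly as long as a single wait rather than the sum of all three.</p>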



<h3 class="wp-block-heading">Parsing HTML content</h3>



<p>First, we&#8217;ll write a very basic function that uses BeautifulSoup to parse some HTML content:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def parse_html(html_content: str) -> str:
    soup = BeautifulSoup(html_content, "html.parser")
    for tag in ["nav", "footer", "aside", "script", "style", "img", "header"]:
        for match in soup.find_all(tag):
            match.decompose()

    text_content = soup.get_text()
    text_content = " ".join(text_content.split())
    return text_content[:8_000]</pre>



<p>This function takes the HTML content of a webpage as a string and returns a string with the text content of the page. First we instantiate a new BeautifulSoup object passing in the <code>html.parser</code> string to select the parser. We then make a list of all the HTML tags we want to filter out, namely the navigation, footer, aside, script, style, image and header tags. We&#8217;re interested in the main content and don&#8217;t want all this pollution.</p>



<p>For each tag in this list of HTML tags, we run <code>soup.find_all(tag)</code> to find all the tags with that name in the HTML content, which returns all the matches for that tag. This allows us to loop over each match in <code>soup.find_all(tag)</code> and call <code>match.decompose()</code> to remove the tag from the HTML content.</p>



<p>We then get the text content of the page with <code>soup.get_text()</code> to remove as much HTML and unneeded stuff as possible from what was left.</p>



<p>Then we call <code>text_content.split()</code> to split the text content into a list of words, which has the side effect of removing long sequences of whitespace, tab, and newline characters. We then join the list of words back into a string with <code>" ".join</code> so that we&#8217;re left with only a single space between words. The LLM does not care about formatting, and sending tons of whitespace to it just wastes tokens.</p>



<p>Finally, we return the first 8,000 characters of the text content, to make sure we don&#8217;t exceed the context limit if we load 5 or 6 pages at once. You can set this higher if you use GPT-4-turbo instead of 3.5-turbo.</p>
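<p>The whitespace clean-up and truncation steps can be tried in isolation without BeautifulSoup. This is just a small sketch; the sample text and the tiny <code>MAX_CHARS</code> limit are made up for the demo:</p>

```python
raw_text = "  SpongeBob \n\n\t SquarePants   is an\tAmerican \n animated   series.  "

# split() with no arguments splits on any run of whitespace and drops empty strings
words = raw_text.split()

# Joining with a single space leaves exactly one space between words
normalized = " ".join(words)

# Slicing never raises, even when the text is shorter than the limit
MAX_CHARS = 20  # hypothetical small limit for the demo; the tutorial uses 8_000
truncated = normalized[:MAX_CHARS]

print(normalized)
print(truncated)
```

<p>Note that the slice cuts on characters, not words, so the last word may be chopped in half. For feeding context to an LLM that is usually acceptable.</p>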



<h3 class="wp-block-heading">Fetching webpages</h3>



<p>Notice that the <code>parse_html</code> function is just a regular synchronous function. Now let&#8217;s get started on the asynchronous part. The first thing we&#8217;ll do is write a function to fetch the HTML content of a single webpage, and then we can just call this function multiple times to fetch the content of multiple pages at once.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">async def get_webpage_content(url: str) -> str:
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            html_content = await response.text()

    text_content = parse_html(html_content)
    print(f"URL: {url} - fetched successfully.")
    return text_content</pre>



<p>First, we declare our async function using <code>async def</code> instead of the normal <code>def</code>. This will allow us to later call <code>await</code> on this function to make the code non-blocking and run other tasks while we wait for the response. We take a URL string and return a string.</p>



<p>Where we would normally fetch a webpage with the <code>requests</code> library, here we need to use <code>aiohttp</code> which is an asynchronous HTTP client/server library for Python that allows us to write this non-blocking code. The <code>ClientSession</code> object represents a single web session, so you could set headers or cookies here that apply to all requests in this session.</p>



<p>The whole thing can be used as a context manager giving us the <code>async with aiohttp.ClientSession() as session</code> syntax and any indented code afterward now takes place inside this context. Then we call <code>get(url)</code> on the session object and use that as a context manager in the same exact manner as the line above it.</p>



<p>The line after calls <code>await</code> on the <code>response.text()</code> and will then save this in the <code>html_content</code> variable. This <code>await</code> keyword is the magic, and whenever you see this keyword it sort of pauses this code, as time is needed to fetch the webpage. While this is happening, other tasks can run in the event loop.</p>



<p>When the <code>html_content</code> has finished fetching, we move outside of the two async context managers and call our <code>parse_html</code> function to get the text content of the page. We then print a message to the console that the URL was fetched successfully and return the text content.</p>



<p>Note that we could easily edit the above function to fetch the whole list of URLs we have inside the same <code>ClientSession</code> context manager, but as the overhead to calling this function a couple of times is minimal, I&#8217;ll just keep it as is for now.</p>



<p>Another fair point to make is that the <code>parse_html</code> function is technically blocking non-async code, but as it doesn&#8217;t take long to run at all, it&#8217;s fine to keep it here. The main time-waster is the fetching of the webpages and we made that asynchronous.</p>
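<p>If a blocking step like this ever did become slow enough to matter, one option would be to push it onto a worker thread with <code>asyncio.to_thread</code> (available from Python 3.9). The sketch below is not part of the tool we are building; <code>slow_parse</code> is a hypothetical stand-in for a heavier parsing function:</p>

```python
import asyncio
import time


def slow_parse(raw_content: str) -> str:
    # Hypothetical stand-in for a blocking, time-consuming parsing step
    time.sleep(0.1)
    return " ".join(raw_content.split())


async def parse_in_thread(raw_content: str) -> str:
    # to_thread runs the blocking function in a worker thread, so the
    # event loop stays free to run other tasks in the meantime
    return await asyncio.to_thread(slow_parse, raw_content)


async def main() -> list[str]:
    pages = ["one   page", "another \n page"]
    return await asyncio.gather(*(parse_in_thread(page) for page in pages))


parsed_pages = asyncio.run(main())
print(parsed_pages)
```

<p>For our short <code>parse_html</code> function this would be overkill, which is why we simply call it directly.</p>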



<h3 class="wp-block-heading">Input arguments and the tool</h3>



<p>Before we get to the actual tool itself we need to make sure to define our pydantic object with the input arguments for the tool:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class ResearchInput(BaseModel):
    research_urls: list[str] = Field(description="Must be a list of valid URLs.")</pre>



<p>No surprises here, we just want a list of URLs in string format. We&#8217;ve used this type of object several times before.</p>



<p>Now let&#8217;s write our tool, starting with the first half:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">@tool("research", args_schema=ResearchInput)
async def research(research_urls: list[str]) -> str:
    """Get content of provided URLs for research purposes."""
    tasks = [asyncio.create_task(get_webpage_content(url)) for url in research_urls]</pre>



<p>We use the <code>@tool</code> decorator to define our tool, passing in the name and the argument schema as always. We declare the function making sure to use <code>async def</code>, and we declare the same <code>research_urls</code> argument as we defined in the <code>ResearchInput</code> class. Again mind the docstring description for the LLM to use.</p>



<p>Then we use a list comprehension. Let&#8217;s read it from right to left: for each <code>url</code> in the list of <code>research_urls</code>, we call <code>asyncio.create_task(get_webpage_content(url))</code> to create a task for each URL. The <code>asyncio.create_task()</code> function schedules the coroutine to run on the event loop and returns a Task object. However, it doesn&#8217;t automatically await the task.</p>



<p>What this means is that it will create our async task and schedule it to start for us, but it won&#8217;t await it (wait for it to finish), which would block the code. We are left with a list of Task objects for fetches that have been scheduled but are not yet finished.</p>
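<p>You can observe this behavior with a small standard-library sketch. The <code>fake_fetch</code> coroutine is a hypothetical placeholder for <code>get_webpage_content</code>, and <code>Task.done()</code> lets us check whether each task has finished:</p>

```python
import asyncio


async def fake_fetch(name: str, delay: float) -> str:
    # Placeholder for a real page fetch
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # create_task schedules each coroutine on the running event loop
    tasks = [asyncio.create_task(fake_fetch(f"page{i}", 0.05)) for i in range(3)]

    # Nothing has been awaited yet, so none of the tasks can be finished
    before = [task.done() for task in tasks]

    # Awaiting gather hands control to the event loop until all tasks complete
    results = await asyncio.gather(*tasks)
    after = [task.done() for task in tasks]
    return before, results, after


before, results, after = asyncio.run(main())
print(before, results, after)
```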



<p>Let&#8217;s finish our tool:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">@tool("research", args_schema=ResearchInput)
async def research(research_urls: list[str]) -> str:
    """Get content of provided URLs for research purposes."""
    tasks = [asyncio.create_task(get_webpage_content(url)) for url in research_urls]
    contents = await asyncio.gather(*tasks, return_exceptions=True)
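    # default=str converts any returned exception objects to strings,
    # since exceptions are not JSON serializable on their own
    return json.dumps(contents, default=str)</pre>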



<p>The <code>asyncio.gather()</code> function is used to schedule multiple tasks to run and waits for all of them to complete. It will wait for all our tasks from the previous line to fetch their web pages and then gather the results. This is why we <code>await</code> this function, and then save the results in <code>contents</code>. <code>*tasks</code> is a way to unpack the list of tasks into separate arguments passing them into the function.</p>



<p>The <code>return_exceptions</code> parameter in <code>asyncio.gather()</code> determines how exceptions are handled. If <code>return_exceptions</code> is set to <code>False</code>, <code>gather()</code> will immediately raise the first exception it encounters. When set to <code>True</code>, instead of raising exceptions, it will return them in the result list so that <code>contents</code> will be a list of results or exceptions. We use this as we want to go ahead and fetch the rest of the pages even if one fails.</p>
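<p>Here is a small sketch of what <code>return_exceptions=True</code> gives you, using a hypothetical <code>fake_fetch</code> coroutine that fails for one made-up URL. One thing to watch for is that exception objects are not JSON serializable by themselves, so this sketch passes <code>default=str</code> to <code>json.dumps</code> to turn them into plain strings:</p>

```python
import asyncio
import json


async def fake_fetch(url: str) -> str:
    # Hypothetical stand-in for a real fetch; fails for URLs containing "broken"
    await asyncio.sleep(0.01)
    if "broken" in url:
        raise ValueError(f"could not fetch {url}")
    return f"content of {url}"


async def main() -> list:
    urls = [
        "https://example.com/a",
        "https://example.com/broken",
        "https://example.com/c",
    ]
    tasks = [asyncio.create_task(fake_fetch(url)) for url in urls]
    # The failure is returned in place, and results keep the input order
    return await asyncio.gather(*tasks, return_exceptions=True)


contents = asyncio.run(main())

# default=str is called for any object json cannot serialize, such as exceptions
as_json = json.dumps(contents, default=str)
print(as_json)
```

<p>The two successful results come back as normal strings, with the <code>ValueError</code> sitting between them in the list rather than aborting the whole batch.</p>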



<p>Finally, we dump the results to a JSON string and return it, since LLMs need string input.</p>



<h3 class="wp-block-heading">Testing the tool</h3>



<p>Now let&#8217;s add a quick test to this file to test our tool in isolation and make sure there are no problems:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">if __name__ == "__main__":
    import time

    TEST_URLS = [
        "https://en.wikipedia.org/wiki/SpongeBob_SquarePants",
        "https://en.wikipedia.org/wiki/Stephen_Hillenburg",
        "https://en.wikipedia.org/wiki/The_SpongeBob_Movie:_Sponge_Out_of_Water",
    ]

    async def main():
        result = await research.ainvoke({"research_urls": TEST_URLS})

        with open("test.json", "w") as f:
            json.dump(result, f)

    start_time = time.time()
    asyncio.run(main())
    end_time = time.time()
    print(f"Async time: {end_time - start_time} seconds")</pre>



<p>We&#8217;ve covered the <code>if __name__ == "__main__":</code> block before, so only if we run this file directly will the code inside this block run. We define a list of test URLs to use and then define an async function called <code>main()</code> to run our tool with these test URLs. Instead of invoking the tool as we normally do we now use <code>ainvoke</code> for the async version, and we have to <code>await</code> the result. This is why the main function is <code>async</code> as well.</p>



<p>We then open a file called <code>test.json</code> in write mode and dump the result to it so we can have a quick look to check if the output is as expected. Finally, we run the <code>main()</code> function with <code>asyncio.run(main())</code>.</p>



<p><code>asyncio.run</code> is a useful function that creates a new event loop, runs the given coroutine which is <code>main</code> in our case, closes the loop, and then returns the result. This makes it a convenient way to run async code from a synchronous context as it handles the whole event loop thing for us.</p>



<p>I&#8217;ve also sneaked a start and end timer in there using <code>time.time()</code> to see how long it takes to run the async code.</p>



<p>Now go ahead and run the <code>web.py</code> file and you&#8217;ll see something like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">URL: https://en.wikipedia.org/wiki/The_SpongeBob_Movie:_Sponge_Out_of_Water - fetched successfully.
URL: https://en.wikipedia.org/wiki/Stephen_Hillenburg - fetched successfully.
URL: https://en.wikipedia.org/wiki/SpongeBob_SquarePants - fetched successfully.
Async time: 2.9387967586517334 seconds</pre>



<p>I also tried a normal synchronous version of this code using the <code>requests</code> library, and it took over 7 seconds, so we save considerable time here, and this is with only 3 URLs. When I increased the number of URLs to 6, the async version took about 4 seconds, while the synchronous version took about 14.</p>



<p>If you open the <code>test.json</code> file that has been created you should see something like the following that goes on for quite a while:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">"[\"SpongeBob SquarePants - Wikipedia Jump to content From Wikipedia, the free encyclopedia American animated television series This article is about the television series.........</pre>



<h3 class="wp-block-heading">Web research graph setup</h3>



<p>We have just written our first async tool! Now let&#8217;s put it to good use and write up a quick web research graph. In your root folder create two new files called <code>web_research.py</code> and <code>web_research_prompts.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📂 FINX_LANGGRAPH
    📂 images
    📂 output
    📂 tools
        📄 __init__.py
        📄 image.py
        📄 pdf.py
        📄 weather.py
        📄 web.py
    📄 .env
    📄 langchain_basics.py
    📄 multi_agent.py
    📄 multi_agent_prompts.py
    📄 Pipfile
    📄 Pipfile.lock
    📄 setup_environment.py
    📄 simple_langgraph.py
    📄 web_research.py          ✨New file
    📄 web_research_prompts.py    ✨New file</pre>



<p>The graph here will be reasonably simple, having two agents. One of them will use Tavily to do a basic search query, and the other one will use our async tool to do more in-depth research on the URLs provided by the first Tavily agent. You know the drill by now, so we&#8217;ll just define our system prompts for the agents before we get started on the main file. If you&#8217;re watching the video version of this tutorial make sure you open up the written version so you can more easily copy these. Start by opening up the <code>web_research_prompts.py</code> file.</p>



<p>We&#8217;ll get started with the Tavily agent&#8217;s system prompt first:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">TAVILY_AGENT_SYSTEM_PROMPT = """
You are a search agent. Your task is simple. Use your tool to find results on the internet for the user query, and return the response, making sure to include all the sources with page title and URL at the bottom like this example:

1. [Title 1](https://www.url1.com/whatever): ...
2. [Title 2](https://www.url2.com/whatever): ...
3. [Title 3](https://www.url3.com/whatever): ...
4. [Title 4](https://www.url4.com/whatever): ...
5. [Title 5](https://www.url5.com/whatever): ...

Make sure you only return the URLs that are relevant for doing additional research. For instance:
User query Spongebob results from calling your tool:

1. [The SpongeBob Official Channel on YouTube](https://www.youtube.com/channel/UCx27Pkk8plpiosF14qXq-VA): ...
2. [Wikipedia - SpongeBob SquarePants](https://en.wikipedia.org/wiki/SpongeBob_SquarePants): ...
3. [Nickelodeon - SpongeBob SquarePants](https://www.nick.com/shows/spongebob-squarepants): ...
4. [Wikipedia - Excavators](https://en.wikipedia.org/wiki/Excavator): ...
5. [IMDB - SpongeBob SquarePants TV Series](https://www.imdb.com/title/tt0206512/): ...


Given the results above and an example topic of Spongebob, the Youtube channel is going to be relatively useless for written research, so you should skip it from your list. The Wikipedia article on Excavators is not related to the topic, which is Spongebob for this example, so it should be omitted. The others are relevant so you should include them in your response like this:
1. [Wikipedia - SpongeBob SquarePants](https://en.wikipedia.org/wiki/SpongeBob_SquarePants): ...
2. [Nickelodeon - SpongeBob SquarePants](https://www.nick.com/shows/spongebob-squarepants): ...
3. [IMDB - SpongeBob SquarePants TV Series](https://www.imdb.com/title/tt0206512/): ...
"""</pre>



<p>This is a bit of a long prompt, but it&#8217;s quite simple. The Tavily agent is tasked with finding relevant URLs for a given query, and then returning the URLs that are relevant for further research. The prompt gives an example of what the response should look like and also gives an example of what URLs are relevant and what URLs are not.</p>



<p>Now let&#8217;s define the system prompt for the web research agent:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">RESEARCHER_SYSTEM_PROMPT = """
You are an internet research information-providing agent. You will receive results for a search query. The results will look something like this:

1. [Wikipedia - SpongeBob SquarePants](https://en.wikipedia.org/wiki/SpongeBob_SquarePants): ...
2. [Nickelodeon - SpongeBob SquarePants](https://www.nick.com/shows/spongebob-squarepants): ...
3. [IMDB - SpongeBob SquarePants TV Series](https://www.imdb.com/title/tt0206512/): ...

Your job is to use your research tool to find more information on the topic and to write an article about the information you find in markdown format. You will call the research tool with a list of URLs, so for the above example your tool input will be:

["https://en.wikipedia.org/wiki/SpongeBob_SquarePants", "https://www.nick.com/shows/spongebob-squarepants", "https://www.imdb.com/title/tt0206512/"]

After you have finished your research you will write a long-form article on all the information you found and return it to the user, making sure not to leave out any relevant details. Make sure you include as much detail as possible and that the article you write is on the topic (for instance Pokemon) instead of being about the websites that you visited (e.g. Wikipedia, YouTube). Use markdown formatting and supply ONLY the resulting article in your response, with no extra chatter except for the fully formed, well-written, and formatted article. Use headers, sub-headers, bolding, bullet lists, and other markdown formatting to make the article easy to read and understand. Your only output will be the fully formed and detailed markdown article.
"""</pre>



<p>The agent is tasked with using the web research tool to find more information on a topic and then writing an article about the information found. The prompt gives an example of what the input to the tool should look like and then specific instructions on using markdown formatting to write the output article and details on the article we want it to write. Save and close the <code>web_research_prompts.py</code> file.</p>



<h3 class="wp-block-heading">Web research graph main file</h3>



<p>Now let&#8217;s move on to the main file <code>web_research.py</code> and start by importing the necessary modules:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import asyncio
import functools
import operator
import uuid
from typing import Annotated, Sequence, TypedDict

from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.messages import BaseMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from langgraph.graph import END, StateGraph

from setup_environment import set_environment_variables
from tools.pdf import OUTPUT_DIRECTORY
from tools.web import research
from web_research_prompts import RESEARCHER_SYSTEM_PROMPT, TAVILY_AGENT_SYSTEM_PROMPT</pre>



<p>You&#8217;ve seen pretty much all of these imports before in some part of our code so far. We import the prompts we just created and the web research function as well as the <code>OUTPUT_DIRECTORY</code> we defined in the <code>pdf.py</code> file so that we can access this folder to save our output. To do this properly it would be best to store these project-wide constants like the paths in a separate file but for now, we&#8217;ll just import it from <code>pdf.py</code>.</p>



<p>Now continue below the imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">set_environment_variables("Web_Search_Graph")

TAVILY_TOOL = TavilySearchResults(max_results=6)
LLM = ChatOpenAI(model="gpt-3.5-turbo-0125")

TAVILY_AGENT_NAME = "tavily_agent"
RESEARCH_AGENT_NAME = "search_evaluator_agent"
SAVE_FILE_NODE_NAME = "save_file"</pre>



<p>We load up our variables and use the project name <code>Web_Search_Graph</code> for our LangSmith traces. We create a new instance of the Tavily search tool we imported setting the <code>max_results</code> to 6, and we create a <code>ChatOpenAI</code> object as usual. After that we set up some string constants for the names of our agents and nodes again.</p>



<p>We&#8217;ll have the <code>create_agent</code> function which is basically the same as last time:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def create_agent(llm: ChatOpenAI, tools: list, system_prompt: str):
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", system_prompt),
            MessagesPlaceholder(variable_name="messages"),
            MessagesPlaceholder(variable_name="agent_scratchpad"),
        ]
    )
    agent = create_openai_tools_agent(llm, tools, prompt)
    executor = AgentExecutor(agent=agent, tools=tools)  # type: ignore
    return executor</pre>



<p>No real changes there so let&#8217;s move on to the AgentState definition:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]</pre>



<p>We have a simple list of <code>BaseMessage</code> objects and every node in the graph will add a message to this list as the state passes through that particular node.</p>
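<p>To see why <code>operator.add</code> makes each node append to the list instead of overwriting it, here is a plain-Python sketch. It only illustrates the idea of a reducer attached to the annotation; it is not a description of how LangGraph works internally, and <code>DemoState</code> with plain strings as messages is a made-up simplification:</p>

```python
import operator
from typing import Annotated, Sequence, TypedDict, get_type_hints


class DemoState(TypedDict):
    # Plain strings stand in for BaseMessage objects in this sketch
    messages: Annotated[Sequence[str], operator.add]


# The reducer function is stored as metadata on the annotation
hints = get_type_hints(DemoState, include_extras=True)
reducer = hints["messages"].__metadata__[0]

current = ["user: Jaws"]
update = ["tavily_agent: here are some sources"]

# operator.add on two lists concatenates them, so an update appends
merged = reducer(current, update)
print(merged)
```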



<h3 class="wp-block-heading">Creating our nodes</h3>



<p>Now we&#8217;ll have a basic function to create a new agent node similar to what we&#8217;ve done before, but this time we&#8217;ll also have one to create an async agent node:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def agent_node(state: AgentState, agent, name):
    result = agent.invoke(state)
    return {"messages": [HumanMessage(content=result["output"], name=name)]}


async def async_agent_node(state: AgentState, agent, name):
    result = await agent.ainvoke(state)
    return {"messages": [HumanMessage(content=result["output"], name=name)]}</pre>



<p>The first one is pretty much the same as the one we used before, making sure we return a message in line with what we defined each node will add to the state object. The second one is the async version of the same function. We use <code>async def</code> and here we <code>await</code> the agent&#8217;s <code>ainvoke</code> method instead of just calling the normal <code>invoke</code> method.</p>



<p>Now we can create our Tavily agent and our research agent:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tavily_agent = create_agent(LLM, [TAVILY_TOOL], TAVILY_AGENT_SYSTEM_PROMPT)
tavily_agent_node = functools.partial(
    agent_node, agent=tavily_agent, name=TAVILY_AGENT_NAME
)


research_agent = create_agent(LLM, [research], RESEARCHER_SYSTEM_PROMPT)
research_agent_node = functools.partial(
    async_agent_node, agent=research_agent, name=RESEARCH_AGENT_NAME
)</pre>



<p>You&#8217;ve seen all of this before, but make sure you use the <code>async_agent_node</code> function for the research agent instead of the normal one.</p>
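<p>If you want to convince yourself that <code>functools.partial</code> works just as well with an async function, the following sketch runs without LangChain installed. <code>FakeAgent</code> is a hypothetical stand-in that only mimics the shape of <code>ainvoke</code>, and plain dictionaries replace the message objects:</p>

```python
import asyncio
import functools


class FakeAgent:
    # Hypothetical stand-in for an AgentExecutor; only mimics ainvoke's shape
    async def ainvoke(self, state: dict) -> dict:
        await asyncio.sleep(0.01)
        return {"output": f"researched {len(state['messages'])} message(s)"}


async def async_agent_node(state: dict, agent, name: str) -> dict:
    result = await agent.ainvoke(state)
    return {"messages": [{"content": result["output"], "name": name}]}


# partial pre-fills agent and name, leaving a callable that only needs the state
node = functools.partial(async_agent_node, agent=FakeAgent(), name="demo_agent")

# Calling the partial returns a coroutine, which asyncio.run can execute
update = asyncio.run(node({"messages": ["Jaws"]}))
print(update)
```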



<p>Now we need one more node, that will take the output of the research agent and write it to a file for us. This node does not need any agents or LLM action, so we can just define it as a normal function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def save_file_node(state: AgentState):
    markdown_content = str(state["messages"][-1].content)
    filename = f"{OUTPUT_DIRECTORY}/{uuid.uuid4()}.md"
    with open(filename, "w", encoding="utf-8") as file:
        file.write(markdown_content)
    return {
        "messages": [
            HumanMessage(
                content=f"Output written successfully to {filename}",
                name=SAVE_FILE_NODE_NAME,
            )
        ]
    }</pre>



<p>This shows that the graph is really nothing but a state machine. We can just write any arbitrary function and use it as a node as long as we meet the conditions we set for the graph. The function takes the AgentState object as input, does whatever it wants to do, and then adds an update to the AgentState object as promised. It doesn&#8217;t matter that there is no agent or LLM in this step.</p>



<p>In this case, we extract the markdown content from the state object&#8217;s last message <code>[-1]</code> which is the research node&#8217;s output. We then generate a random filename using the <code>uuid</code> module and write the markdown content to a file with that name and the <code>.md</code> extension. Finally, we return a message to the state object that the output was written successfully.</p>
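<p>Because this node is just a function, you can also exercise it without running a graph at all. The sketch below is a simplified variant, not the version used in our graph: it takes the output directory as a hypothetical extra parameter and uses plain dictionaries as messages, so it can write into a temporary folder:</p>

```python
import tempfile
import uuid
from pathlib import Path


def save_file_node(state: dict, output_directory: Path) -> dict:
    # The last message in the state is assumed to hold the markdown article
    markdown_content = str(state["messages"][-1]["content"])
    # uuid4 gives a random name, so repeated runs do not overwrite each other
    file_path = output_directory / f"{uuid.uuid4()}.md"
    file_path.write_text(markdown_content, encoding="utf-8")
    message = {"content": f"Output written successfully to {file_path}", "name": "save_file"}
    return {"messages": [message]}


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_state = {"messages": [{"content": "# Demo\n\nSome article text.", "name": "researcher"}]}
    update = save_file_node(demo_state, Path(tmp_dir))
    # Read the file back while the temporary directory still exists
    written_files = list(Path(tmp_dir).glob("*.md"))
    saved_text = written_files[0].read_text(encoding="utf-8")

print(update["messages"][0]["content"])
```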



<h3 class="wp-block-heading">Piecing our graph together</h3>



<p>Now we can define our graph:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">workflow = StateGraph(AgentState)
workflow.add_node(TAVILY_AGENT_NAME, tavily_agent_node)
workflow.add_node(RESEARCH_AGENT_NAME, research_agent_node)
workflow.add_node(SAVE_FILE_NODE_NAME, save_file_node)

workflow.add_edge(TAVILY_AGENT_NAME, RESEARCH_AGENT_NAME)
workflow.add_edge(RESEARCH_AGENT_NAME, SAVE_FILE_NODE_NAME)
workflow.add_edge(SAVE_FILE_NODE_NAME, END)

workflow.set_entry_point(TAVILY_AGENT_NAME)
research_graph = workflow.compile()</pre>



<p>We just go from the Tavily agent to the research agent, and then from the research agent to the save file node. This example is pretty simple as we&#8217;re focusing on the async part. We can always add this to more complex graphs later on if we need to.</p>



<p>Now let&#8217;s create a main function to run the graph:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">async def run_research_graph(input):
    async for output in research_graph.astream(input):
        for node_name, output_value in output.items():
            print("---")
            print(f"Output from node '{node_name}':")
            print(output_value)
        print("\n---\n")</pre>



<p>This function is an async function that takes an input and then runs the graph with that input. It uses an async for loop to iterate over the output of the graph after we run <code>astream</code> (async stream) on it. For each output, we get the node&#8217;s name and the output value, so we print both to the console to see what is going on live.</p>
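
<p>If the <code>async for</code> pattern is new to you, the following sketch shows the same loop shape without any LangGraph dependency. Note that <code>fake_astream</code> is a hypothetical stand-in written for this illustration, not the real <code>astream</code> method. It simply yields one dictionary per node, keyed by the node names from our graph.</p>

```python
import asyncio


async def fake_astream(graph_input):
    """Hypothetical stand-in for research_graph.astream: yields one dict per finished node."""
    topic = graph_input["messages"][-1]
    for node_name in ("tavily_agent", "search_evaluator_agent", "save_file"):
        # Hand control back to the event loop, as real network I/O would.
        await asyncio.sleep(0)
        yield {node_name: {"messages": [f"{node_name} handled '{topic}'"]}}


async def collect_outputs(graph_input):
    """Consume the stream with the same nested loop shape as run_research_graph."""
    seen = []
    async for output in fake_astream(graph_input):
        for node_name, output_value in output.items():
            seen.append((node_name, output_value))
    return seen


if __name__ == "__main__":
    for node_name, output_value in asyncio.run(collect_outputs({"messages": ["Jaws"]})):
        print(f"Output from node '{node_name}': {output_value}")
```

<p>Each iteration of the outer loop receives a result as soon as a node finishes, which is why we can watch the progress live instead of waiting for the whole graph to complete.</p>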



<p>Now we can run the graph with a simple test input:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">test_input = {"messages": [HumanMessage(content="Jaws")]}

asyncio.run(run_research_graph(test_input))</pre>



<p>We create the first input message for the state object and then use <code>asyncio.run</code> as we did before because it takes care of the event loop that runs the async code for us. Save and run this file and you should see the graph running and outputting the results to the console:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">API Keys loaded and tracing set with project name:  Web_Search_Graph
Output from node 'tavily_agent':
---
{'messages': [HumanMessage(content='Here are some relevant sources about "Jaws": ... ', name='tavily_agent')]}

---

URL: https://www.imdb.com/title/tt0073195/ - fetched successfully.
URL: https://www.rottentomatoes.com/m/jaws - fetched successfully.
URL: https://www.britannica.com/topic/Jaws-film-by-Spielberg - fetched successfully.
URL: https://en.wikipedia.org/wiki/Jaws_(film) - fetched successfully.
Output from node 'search_evaluator_agent':
---
{'messages': [HumanMessage(content='# **Jaws: A Deep Dive into the Iconic Film**\n\n## markdown summary here... ', name='search_evaluator_agent')]}

---

Output from node 'save_file':
---
{'messages': [HumanMessage(content='Output written successfully to c:\\Coding_Vault\\FINX_LANGGRAPH_TUTS\\output/d22855f8-9f76-4fc6-8192-7490852e1644.md', name='save_file')]}

---

Output from node '__end__':
---
{'messages': ['The whole state object...']}

---</pre>



<p>Go ahead and open the <code>.md</code> file that was created in the <code>output</code> folder and you should see the markdown article that was written by the research agent:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-10-691x1024.png" alt="" class="wp-image-4391"/></figure>



<p>I&#8217;ve gone ahead and tried another one inputting the topic &#8220;Pokemon&#8221;:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/04/image-1-3-597x1024.png" alt="" class="wp-image-4392"/></figure>



<p>There you go! We&#8217;ve created a pretty fast and very useful internet research and article-writing tool!</p>



<p>From here on we can create PDF files, send emails, write articles, or do anything and everything we want really. We can tweak the output or the number of input URLs, or use gpt-4-turbo if we want a very long output article and large input context window so we can use even more sources.</p>



<p>We can add any conditional edges and paths and have the agents do whatever we want! What we&#8217;ve shown are just the basic ways in which you can combine the building blocks. You now have all the knowledge you need to build whatever you want. I&#8217;ll leave the rest up to your imagination.</p>



<p>It&#8217;s been a pleasure to take this journey together. I hope you learned a lot and had some fun along the way. I&#8217;ll see you again soon in the next one. Until then, happy coding!</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: This course is a complete text tutorial. It&#8217;s based on our academy course. If you&#8217;re interested in video explainers, <a href="https://academy.finxter.com/university/multi-agent-systems-langgraph/">check out the course here</a>.</p>



<p>The post <a href="https://blog.finxter.com/building-complex-multi-agent-teams-and-setups-with-langgraph/">Building Complex Multi-Agent Teams and Setups with LangGraph</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>[Full Tutorial] OpenAI Fine-Tuning: Creating a Chatbot of Yourself (Example: ChrisGPT)</title>
		<link>https://blog.finxter.com/openai-fine-tuning/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Sat, 06 Apr 2024 07:43:04 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1669992</guid>

					<description><![CDATA[<p>Hi and welcome to this course on fine-tuning ChatGPT with the OpenAI API. In this course we&#8217;ll not just look at how to deal with the fine-tuning API endpoints itself, but also how to run the fine-tuned model, and most importantly, the data preparation and validation steps so that you will be able to do ... <a title="[Full Tutorial] OpenAI Fine-Tuning: Creating a Chatbot of Yourself (Example: ChrisGPT)" class="read-more" href="https://blog.finxter.com/openai-fine-tuning/" aria-label="Read more about [Full Tutorial] OpenAI Fine-Tuning: Creating a Chatbot of Yourself (Example: ChrisGPT)">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/openai-fine-tuning/">[Full Tutorial] OpenAI Fine-Tuning: Creating a Chatbot of Yourself (Example: ChrisGPT)</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Hi and welcome to this course on fine-tuning ChatGPT with the OpenAI API. In this course we&#8217;ll not just look at how to deal with the fine-tuning API endpoints themselves, but also at how to run the fine-tuned model and, most importantly, at the data preparation and validation steps, so that you will be able to do this yourself with any data you want when you are done.</p>



<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Tip</strong>: This is a full-text tutorial on how to fine-tune ChatGPT using the OpenAI API with code! For a video guide-through, check out our premium course &#8220;<a href="https://academy.finxter.com/university/fine-tuning-openai-course/">Fine-Tuning OpenAI: How to Create and Scale Your Digital Self (Example: ChrisGPT)</a>&#8220;</p>



<p>For our fine-tuning example, we have a really fun one here. We are going to clone Chris! Yes, you read that correctly<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f609.png" alt="😉" class="wp-smiley" style="height: 1em; max-height: 1em;" />. We will use the Finxter emails Chris always sends us to train our own ChrisGPT! ChrisGPT is going to learn how to write emails in the distinctive Finxter style, on any topic we choose!</p>



<ul class="wp-block-list">
<li>In part 1 we&#8217;ll start by looking at what fine-tuning is and why we would want to do it, and look at the type of data we will need in order to fine-tune our own ChatGPT models, getting started on our data preparation.</li>



<li>Part 2 will have us really dive into the data preparation, also using ChatGPT to make our work a bit easier, creating a perfectly formatted JSON dataset for training ChrisGPT.</li>



<li>In part 3 we&#8217;ll look at flattening our data into JSONL format, and then data validation to make sure we get it right the first time. We&#8217;ll also calculate the cost so that we know exactly what we&#8217;re getting into.</li>



<li>Part 4 is all about the fine-tuning API and how to use it. We&#8217;ll train ChrisGPT and then have a lot of fun having our own Robo<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f916.png" alt="🤖" class="wp-smiley" style="height: 1em; max-height: 1em;" />Chris<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f9d1-1f3fc.png" alt="🧑🏼" class="wp-smiley" style="height: 1em; max-height: 1em;" /> write our emails about anything we want!</li>
</ul>



<p>I hope you&#8217;re as excited as I am to get started. Let&#8217;s dive in!</p>



<h2 class="wp-block-heading">Part 1: Fine-tuning Our Own ChatGPT Model</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img fetchpriority="high" decoding="async" width="788" height="788" src="https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_painting_of_a_programmer_with_hoodie_in_the_style_o_302b72c1-5867-44cf-b525-4cf1b631bf5b.webp" alt="" class="wp-image-1669706" srcset="https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_painting_of_a_programmer_with_hoodie_in_the_style_o_302b72c1-5867-44cf-b525-4cf1b631bf5b.webp 788w, https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_painting_of_a_programmer_with_hoodie_in_the_style_o_302b72c1-5867-44cf-b525-4cf1b631bf5b-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_painting_of_a_programmer_with_hoodie_in_the_style_o_302b72c1-5867-44cf-b525-4cf1b631bf5b-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_painting_of_a_programmer_with_hoodie_in_the_style_o_302b72c1-5867-44cf-b525-4cf1b631bf5b-768x768.webp 768w" sizes="(max-width: 788px) 100vw, 788px" /></figure>
</div>


<p>Welcome to this course on ChatGPT fine-tuning. My name is Dirk van Meerveld and together we will be taking a look at fine-tuning ChatGPT to make our own custom versions of the popular LLM. Before we dive in we&#8217;ll take a look at what fine-tuning entails and when we should use it.</p>



<h3 class="wp-block-heading">Why fine-tuning?</h3>



<p>First of all, let&#8217;s take a moment to consider how we usually get ChatGPT to do what we want. We tell it, using a prompt message, right? Basically, we tell it what we want it to do, and we probably give it some examples as well if the task has any complexity to it. This is called &#8220;few-shot learning&#8221;, as we give a couple of demonstrations of how to perform the task.</p>



<p>So usually prompt engineering will get ChatGPT to do whatever we want and there&#8217;s not really any problem with that right? But what if the problem we want ChatGPT to solve is a bit more complex, and would require hundreds of reasonably sized examples? There are several use cases for this, but we&#8217;ll start with an example on brand identity.</p>



<p>Say that your company brand has a certain style and tone of communication, which is different from the default ChatGPT way of speaking. You are probably not a famous person, so you can&#8217;t just query GPT to write &#8220;In the style of Elon Musk&#8221; or &#8220;In the style of Barack Obama&#8221;. ChatGPT doesn&#8217;t know who you are!</p>



<p>So what do you do? Use the very largest GPT-4-turbo model with the largest context limit and just send 100 pages full of examples of your brand&#8217;s style of communication in the prompt setup message every single time? This will not work very well for several reasons:</p>



<ul class="wp-block-list">
<li>Cost -&gt; Sending that much information with every GPT call, especially when using the most expensive GPT-4 model, will be very expensive if you scale it up.</li>



<li>Latency -&gt; Your calls will not only be expensive but also slow in comparison, as the amount of data sent and processed is very large.</li>



<li>Quality -&gt; The normal model will have trouble learning an entire way of speaking, including the tone and nuances, from just a single system prompt setup message, even if it is very long. The input text is just a prompt, and this style of speaking will not get &#8216;embedded into the neurons&#8217; of the model, so to speak.</li>
</ul>



<p>This is where fine-tuning comes to the rescue. Basically, OpenAI will give us a vanilla GPT model in a separate container. We then get to provide extra training data of our own, and OpenAI will further train the GPT model on the data we provide, creating our own custom fine-tuned version of ChatGPT.</p>



<p>We feed it a large number of examples of our brand&#8217;s style of communication. This way we won&#8217;t have to send a million tokens of context with every call and can just query our custom-trained model, which has our brand&#8217;s style of communication embedded into its very neurons!</p>



<p>I think you can see how this would be extremely helpful in many areas. A content creator may want some help writing initial drafts or ideas for new work but needs them to adhere to their own writing style. A large brand may want to employ customer service bots, as most do these days, but needs them to adhere to the brand&#8217;s style and rules for communication, just like the human employees. Anyone with any kind of writing or speaking style may want some assistance from ChatGPT, but in their own style and form of speech.</p>



<h3 class="wp-block-heading">Let&#8217;s clone Chris!</h3>



<p>To explore this idea and show you how to implement this for yourself or your clients using example data of their writing, we will be using an example most of you will know: Chris! If you&#8217;re taking this Finxter course you have probably received a fair number of his Finxter emails, so his writing should be familiar. Today we are going to make ChrisGPT, a model that has been fine-tuned on Chris&#8217; writing style.</p>



<p>I&#8217;ve chosen Chris as an example for several reasons:</p>



<ol class="wp-block-list">
<li>Most of you are probably familiar with him from the emails.</li>



<li>He&#8217;s not so famous that we could just query GPT to write &#8220;In the style of Chris&#8221; and get a good result. This makes it a realistic example of doing this for yourself or a client.</li>



<li>He has a distinct style of writing that we will be able to differentiate from the default ChatGPT style.</li>



<li>I have loads of data for him in the form of the Finxter emails on my email account.</li>



<li>He has agreed to let us do this (thanks Chris! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f609.png" alt="😉" class="wp-smiley" style="height: 1em; max-height: 1em;" />).</li>
</ol>



<p>Of course, Finxter emails from my Gmail inbox aren&#8217;t perfectly clean ideal-world data examples, but they will be good enough to give you an idea of how fine-tuning works and how you can use it to implement a brand identity or writing style into ChatGPT with the proper data. They will also add some fun to our exploration and learning experience!</p>



<p>Now I could just give you the perfectly formatted data so that you can send it straight to the OpenAI API for fine-tuning and we could have the whole thing over and done with in a single video. But if I did that I&#8217;d be doing you a huge disservice. The fine-tuning API itself is very simple and doesn&#8217;t take much time to learn at all, but it&#8217;s only a small part of the picture.</p>



<p>The real challenge is getting your input data, and getting the input data right. There is a real process to getting the right training data, making sure you formulate it right and avoid the pitfalls, and then getting all the training data into properly formulated examples. The fine-tuning part itself is the easy part. We will look at the nitty-gritty details of getting the data right, as our GPT will only ever be as good as our training data.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Garbage in = Garbage out.</pre>



<h3 class="wp-block-heading">So what kind of data do we need?</h3>



<p>Well, basically we need to show ChatGPT what we want, by giving it examples. These examples need to be in the same format as normal ChatGPT chat completions, which makes total sense, as we&#8217;re trying to teach it what kind of chat completions we want it to create right? So the basic idea is something like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[
  {
    "messages": [
      {
        "role": "system",
        "content": "You are a Harry Potter wizard spell creator who comes up with new magical spells! Alohamora!"
      },
      { "role": "user", "content": "How can I bake a cookie?" },
      {
        "role": "assistant",
        "content": "Confectio Deliciosum! This spell will help you bake cookies in no time!"
      }
    ]
  },
  {
    "messages": [
      {
        "role": "system",
        "content": "You are a Harry Potter wizard spell creator who comes up with new magical spells! Alohamora!"
      },
      { "role": "user", "content": "I need to fend off a shark!" },
      {
        "role": "assistant",
        "content": "Squalus Repellum! This spell will help you fend off a shark!"
      }
    ]
  },
  {
    "messages": [
      {
        "role": "system",
        "content": "You are a Harry Potter wizard spell creator who comes up with new magical spells! Alohamora!"
      },
      { "role": "user", "content": "I need to lose some weight fast." },
      {
        "role": "assistant",
        "content": "Adiposum Reductum! Your belly fat will just fly out the window!"
      }
    ]
  }
]</pre>



<p>Notice that we have a list of objects, which look like the <code>messages</code> list we would typically work with. We are basically simulating conversations with ChatGPT that never occurred, so we can use these as training data. For each example we have a system message, then we have the user query in the second slot, and the assistant&#8217;s response in the third slot.</p>



<p>Now notice that the entire system message is sent and is exactly the same for all three examples. You might feel that this is wasteful, as we&#8217;re repeating the same thing over and over again, but remember that we&#8217;re trying to sort of hard-wire these instructions into ChatGPT, so this is a good thing.</p>



<p>If you try to leave out the system messages to save tokens, the model will have to learn entirely through demonstration of the outcome, and your training may actually take longer. We recommend just leaving that system message in there for every example, and making sure it&#8217;s a good one, because it is going to get baked into the model!</p>



<p>The second entry, the user query, is obviously going to be different each time. Make sure you include examples that match the way you intend to use your final fine-tuned model. In particular, include any edge cases and harder-than-usual examples: the training phase is the time to show the model what it will be up against.</p>



<p>The third entry, the assistant&#8217;s response, is going to be the exact perfect answer that we want ChatGPT to give for this query. In effect, ChatGPT is trained to learn: &#8220;given this system message and this query, this is the response I should give.&#8221;</p>



<p>Note the example above is of course useless, as we could easily achieve this output without any fine-tuning at all from basically any LLM in existence. It is just an example of the training data structure. In reality, we need at least 10 examples for fine-tuning, but you should probably aim for at the very least 50 well-crafted examples if not more.</p>
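
<p>The structure described above is simple enough to check mechanically. The sketch below is one possible sanity check, not an official validator: <code>find_format_problems</code> and its constants are hypothetical names, and it only tests the rules mentioned so far (a system, user and assistant message per example, non-empty text, and at least 10 examples).</p>

```python
EXPECTED_ROLES = ["system", "user", "assistant"]
MIN_EXAMPLES = 10  # the minimum number of examples mentioned above


def find_format_problems(examples):
    """Return a list of human-readable problems; an empty list means the data looks fine."""
    problems = []
    if len(examples) < MIN_EXAMPLES:
        problems.append(f"only {len(examples)} examples, need at least {MIN_EXAMPLES}")
    for index, example in enumerate(examples):
        messages = example.get("messages", [])
        roles = [message.get("role") for message in messages]
        if roles != EXPECTED_ROLES:
            problems.append(f"example {index}: roles are {roles}, expected {EXPECTED_ROLES}")
        elif not all(
            isinstance(message.get("content"), str) and message["content"].strip()
            for message in messages
        ):
            problems.append(f"example {index}: every message needs non-empty text content")
    return problems
```

<p>Running a check like this before uploading anything is cheap, and it catches the most common structural mistakes early.</p>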



<p>Also, the final data needs to be in JSONL format, with every object flattened down onto a single very long line, which looks kind of like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{"messages": [{system...}, {user...}, {assistant...}]}
{"messages": [{system...}, {user...}, {assistant...}]}
{"messages": [{system...}, {user...}, {assistant...}]}</pre>



<p>But this is only a minor and easy conversion, so we&#8217;ll get back to that later.</p>
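
<p>As a rough preview of that conversion, here is a minimal sketch using only the standard library. The function name <code>to_jsonl</code> is a hypothetical choice for this illustration, and the approach shown is just one straightforward way to do the flattening.</p>

```python
import json


def to_jsonl(examples):
    """Flatten a list of training examples into JSONL: one JSON object per line."""
    # json.dumps without an indent never emits raw newlines, so each example
    # ends up on exactly one line. ensure_ascii=False keeps emoji readable.
    lines = [json.dumps(example, ensure_ascii=False) for example in examples]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [
        {
            "messages": [
                {"role": "system", "content": "You are a spell creator."},
                {"role": "user", "content": "I need to fend off a shark!"},
                {"role": "assistant", "content": "Squalus Repellum!"},
            ]
        }
    ]
    print(to_jsonl(sample), end="")
```

<p>Newlines inside the message text are not a problem, as <code>json.dumps</code> escapes them as <code>\n</code> within the string.</p>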



<p>As for the length, each training example is limited to the context length of the model. So every single line of the JSONL data can be up to the context limit, which for gpt-3.5-turbo-1106 is 16,385 tokens. As this is a very high amount, we&#8217;re not going to worry about it too much for our use cases here, as we&#8217;re not going to be going over this limit.</p>



<p>Now we&#8217;ll be using gpt-3.5-turbo-1106 here as it is the newest version of the model that has fine-tuning support so far. This is probably a good thing though as fine-tuning on GPT-4 would be a lot more expensive and as we&#8217;ll be showing the model exactly what we want it to do anyway, we won&#8217;t really need GPT-4&#8217;s extra capabilities.</p>



<h3 class="wp-block-heading">The data</h3>



<p>So, I&#8217;ve gone through my email account and extracted a whole bunch of emails I have received from Chris at Finxter, the last 200 to be precise. This very first step, I have done for you, as I obviously cannot give you all access to my personal email inbox! But I will still cover roughly the steps taken:</p>



<ol class="wp-block-list">
<li>I&#8217;ve applied a label to all the emails I wanted to extract from my inbox, so I could easily find them.</li>



<li>I went to <a href="https://takeout.google.com/">Google Takeout</a> and requested a download of all my emails with that label.</li>



<li>I received a file with all my emails in MBOX format.</li>



<li>I wrote a Python script, <code>mbox_to_json_decode.py</code>, which takes the emails, decodes them, takes all my personal unsubscribe links and other personal data out, and then writes them to a JSON file.</li>
</ol>



<p>As this MBOX to JSON conversion is hyper-specific, and the MBOX file contains some of my personal data, this is the only step along the way we will skip, as the chances that you will also have to convert MBOX files to JSON are very slim and I want to keep this course relevant. If you do need information on MBOX to JSON conversion, I will add the <code>mbox_to_json_decode</code> script in the GitHub repository so you can check it out if you need to.</p>



<p>So now we are left with <code>Finx_dataset.json</code>, which will be our entry point for this tutorial. Normally I would include this file in the GitHub repository, but as it has a large amount of the Finxter email data, we have elected to not include it in the repository. Instead, the file will be available for download from the course page in the Finxter Academy. If you haven&#8217;t downloaded it yet, please do so now.</p>



<p>Then go ahead and create a base project folder to use for this course, I&#8217;ve named mine <code>Finx_Fine_Tuning</code>, and then create a folder named <code>data</code> inside of it. Then move the <code>Finx_dataset.json</code> file into the <code>data</code> folder to create the following structure:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json</pre>



<h3 class="wp-block-heading">Create a venv in the root project folder</h3>



<p>Ok, just a small detour before we continue with our project!</p>



<p>We&#8217;ll be running this project inside a virtual environment. A virtual environment is a self-contained directory that will allow us to install specific versions of packages inside the virtual environment without affecting the global Python installation.</p>



<p>We will use this as I will be using specific versions for the libraries we install as we go along, and I want to make sure that you have the exact same experience as I do. The virtual environment will make it easy for you to install my exact versions without worrying about affecting any of your other projects.</p>



<p>To create a new virtual environment we&#8217;ll use a tool called <code>pipenv</code>. If you don&#8217;t have <code>pipenv</code> installed, you can install it using pip, which is Python&#8217;s package manager. Run the following command in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install pipenv</pre>



<p>Make sure the terminal is inside your root project folder, e.g. <code>/c/Coding_Vault/Finx_Fine_Tuning</code>, and then run the following command to create a new virtual environment:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv shell</pre>



<p>This will create a new virtual environment and also a <code>Pipfile</code> in your project directory. Any packages you install using <code>pipenv install</code> will be added to the <code>Pipfile</code>.</p>



<p>To generate a <code>Pipfile.lock</code>, which is used to produce deterministic builds, run:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv lock</pre>



<p>This will create a <code>Pipfile.lock</code> in your project directory, which contains the exact version of each dependency to ensure that future installs are able to replicate the same environment.</p>



<p>We don&#8217;t need to install a library first to create a <code>Pipfile.lock</code>. From now on, when we install a library in this virtual environment with <code>pipenv install library_name</code>, it will be added to both the <code>Pipfile</code> and the <code>Pipfile.lock</code>.</p>



<h3 class="wp-block-heading">Back to our data</h3>



<p>Back to where we were. Our root project folder should now look like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>Let&#8217;s go ahead and take a look at the <code>Finx_dataset.json</code> file we downloaded earlier to see what kind of raw data we are working with here:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[
  {
    "subject": "5 Proxies to Investing in OpenAI",
    "body": "&lt;html>\n&lt;head>\n\t&lt;title>&lt;/title>\n&lt;/head>\n&lt;body data-gr-ext-installed=\"\" data-new-gr-c-s-check-loaded=\"8.909.0\" data-new-gr-c-s-loaded=\"8.909.0\" style=\"font-family:Arial;font-size:16px;\">\n&lt;p style=\"text-align: center;\">&lt;a href=\"{Link}\" target=\"_blank\">&lt;img alt=\"\" height=\"39\" src=\"{Link}\" width=\"153\" />&lt;/a>&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>Hey {User},&lt;/p>\n\n&lt;p>To profit from change, we need to increase ownership of disruptive trends. Today's article covers a question that many Finxters frequently ask:&lt;/p>\n\n&lt;p>\ud83e\udeb4 [&lt;strong>Blog&lt;/strong>] &lt;a href=\"{Link}\">How to Invest in OpenAI?&lt;/a> \ud83c\udf33&lt;/p>\n\n&lt;p>While it's not possible to invest in OpenAI directly, the blog discusses five alternatives:&lt;/p>\n\n&lt;ul>\n\t&lt;li>&lt;strong>MSFT &lt;/strong>(49% stake in OpenAI),&lt;/li>\n\t&lt;li>&lt;strong>NVIDIA &lt;/strong>(makes more revenue from OpenAI than any other company),&lt;/li>\n\t&lt;li>&lt;strong>ARKVX &lt;/strong>(&lt;em>Anthropic!&lt;/em>),&lt;/li>\n\t&lt;li>&lt;strong>META &lt;/strong>(&lt;em>Llama 2!&lt;/em>), and&lt;/li>\n\t&lt;li>&lt;strong>TSLA &lt;/strong>(Optimus!).&lt;/li>\n&lt;/ul>\n\n&lt;p>Check it out if you're interested in any of those! No financial advice. \ud83d\ude0a&lt;/p>\n\n&lt;p>Be on the right side of change. \ud83d\ude80&lt;br />\nChris&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>&lt;strong>\u2665\ufe0f Community Corner: Featured Resources&lt;/strong>&lt;/p>\n\n&lt;ul>\n\t&lt;li>&lt;a href=\"{Link}\">TradeUnafraid&lt;/a> is a trading platform owned and operated by Finxter community member Lee.&lt;/li>\n&lt;/ul>\n\n&lt;p>Do you want to feature your own startup, YouTube channel, blog, or website as a &lt;a href=\"{Link}\">Finxter premium member&lt;/a>? 
Hit reply and let me know!&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;div style=\"background:#eeeeee;border:1px solid #fcfcfc;padding:20px 20px;\">\n&lt;p>&lt;span>&lt;strong>&lt;a href=\"{Link}\">How are we doing?&lt;/a>&lt;/strong>&lt;br />\n&lt;a href=\"{Link}\">\u2b50&lt;/a>&lt;br />\n&lt;a href=\"{Link}\">\u2b50\u2b50&lt;/a>&lt;br />\n&lt;a href=\"{Link}\">\u2b50\u2b50\u2b50&lt;/a>&lt;br />\n&lt;br />\nTo make sure you keep getting these emails, please add &lt;em>chris@finxter.com&lt;/em> to your address book.&lt;br />\n&lt;br />\nI'd love to hear your feedback so that I can improve this free email course over time. Please reply to this email and share everything on your mind!&lt;br />\n&lt;br />\n&lt;strong>If you find the Finxter Email Academy useful, please invite a friend or colleague! \u2764&lt;/strong>&lt;/span>&lt;/p>\n\n&lt;p>&lt;br />\n&lt;span>Here's the subscription link you can share:&lt;br />\n&lt;a href=\"{Link}\" target=\"_blank\">https://blog.finxter.com/subscribe/&lt;/a>&lt;br />\n&lt;br />\nDownload the Ultimate Python Cheat Sheet here (direct PDF download): \ud83d\udc0d&lt;/span>&lt;/p>\n\n&lt;p>&lt;span>&lt;strong>&lt;a href=\"{Link}\" target=\"_blank\">The Ultimate Python Cheat Sheet&lt;/a>&lt;/strong>&lt;br />\n&lt;br />\nNot very motivated to learn today? Consider this:&lt;br />\n&lt;strong>&lt;em>\"Knowledge compounds!\"&lt;/em>&lt;/strong> -- Warren Buffett&lt;br />\n&lt;br />\nConsequently, if you improve your skills by 1% every day, you'll 36x your programming skills within a year!&lt;/span>&lt;/p>\n&lt;/div>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>&lt;br />\n&lt;em>&lt;strong>&lt;span>Finxter, Dr. Christian Mayer&lt;/span>&lt;/strong>&lt;br />\n&lt;span>{Address}., {City}, {Country}&lt;/span>&lt;/em>&lt;/p>\n\n&lt;p>&lt;span>Want out of the loop? I'm so sad to see you go. \ud83d\ude22 How could we have done better? 
&lt;/span>&lt;br />\n&lt;span>To help future Finxters, please hit reply and tell us! \ud83e\udd17&lt;/span>&lt;/p>\n&lt;a href=\"{Link}\" >Unsubscribe here&lt;/a>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n&lt;/body>\n&lt;/html>\n&lt;img src=\"{Link}\" alt=\"\" style=\"width:1px;height:1px;\"/>\n"
  },
  {
    "subject": "Tech Deflation vs Inflation",
    "body": "Email no2..."
  }</pre>



<p>As you can see, we have a list of objects, each with a <code>subject</code> and <code>body</code> key. The <code>body</code> key contains the raw HTML of the email, which we will need to clean up a bit before using it for our purposes. The only preprocessing I&#8217;ve done in the MBOX to JSON conversion is replacing links and personal data with generic placeholders such as <code>{Link}</code> and <code>{User}</code>.</p>
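
<p>To make the structure concrete, here is a minimal sketch of reading such a list in Python. The inline <code>sample_json</code> string is a shortened, made-up stand-in for the real dataset file; it assumes only the <code>subject</code> and <code>body</code> keys described above:</p>

```python
import json

# Shortened stand-in for the dataset: a JSON list of email objects.
sample_json = """
[
  {"subject": "Email no1", "body": "Email body no1..."},
  {"subject": "Tech Deflation vs Inflation", "body": "Email no2..."}
]
"""

emails = json.loads(sample_json)

# Each entry is a plain dict with a "subject" and a "body" key.
subjects = [email["subject"] for email in emails]
first_body = emails[0]["body"]
```

<p>Loading the real file would work the same way, only with <code>json.load</code> on an open file handle instead of <code>json.loads</code> on a string.</p>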



<p>If you&#8217;re wondering what the <code>\uxxxx</code> sequences are, like <code>\ud83d\udc0d</code>, they are Unicode escape sequences. JSON writes characters outside the Basic Multilingual Plane as a pair of 16-bit escapes (a UTF-16 surrogate pair), so these two escapes together represent a single character: the &#8220;snake&#8221; emoji (<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f40d.png" alt="🐍" class="wp-smiley" style="height: 1em; max-height: 1em;" />). You will see these quite a lot, as Chris is of course famous for his creative emoji usage!</p>
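
<p>As a small illustration, the sketch below decodes such an escape sequence with Python&#8217;s standard <code>json</code> module. The variable names are made up for this example; the doubled backslashes are only there so that the Python string contains the literal text <code>\ud83d\udc0d</code>, just as it appears in the JSON file:</p>

```python
import json

# A JSON document holding one string, written with two \u escapes.
escaped_json = '"\\ud83d\\udc0d"'

# The JSON decoder combines the two escapes into one character.
decoded = json.loads(escaped_json)

is_snake = decoded == "\U0001F40D"  # U+1F40D is the snake emoji
decoded_length = len(decoded)
```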



<p>The full list has about 200 of these email objects, in non-chronological order. If you scroll through the data, you will see some noise in there, which will be reflected in our final product. That is good enough for this tutorial, but for professional use you&#8217;d want to spend more time cleaning up the data thoroughly.</p>



<h3 class="wp-block-heading">Preparing our data</h3>



<p>We now have our basic data, and we know what kind of format we need for the training data, like the Harry Potter magical spells example we showed. Now let&#8217;s start wrangling the data into the format we need. As with all complex coding tasks, let&#8217;s take it one step at a time and build our solution in small, reusable parts.</p>



<p>We&#8217;ll start with a utility to convert the email above into a simpler, more readable format. Instead of the HTML above, with its unreadable emoji escapes and HTML tags all over the place, we want a utility function that takes an HTML email as input and returns a simple, readable markdown version for us to work with.</p>



<p>So go ahead and create a new folder named <code>utils</code> in the root project folder, and then create a new file named <code>html_email.py</code> inside the utils folder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />html_email.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>Now before we get started on the <code>html_email.py</code> file, we&#8217;ll need to install a library called <code>html2text</code> which will help us convert the HTML emails to markdown. Someone has already written a library to do this for us, so we don&#8217;t have to write it ourselves. Always use existing solutions when they exist to speed up your development cycle!</p>



<p>To install a specific version of a package in our <code>Pipenv</code> environment, you can use the <code>pipenv install</code> command followed by the package name and the version number. Run the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install html2text==2020.1.16</pre>



<p>This command will add <code>html2text</code> to our <code>Pipfile</code> under the <code>[packages]</code> section with the specified version. It will also update your <code>Pipfile.lock</code> to include the exact version of <code>html2text</code> and its dependencies.</p>



<p>Now let&#8217;s go ahead and open the <code>html_email.py</code> file and add the following code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import html2text

def html_to_markdown(html: str) -> str:
    html = html.encode("utf-16", "surrogatepass").decode("utf-16")

    html_to_text_converter = html2text.HTML2Text()
    html_to_text_converter.ignore_links = False
    return html_to_text_converter.handle(html)</pre>



<p>We first import the library we have just installed. Then we define a function <code>html_to_markdown</code> which takes an HTML string as input and returns a markdown string.</p>



<p>We then take the <code>html</code> variable, which is a string, and merge any surrogate pairs in it into the single characters they represent. When Python reads a literal such as <code>\ud83d\ude80</code>, it stores two separate surrogate code points rather than one emoji. The <code>"surrogatepass"</code> error handler tells Python to let those surrogates through when encoding instead of raising an error, so that after running this line they have been turned into the corresponding emoji character (in this case, the rocket emoji <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" />).</p>



<p>This works because the <code>.encode</code> method converts the string to bytes using UTF-16 encoding, writing each surrogate out as a 16-bit code unit. The <code>.decode</code> method then reads those bytes back, and because two adjacent surrogate code units are exactly how UTF-16 represents a single character outside the Basic Multilingual Plane, each pair comes back as one real Unicode character. So we basically did a round-trip conversion from surrogate pairs to actual Unicode characters.</p>
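
<p>If you want to see the effect in isolation, the following minimal sketch runs the same round trip on just the rocket sequence. The variable names are made up for this example:</p>

```python
# In Python source, "\ud83d\ude80" is parsed as two lone surrogate code points.
raw = "\ud83d\ude80"

# Round trip through UTF-16; "surrogatepass" lets the lone surrogates be encoded.
fixed = raw.encode("utf-16", "surrogatepass").decode("utf-16")

raw_length = len(raw)      # number of code points before the round trip
fixed_length = len(fixed)  # number of code points afterwards
is_rocket = fixed == "\U0001F680"  # U+1F680 is the rocket emoji
```

<p>The string shrinks from two code points to one, which is why the emoji can afterwards be written to a UTF-8 file without errors.</p>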



<p>We then create an instance of the <code>HTML2Text</code> class and set the <code>ignore_links</code> attribute to <code>False</code> to include links in the output. We then call the <code>handle</code> method of the <code>HTML2Text</code> instance and pass the HTML string as an argument to convert it to markdown, and simply return the result.</p>



<h3 class="wp-block-heading">Let&#8217;s test it out</h3>



<p>Let&#8217;s go ahead and give it a test run. Above the <code>html_to_markdown</code> function, add the following variable holding a test email string:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">test_email = '&lt;html>\n&lt;head>\n\t&lt;title>&lt;/title>\n&lt;/head>\n&lt;body data-gr-ext-installed="" data-new-gr-c-s-check-loaded="8.909.0" data-new-gr-c-s-loaded="8.909.0" style="font-family:Arial;font-size:16px;">\n&lt;p style="text-align: center;">&lt;a href="{Link}" target="_blank">&lt;img alt="" height="39" src="{Link}" width="153" />&lt;/a>&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>Hey {User},&lt;/p>\n\n&lt;p>To profit from change, we need to increase ownership of disruptive trends. Today\'s article covers a question that many Finxters frequently ask:&lt;/p>\n\n&lt;p>\ud83e\udeb4 [&lt;strong>Blog&lt;/strong>] &lt;a href="{Link}">How to Invest in OpenAI?&lt;/a> \ud83c\udf33&lt;/p>\n\n&lt;p>While it\'s not possible to invest in OpenAI directly, the blog discusses five alternatives:&lt;/p>\n\n&lt;ul>\n\t&lt;li>&lt;strong>MSFT &lt;/strong>(49% stake in OpenAI),&lt;/li>\n\t&lt;li>&lt;strong>NVIDIA &lt;/strong>(makes more revenue from OpenAI than any other company),&lt;/li>\n\t&lt;li>&lt;strong>ARKVX &lt;/strong>(&lt;em>Anthropic!&lt;/em>),&lt;/li>\n\t&lt;li>&lt;strong>META &lt;/strong>(&lt;em>Llama 2!&lt;/em>), and&lt;/li>\n\t&lt;li>&lt;strong>TSLA &lt;/strong>(Optimus!).&lt;/li>\n&lt;/ul>\n\n&lt;p>Check it out if you\'re interested in any of those! No financial advice. \ud83d\ude0a&lt;/p>\n\n&lt;p>Be on the right side of change. 
\ud83d\ude80&lt;br />\nChris&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>&lt;strong>\u2665\ufe0f Community Corner: Featured Resources&lt;/strong>&lt;/p>\n\n&lt;ul>\n\t&lt;li>&lt;a href="{Link}">TradeUnafraid&lt;/a> is a trading platform owned and operated by Finxter community member Lee.&lt;/li>\n&lt;/ul>\n\n&lt;p>Do you want to feature your own startup, YouTube channel, blog, or website as a &lt;a href="{Link}">Finxter premium member&lt;/a>? Hit reply and let me know!&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;div style="background:#eeeeee;border:1px solid #fcfcfc;padding:20px 20px;">\n&lt;p>&lt;span>&lt;strong>&lt;a href="{Link}">How are we doing?&lt;/a>&lt;/strong>&lt;br />\n&lt;a href="{Link}">\u2b50&lt;/a>&lt;br />\n&lt;a href="{Link}">\u2b50\u2b50&lt;/a>&lt;br />\n&lt;a href="{Link}">\u2b50\u2b50\u2b50&lt;/a>&lt;br />\n&lt;br />\nTo make sure you keep getting these emails, please add &lt;em>chris@finxter.com&lt;/em> to your address book.&lt;br />\n&lt;br />\nI\'d love to hear your feedback so that I can improve this free email course over time. Please reply to this email and share everything on your mind!&lt;br />\n&lt;br />\n&lt;strong>If you find the Finxter Email Academy useful, please invite a friend or colleague! \u2764&lt;/strong>&lt;/span>&lt;/p>\n\n&lt;p>&lt;br />\n&lt;span>Here\'s the subscription link you can share:&lt;br />\n&lt;a href="{Link}" target="_blank">https://blog.finxter.com/subscribe/&lt;/a>&lt;br />\n&lt;br />\nDownload the Ultimate Python Cheat Sheet here (direct PDF download): \ud83d\udc0d&lt;/span>&lt;/p>\n\n&lt;p>&lt;span>&lt;strong>&lt;a href="{Link}" target="_blank">The Ultimate Python Cheat Sheet&lt;/a>&lt;/strong>&lt;br />\n&lt;br />\nNot very motivated to learn today? 
Consider this:&lt;br />\n&lt;strong>&lt;em>"Knowledge compounds!"&lt;/em>&lt;/strong> -- Warren Buffett&lt;br />\n&lt;br />\nConsequently, if you improve your skills by 1% every day, you\'ll 36x your programming skills within a year!&lt;/span>&lt;/p>\n&lt;/div>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>&lt;br />\n&lt;em>&lt;strong>&lt;span>Finxter, Dr. Christian Mayer&lt;/span>&lt;/strong>&lt;br />\n&lt;span>{Address}., {City}, {Country}&lt;/span>&lt;/em>&lt;/p>\n\n&lt;p>&lt;span>Want out of the loop? I\'m so sad to see you go. \ud83d\ude22 How could we have done better? &lt;/span>&lt;br />\n&lt;span>To help future Finxters, please hit reply and tell us! \ud83e\udd17&lt;/span>&lt;/p>\n&lt;a href="{Link}" >Unsubscribe here&lt;/a>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n\n&lt;p>\u00a0&lt;/p>\n&lt;/body>\n&lt;/html>\n&lt;img src="{Link}" alt="" style="width:1px;height:1px;"/>\n'</pre>



<p>Copy the string exactly as shown, and make sure you insert it above the function we wrote:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import html2text

test_email = ...

def html_to_markdown(html: str) -> str:
    ...</pre>



<p>Now, below the <code>html_to_markdown</code> function, add the following code to test the function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">if __name__ == "__main__":
    markdown_content = html_to_markdown(test_email)

    with open("test.md", "w", encoding="utf-8") as file:
        file.write(markdown_content)</pre>



<p>This code will run the <code>html_to_markdown</code> function with the <code>test_email</code> string as input, and then write the result to a file named <code>test.md</code>. The <code>if __name__ == "__main__":</code> line ensures that the code inside the block only runs when the script is executed directly, and not when we import the <code>html_to_markdown</code> function into another script later on.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Python Top-tip <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" />
In Python, when a script is run, a special built-in variable called __name__ is set to "__main__". However, if a module is imported, __name__ is set to the module's name instead. By checking if __name__ == "__main__":, the script can determine whether it's being run directly or being imported as a module.

This allows for a flexible way to organize your code. You can put code that tests the functionality of the module or demonstrates how to use the module under this if statement. When the module is imported, this code won't run, but when the script is run directly, the code will execute. This is particularly useful for unit testing or for scripts that can be used both as utility modules and as standalone programs.</pre>



<p>Now go ahead and run the script, and a new file named <code>test.md</code> will be created in the folder you ran it from. If you open it, you will see the markdown version of the email we provided as input.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[![]({Link})]({Link})

Hey {User},

To profit from change, we need to increase ownership of disruptive trends.
Today's article covers a question that many Finxters frequently ask:

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1fab4.png" alt="🪴" class="wp-smiley" style="height: 1em; max-height: 1em;" /> [ **Blog** ] [How to Invest in OpenAI?]({Link}) <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f333.png" alt="🌳" class="wp-smiley" style="height: 1em; max-height: 1em;" />

While it's not possible to invest in OpenAI directly, the blog discusses five
alternatives:

  * **MSFT** (49% stake in OpenAI),
  * **NVIDIA** (makes more revenue from OpenAI than any other company),
  * **ARKVX** ( _Anthropic!_ ),
  * **META** ( _Llama 2!_ ), and
  * **TSLA** (Optimus!).

Check it out if you're interested in any of those! No financial advice. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f60a.png" alt="😊" class="wp-smiley" style="height: 1em; max-height: 1em;" />

Be on the right side of change. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" />
Chris

**<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2665.png" alt="♥" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Community Corner: Featured Resources**

  * [TradeUnafraid]({Link}) is a trading platform owned and operated by Finxter community member Lee.

Do you want to feature your own startup, YouTube channel, blog, or website as
a [Finxter premium member]({Link})? Hit reply and let me know!

**[How are we doing?]({Link})**
[<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" />]({Link})
[<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" />]({Link})
[<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" />]({Link})</pre>



<p>If we render this as markdown, it looks as follows:</p>



<p><strong>###########################START##########################</strong><br><a href="{Link}"></a></p>



<p>Hey {User},</p>



<p>To profit from change, we need to increase ownership of disruptive trends.<br>Today&#8217;s article covers a question that many Finxters frequently ask:</p>



<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1fab4.png" alt="🪴" class="wp-smiley" style="height: 1em; max-height: 1em;" /> [ <strong>Blog</strong> ] <a href="{Link}">How to Invest in OpenAI?</a> <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f333.png" alt="🌳" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p>While it&#8217;s not possible to invest in OpenAI directly, the blog discusses five<br>alternatives:</p>



<ul class="wp-block-list">
<li><strong>MSFT</strong> (49% stake in OpenAI),</li>



<li><strong>NVIDIA</strong> (makes more revenue from OpenAI than any other company),</li>



<li><strong>ARKVX</strong> ( <em>Anthropic!</em> ),</li>



<li><strong>META</strong> ( <em>Llama 2!</em> ), and</li>



<li><strong>TSLA</strong> (Optimus!).</li>
</ul>



<p>Check it out if you&#8217;re interested in any of those! No financial advice. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f60a.png" alt="😊" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p>Be on the right side of change. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /><br>Chris</p>



<p><strong><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2665.png" alt="♥" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Community Corner: Featured Resources</strong></p>



<ul class="wp-block-list">
<li><a href="{Link}">TradeUnafraid</a> is a trading platform owned and operated by Finxter community member Lee.</li>
</ul>



<p>Do you want to feature your own startup, YouTube channel, blog, or website as<br>a <a href="{Link}">Finxter premium member</a>? Hit reply and let me know!</p>



<p><strong><a href="{Link}">How are we doing?</a></strong><br><a href="{Link}"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /></a><br><a href="{Link}"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /></a><br><a href="{Link}"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /></a></p>



<p><strong>###########################END##########################</strong></p>



<p>This is good enough for the purposes of this tutorial. We will be using this markdown version of the emails as our training data for the fine-tuning process. We could clean the output up even further, but we will keep it as is.</p>



<p>Now that we have our HTML to Markdown function prepared, we&#8217;ll continue in part 2, where we will generate the actual training data for our fine-tuning of ChrisGPT. I&#8217;ll see you in part 2!</p>



<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Tip</strong>: This is a full-text tutorial on how to fine-tune ChatGPT using the OpenAI API with code! For a video walkthrough, check out our premium course &#8220;<a href="https://academy.finxter.com/university/fine-tuning-openai-course/">Fine-Tuning OpenAI: How to Create and Scale Your Digital Self (Example: ChrisGPT)</a>&#8221;</p>



<h2 class="wp-block-heading">Part 2: Generating the Training Data</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2024/03/1912e888-0079-416d-9a5c-c9fb0d8745d0.webp" alt="" class="wp-image-1669705" srcset="https://blog.finxter.com/wp-content/uploads/2024/03/1912e888-0079-416d-9a5c-c9fb0d8745d0.webp 1024w, https://blog.finxter.com/wp-content/uploads/2024/03/1912e888-0079-416d-9a5c-c9fb0d8745d0-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2024/03/1912e888-0079-416d-9a5c-c9fb0d8745d0-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2024/03/1912e888-0079-416d-9a5c-c9fb0d8745d0-768x768.webp 768w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Hi and welcome back to part 2, where we will be generating the training data for our own ChrisGPT model! We&#8217;ll pick up where we left off in part 1 and use our HTML to Markdown converter utility and more to write and run our training dataset generator that will build the ChrisGPT training data for us.</p>



<h3 class="wp-block-heading">API Key</h3>



<p>We need to start by setting our key for the OpenAI API, but we should never hardcode it in our source code. Go to https://platform.openai.com/api-keys and copy your API key. If you don&#8217;t have one, make sure to create one. You only pay for what you use, which will be a matter of cents if you just play around with it casually. Then create a new file called <code>.env</code> in the root folder of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />html_email.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env                  (new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>And paste your API key in there like this, making sure not to use any spaces or quotes:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">OPENAI_API_KEY=your_api_key_here</pre>



<p>Then go ahead and save and close this file. (You can also use environment variables to set the key if you prefer, but this is simpler to set up for tutorial purposes as it works the same on all platforms.)</p>
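
<p>For the environment variable route mentioned above, a minimal sketch could look like the following. The <code>load_api_key</code> helper and the <code>demo_env</code> dictionary are hypothetical names introduced only for this illustration; the rest of this tutorial keeps using the <code>.env</code> file:</p>

```python
import os


def load_api_key(environ=os.environ) -> str:
    """Hypothetical helper: read the key from a mapping of environment variables."""
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key


# Demo with a stand-in mapping so that no real key is needed to try this out.
demo_env = {"OPENAI_API_KEY": "your_api_key_here"}
api_key = load_api_key(demo_env)
```

<p>Calling <code>load_api_key()</code> without arguments reads from the real environment of the running process.</p>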



<h3 class="wp-block-heading">Constants</h3>



<p>Now, let&#8217;s define some project-wide settings by creating a new file named <code>constants.py</code> in the root directory of our project. This file will contain the constants that we will use multiple times throughout our project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />html_email.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />constants.py        (new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>Run the following command in your terminal to add the <code>python-decouple</code> package inside your <code>pipenv</code> environment:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install python-decouple==3.7</pre>



<p>We will use this package to read the <code>.env</code> file and get the API key from it. Then install the openai library as well:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install openai==1.12.0</pre>



<p>Make sure you run the installs even if you already have the packages installed, as we need to make sure they are installed in the virtual environment we are using for this project. Now, open the <code>constants.py</code> file and add the following code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from pathlib import Path

from decouple import config
from openai import OpenAI

CLIENT = OpenAI(api_key=str(config("OPENAI_API_KEY")))
DATA_DIRECTORY = Path(__file__).parent / "data"</pre>



<p>We import <code>Path</code> from <code>pathlib</code>, the <code>config</code> function from the <code>decouple</code> package, and the <code>OpenAI</code> class from the <code>openai</code> package. We then create a <code>CLIENT</code> constant holding an instance of the <code>OpenAI</code> class, passing it the API key that the <code>config</code> function reads from the <code>.env</code> file.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Python Top-tip <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" />
The 'config' function from the 'decouple' package looks up a setting by name, checking the environment variables first and then the '.env' file, and returns it as a string by default. If you share or upload your code to a public repository, make sure to add the '.env' file to your '.gitignore' file to avoid sharing your API key with others.</pre>



<p>We also create a <code>DATA_DIRECTORY</code> constant that will be a <code>Path</code> object pointing to the <code>data</code> directory. <code>Path(__file__)</code> gives us the path of the current file (<code>constants.py</code>), and <code>parent</code> returns the directory that contains it, which is the root folder of our project. We then use the <code>/</code> operator to append the <code>data</code> directory to that path.</p>
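
<p>The sketch below shows the same <code>parent</code> and <code>/</code> steps on a made-up path. The <code>/home/user/...</code> location is purely an assumption for illustration, and <code>PurePosixPath</code> is used so that the result looks the same on every operating system:</p>

```python
from pathlib import PurePosixPath

# Hypothetical location of constants.py, used only for illustration.
constants_file = PurePosixPath("/home/user/Finx_Fine_Tuning/constants.py")

root_folder = constants_file.parent    # the folder that contains constants.py
data_directory = root_folder / "data"  # the / operator joins path segments
input_file = data_directory / "Finx_dataset.json"
```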



<h3 class="wp-block-heading">The Data Generator</h3>



<p>Go ahead and save and close the <code>constants.py</code> file. Then create a new file in the root directory called <code>chris_gpt_dataset_generator.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />html_email.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />constants.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_generator.py        (new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>Now let&#8217;s install two libraries before we dive in here. We&#8217;ll combine them into a single command this time:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install tqdm==4.65.0 &amp;&amp; pipenv install tenacity==8.2.3</pre>



<p>The <code>tqdm</code> package will be used to easily add a progress bar to our generator, and the <code>tenacity</code> package will be used to retry the API requests if they fail for some reason. Both are extremely simple in their usage, as you will see.</p>



<p>Ok, now open the <code>chris_gpt_dataset_generator.py</code> file and we will start with the imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import json

from tenacity import retry, stop_after_attempt, wait_fixed
from tqdm import tqdm

from constants import CLIENT, DATA_DIRECTORY
from utils import html_email</pre>



<p>We use <code>json</code> to load and save back the JSON data, <code>tenacity</code> and <code>tqdm</code> for the reasons we just mentioned, and then we import the <code>CLIENT</code> and <code>DATA_DIRECTORY</code> constants from the <code>constants</code> module, and our <code>html_email</code> utility we just made. Now let&#8217;s set up some base constants for use in our generator only:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">INPUT_DATA = DATA_DIRECTORY / "Finx_dataset.json"
OUTPUT_DATA = DATA_DIRECTORY / "Finx_completed_dataset.json"
MODEL: str = "gpt-3.5-turbo-0125"
TOTAL_TOKENS_USED = 0
RATE_ERROR_MESSAGE = "There was an error calling 'get_fictional_user_query'. Perhaps the OpenAI ChatGPT rate limit has been reached. Retrying one more time in 60 seconds to reset the rate limiter..."</pre>



<p>We define <code>INPUT_DATA</code> and <code>OUTPUT_DATA</code> constants that will be <code>Path</code> objects pointing to the <code>Finx_dataset.json</code> and <code>Finx_completed_dataset.json</code> files, respectively. The output one doesn&#8217;t exist yet obviously, but that is fine. The <code>MODEL</code> constant is just a string with the model name we use.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> OpenAI Top-tip <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" />
There are many versions of the gpt-3.5-turbo model:

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" /> gpt-3.5-turbo-0613. This one is the older version.
<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" /> gpt-3.5-turbo-1106. Has much improved functionality and a larger context window compared to the 0613 model. This is the newest version that can be used for fine-tuning purposes.
<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" /> gpt-3.5-turbo-0125. The latest minor update to the 1106 model, this one cannot be used for fine-tuning yet. Aside from the minor improvements, it also has a reduced price.

This is why we will use the `0125` version for the data generation in this part of the tutorial, but later when we do the fine-tuning we will be using the `1106` version instead.</pre>



<p>We also define a <code>TOTAL_TOKENS_USED</code> variable that will keep track of the total number of tokens used across the API requests (it is written in capitals like a constant, but it is really a module-level counter that we will update as we go), and a <code>RATE_ERROR_MESSAGE</code> that we reference for an error message later. As all of the above will only be used in this script, we define them inside the <code>chris_gpt_dataset_generator.py</code> file.</p>



<h3 class="wp-block-heading">The data structure</h3>



<p>Now we&#8217;ll define a class that will serve as the data structure &#8220;blueprint&#8221; if you will, for each example in our dataset. This is exactly the same structure we used in the Harry Potter magical spells example in part 1:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class TrainingDataEntry:
    def __init__(self, fictional_user_query, markdown_email) -> None:
        self.data = {
            "messages": [
                {
                    "role": "system",
                    "content": 'You are a helpful assistant that writes emails for the Finxter email newsletter, adhering perfectly to the style and tone of the Finxter brand and Chris\' writing style. You will respond in the following format: {"subject": "The subject of the email", "body": "The body of the email in Markdown formatting"}.',
                },
                {"role": "user", "content": fictional_user_query},
                {"role": "assistant", "content": markdown_email},
            ]
        }</pre>



<p>We have a class named <code>TrainingDataEntry</code>, which has an <code>__init__</code> method that takes two arguments: <code>fictional_user_query</code> and <code>markdown_email</code>. The <code>__init__</code> method initializes the <code>data</code> attribute of the <code>TrainingDataEntry</code> object with a dictionary that has a <code>messages</code> key, which is a list of dictionaries.</p>



<p>The first dictionary in the list has a <code>role</code> key with the value <code>"system"</code> and a <code>content</code> key with a long string that explains the role of the assistant. This is going to be the same for all examples of the training data so I just hard-coded it in the class.</p>



<p>The second dictionary in the list has a <code>role</code> key with the value <code>"user"</code> and a <code>content</code> key with the value of the <code>fictional_user_query</code> argument. The third dictionary in the list has a <code>role</code> key with the value <code>"assistant"</code> and a <code>content</code> key with the value of the <code>markdown_email</code> argument.</p>



<p>So the <code>system</code> message is the same for each entry and the <code>assistant</code> message is simply the email that we already have saved in our dataset and can convert using the HTML to Markdown converter we built.</p>
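
<p>As a quick sanity check, you could build a single entry by hand and print it. This is only a sketch: the system prompt is shortened to a stand-in string and the query and email values are made up, so that the example stays short and runs on its own:</p>

```python
import json

# Shortened stand-in for the full system prompt (an assumption for brevity).
SYSTEM_PROMPT = "You write emails for the Finxter newsletter."


class TrainingDataEntry:
    def __init__(self, fictional_user_query: str, markdown_email: str) -> None:
        self.data = {
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": fictional_user_query},
                {"role": "assistant", "content": markdown_email},
            ]
        }


# Made-up sample values, not taken from the real dataset.
entry = TrainingDataEntry(
    fictional_user_query="- New Python cheat sheet\n- Link to the blog",
    markdown_email="{'subject': 'Cheat sheet', 'body': 'Hey {User}, ...'}",
)

# Every entry always has the same three roles in the same order.
roles = [message["role"] for message in entry.data["messages"]]
print(roles)
print(json.dumps(entry.data, indent=4))
```

<p>Only the <code>user</code> and <code>assistant</code> contents change from entry to entry, which is why they are the only two arguments of <code>__init__</code>.</p>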



<p>The only challenge we have left is the <code>user</code> message, which is supposed to hold the user query that resulted in the generated email. We will have to generate a fictional user query for each &#8220;result&#8221; email we have in our dataset. This should mostly be a short summary of the email contents for our case, as we want to train ChrisGPT to generate an email if we give it the topics to write on.</p>



<h3 class="wp-block-heading">Getting the user query (that was never made)</h3>



<p>So let&#8217;s create a function that will take an email as input and generate a fictional user query for it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def get_fictional_user_query(email: str) -> str:
    global TOTAL_TOKENS_USED
    response = CLIENT.chat.completions.create(
        model=MODEL,
        messages=[
            {
                "role": "system",
                "content": 'You will be provided with an email in the following format:{"subject": "The subject of the email", "body": "The body of the email in Markdown formatting"}. Your task is to go back in time and write a rough draft of the topics the email writer decided to discuss in the email. This will basically be a summary of the email\'s topics formatted in short bullet points, that the author would have used as a basis to then write the real email.',
            },
            {
                "role": "user",
                "content": f"Here is the output LLM generated email: {email}. Bullet point draft of the topics discussed in the email:",
            },
        ],
    )
    if not response.usage or not response.choices[0].message.content:
        raise Exception("Invalid response from OpenAI API")
    TOTAL_TOKENS_USED += response.usage.total_tokens
    return response.choices[0].message.content</pre>



<p>We define a function named <code>get_fictional_user_query</code> that takes an <code>email</code> string argument and will return a string. Inside the function, we first reference the <code>TOTAL_TOKENS_USED</code> variable as a global variable, so that we can update its value from inside the function. Then we call the <code>CLIENT.chat.completions.create</code> method with the <code>model</code> and <code>messages</code> arguments.</p>



<p>For the prompt setup message we basically just explain that the LLM will receive an email in the format of a subject and body and that it should write a rough draft of the topics the email writer decided to discuss in the email. This will basically be a summary of the email&#8217;s topics formatted in short bullet points. That way we can give our trained model a list of bullet points and have it write an email for us later on.</p>



<p>The <code>user</code> message is then just the email we received as input, and a short message requesting the bullet points draft of the topics discussed. Note how it is stated as &#8220;Bullet point draft of the topics discussed in the email:&#8221; ending with a <code>:</code> colon so that the LLM will answer with the bullet points right away without giving some introduction or &#8220;Sure, I can help you with that&#8221; or something like that.</p>



<p>We then double-check that the <code>response</code> has a <code>usage</code> attribute and that the <code>content</code> of the first message in <code>choices</code> is not empty. If either check fails we raise an exception, but if everything is fine we add the <code>total_tokens</code> used in the response to the <code>TOTAL_TOKENS_USED</code> variable and return the <code>content</code> of that first choice.</p>
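
<p>If you want to try the validation and token counting without calling the API, the sketch below isolates just that part. The helper names <code>extract_content</code> and <code>fake_response</code> are hypothetical and not part of the tutorial code, and the fake response is a plain <code>SimpleNamespace</code> that only mimics the attributes our check reads:</p>

```python
from types import SimpleNamespace

TOTAL_TOKENS_USED = 0


def extract_content(response) -> str:
    """Validate a chat-completion-like response and count its tokens."""
    global TOTAL_TOKENS_USED
    if not response.usage or not response.choices[0].message.content:
        raise Exception("Invalid response from OpenAI API")
    TOTAL_TOKENS_USED += response.usage.total_tokens
    return response.choices[0].message.content


def fake_response(content, total_tokens):
    """Build a stand-in object with the attributes the check reads."""
    usage = SimpleNamespace(total_tokens=total_tokens) if total_tokens else None
    message = SimpleNamespace(content=content)
    return SimpleNamespace(usage=usage, choices=[SimpleNamespace(message=message)])


first = extract_content(fake_response("- Topic one\n- Topic two", 120))
second = extract_content(fake_response("- Topic three", 80))
print(TOTAL_TOKENS_USED)  # 200

# An empty content string is rejected and does not change the counter.
try:
    extract_content(fake_response("", 50))
except Exception as error:
    print(f"Rejected: {error}")
```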



<h3 class="wp-block-heading">Making it robust</h3>



<p>Ok, so that is pretty good so far, but we will be calling this function many times in a row, and saving the data into an object in memory, which we write to a file at the end. If we get an error halfway then the script will crash out and we have to try again from the start. We can also hit an error because of a rate limit, as we will be doing many requests in a row. This is not ideal, so we will use the <code>tenacity</code> package to retry the API requests if they fail for some reason.</p>



<p>We can simply add the <code>@retry</code> decorator to our function by placing it straight above the function definition like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">@retry(
    wait=wait_fixed(60),
    stop=stop_after_attempt(2),
    reraise=True,
    before_sleep=lambda _: print(RATE_ERROR_MESSAGE),
)
def get_fictional_user_query(email):
    ... function here ...</pre>



<p>We use the <code>@retry</code> decorator to specify that we want to retry the function if it fails due to any type of error. As you can see the <code>tenacity</code> library is very easy to read and use. We specify that we want to wait 60 seconds between each retry and that we want to stop after 2 attempts, maximum.</p>



<p>The 60 seconds are in case we hit a rate limit, as trying again right away might trigger the rate limit again. We also specify that we want to reraise the exception if the function fails after the maximum number of attempts, so we can see what the original error was.</p>
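
<p>To make the behaviour concrete, here is a rough standard-library sketch of what these settings amount to. It is not how <code>tenacity</code> is implemented, the name <code>retry_fixed</code> is made up for this example, and as a simplification the callback receives the exception, whereas <code>tenacity</code> passes its own retry-state object. The wait is set to 0 seconds so the demo finishes instantly:</p>

```python
import time
from functools import wraps


def retry_fixed(attempts: int, wait_seconds: float, before_sleep=None):
    """Retry the wrapped function, sleeping a fixed time between attempts."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as error:
                    if attempt == attempts:
                        raise  # out of attempts: re-raise the original error
                    if before_sleep is not None:
                        before_sleep(error)
                    time.sleep(wait_seconds)

        return wrapper

    return decorator


calls = []


@retry_fixed(attempts=2, wait_seconds=0, before_sleep=lambda _: print("Retrying..."))
def flaky_request() -> str:
    """Fail on the first call and succeed on the second."""
    calls.append("call")
    if len(calls) == 1:
        raise RuntimeError("simulated rate limit")
    return "ok"


result = flaky_request()
print(result, len(calls))  # ok 2
```

<p>With 2 attempts in total there is exactly one retry, which matches the &#8220;Retrying one more time&#8221; wording in our error message.</p>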



<p>Now the last part is a bit trickier. As <code>tenacity</code> will stop the program from crashing out, we won&#8217;t see the error and we won&#8217;t be aware of what happened. This is why we added a <code>before_sleep</code> function that will print the <code>RATE_ERROR_MESSAGE</code> before each retry. This way we can keep ourselves up to date in the terminal about what is happening.</p>



<p>The problem is that if I just pass in <code>print(RATE_ERROR_MESSAGE)</code>, the print statement will be triggered right away when Python first loads the function into memory, because the parentheses call it on the spot. The decorator would then also receive the return value of <code>print</code>, which is <code>None</code>, instead of a function. This is not what we want, so we wrap it in a lambda function that will only run the print call when the lambda itself is called.</p>



<p>The <code>lambda _: print(RATE_ERROR_MESSAGE)</code> is basically equivalent to:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Do not put in your code
def before_sleep_function(_):
    print(RATE_ERROR_MESSAGE)</pre>



<p>But it&#8217;s just much shorter to write. The <code>_</code> is the name of the input argument, as <code>tenacity</code> will send an input argument to our <code>before_sleep</code> function, but we won&#8217;t be using it, so we just name it <code>_</code> to sort of throw it away.</p>
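
<p>The difference between calling something right away and handing over a function to be called later can be shown with a small sketch. The <code>log</code> helper is a hypothetical stand-in for <code>print</code> that records its messages so we can inspect them afterwards:</p>

```python
messages = []


def log(text: str) -> None:
    """Stand-in for print that records what was logged."""
    messages.append(text)


# Eager: log(...) runs immediately and the variable holds its return value, None.
eager = log("called at definition time")

# Deferred: nothing runs yet; the variable holds a callable.
deferred = lambda _: log("called on demand")

print(eager)          # None
print(len(messages))  # 1, only the eager call has run so far

deferred("ignored argument")  # now the wrapped call executes
print(messages)
```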



<p>Ok so now we have a <code>TrainingDataEntry</code> class to hold the data for each entry and a function to generate the fictional user queries that will automatically retry if there is an error. See how we&#8217;re making the problem smaller and smaller?</p>



<h3 class="wp-block-heading">Putting it all together</h3>



<p>All we need to do now is create a <code>TrainingDataEntry</code> object for each entry in the training data, get the fictional user query by calling our function, and then save all the objects to a new JSON file to get our training data. Let&#8217;s do that now:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">with open(INPUT_DATA, "r", encoding="utf-8") as file:
    input_data = json.load(file)
    output_data = []</pre>



<p>We open the <code>INPUT_DATA</code> file in read mode and use the <code>json.load</code> function to load the JSON data from the file into the <code>input_data</code> variable. We then create an empty list and assign it to the <code>output_data</code> variable. So far so good right? Let&#8217;s finish it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">with open(INPUT_DATA, "r", encoding="utf-8") as file:
    input_data = json.load(file)
    output_data = []
    for finx_email in tqdm(input_data, desc="Generating training data"):
        finx_email["body"] = html_email.html_to_markdown(finx_email["body"])
        training_data_entry = TrainingDataEntry(
            fictional_user_query=get_fictional_user_query(finx_email),
            markdown_email=str(finx_email),
        )
        output_data.append(training_data_entry.data)</pre>



<p>First of all, we open a loop. We&#8217;re going to loop over each <code>finx_email</code> in the <code>input_data</code> list. The reason <code>input_data</code> is wrapped inside <code>tqdm()</code> here is that this will allow us to add a progress bar to the loop. The <code>desc</code> argument is used to specify the description of the progress bar for <code>tqdm</code>, which will be &#8220;Generating training data&#8221;.</p>



<p>Now remember that each <code>finx_email</code> is a dictionary with a <code>subject</code> and <code>body</code> key. We convert the <code>body</code> from HTML to Markdown using our <code>html_to_markdown</code> utility function, and then we create a <code>TrainingDataEntry</code> object for the <code>finx_email</code>. The <code>fictional_user_query</code> is obtained by calling our <code>get_fictional_user_query</code> function with the <code>finx_email</code> as input (it is a dictionary rather than the <code>str</code> the type hint promises, but the f-string inside the function turns it into text), and the <code>markdown_email</code> is just the <code>finx_email</code> dictionary converted to a string. The <code>str()</code> conversion here is important because the <code>content</code> of a message has to be text; the OpenAI API will reject the request if you send it a nested dictionary there, but in string format it will work fine.</p>
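
<p>One detail worth knowing is what <code>str()</code> actually produces for a dictionary. The sketch below uses a made-up email entry to compare it with <code>json.dumps</code>. The tutorial sticks with <code>str()</code>; the JSON variant is shown only for comparison:</p>

```python
import ast
import json

# Made-up email entry with the same two keys as the dataset.
finx_email = {"subject": "Chris' update", "body": "Hey {User},\n\nNew post!"}

as_repr = str(finx_email)         # Python repr: mostly single quotes
as_json = json.dumps(finx_email)  # strict JSON: always double quotes

print(as_repr)
print(as_json)

# The repr is not valid JSON, but ast.literal_eval can parse it back.
try:
    json.loads(as_repr)
except json.JSONDecodeError:
    print("str() output is not valid JSON")

restored_from_repr = ast.literal_eval(as_repr)
restored_from_json = json.loads(as_json)
```

<p>So if you ever need to turn the assistant&#8217;s string back into a dictionary, keep in mind which of the two formats it was written in.</p>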



<p>We then append the dictionary that the <code>data</code> attribute of the <code>TrainingDataEntry</code> holds to the <code>output_data</code> list. Now when we&#8217;ve called this once for every entry in the dataset we must save our data:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">with open(OUTPUT_DATA, "w", encoding="utf-8") as file:
    json.dump(output_data, file, indent=4)

print(f"Total tokens used: {TOTAL_TOKENS_USED}")</pre>



<p>This opens the <code>OUTPUT_DATA</code> file in write mode and uses the <code>json.dump</code> function to save the <code>output_data</code> list to the file in JSON format. We also specify the <code>indent</code> argument as <code>4</code> to make the JSON data more readable. Finally, we added a print statement to see how many tokens we used in the process. I&#8217;m not going to calculate the exact costs here as it would add unneeded complexity for a one-time-use script. It will be roughly $0.30 if you make the full 200 calls, possibly a bit less.</p>
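
<p>If you do want a rough feel for the cost, a back-of-the-envelope bound is easy to sketch. The two prices below are assumed example values, not authoritative ones, so check the current OpenAI pricing page before relying on them. Because we only track the combined total, the sketch returns a range: the low end treats every token as an input token and the high end treats every token as an output token. The token count is just an example total:</p>

```python
# Assumed example prices in USD per one million tokens; these are
# placeholders, so check the current OpenAI pricing page.
INPUT_PRICE_PER_MILLION = 0.50
OUTPUT_PRICE_PER_MILLION = 1.50


def cost_bounds(total_tokens: int) -> tuple[float, float]:
    """Return (lowest, highest) possible cost when only the total is known."""
    lowest = total_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    highest = total_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return lowest, highest


low, high = cost_bounds(224_800)  # example total token count
print(f"Between ${low:.2f} and ${high:.2f}")
```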



<h3 class="wp-block-heading">Time to sit back!</h3>



<p>And that&#8217;s it! We now have a data generator. Go ahead and run your <code>chris_gpt_dataset_generator.py</code> file and you should see a progress bar in the terminal as it generates the data. It will take a few minutes to run. You may run into a rate limit, in which case <code>tenacity</code> will trigger for you and pause the process for 60 seconds before continuing. (This is also the reason we&#8217;re not using some fancy async API calls here, as we would hit the rate limit anyway and introduce pointless complexity to a script we will rarely run.)</p>



<p>Let it run and grab a coffee<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2615.png" alt="☕" class="wp-smiley" style="height: 1em; max-height: 1em;" /> or your favorite snack<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f968.png" alt="🥨" class="wp-smiley" style="height: 1em; max-height: 1em;" /> while you watch that glorious progress bar creep up and relax for a moment, you&#8217;ve earned it <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f604.png" alt="😄" class="wp-smiley" style="height: 1em; max-height: 1em;" />! When it&#8217;s done you should have a new <code>Finx_completed_dataset.json</code> file in your <code>data</code> directory.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Generating training data: 100%|███████████████████████████████████████████████| 200/200 [08:56&lt;00:00,  2.68s/it]
Total tokens used: 224800</pre>



<p>Open it up and take a look at the data, it should be a list of dictionaries with the same structure as the <code>data</code> attribute of the <code>TrainingDataEntry</code> class:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[
    {
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant that writes emails for the Finxter email newsletter, adhering perfectly to the style and tone of the Finxter brand and Chris' writing style. You will respond in the following format: {\"subject\": \"The subject of the email\", \"body\": \"The body of the email in Markdown formatting\"}."
            },
            {
                "role": "user",
                "content": "- Investment opportunities in OpenAI\n- Five proxies to invest in OpenAI:\n  - MSFT\n  - NVIDIA\n  - ARKVX\n  - META\n  - TSLA\n- Community corner featuring TradeUnafraid\n- Invitation to feature one's startup, YouTube channel, blog, or website\n- Feedback request on the email course\n- Invitation to invite a friend or colleague to the Finxter Email Academy\n- Download link for the Ultimate Python Cheat Sheet\n- Motivational quote on knowledge compounding\n- Contact information for Dr. Christian Mayer\n- Unsubscribe option"
            },
            {
                "role": "assistant",
                "content": "{'subject': '5 Proxies to Investing in OpenAI', 'body': '[![]({Link})]({Link})\\n\\nHey {User},\\n\\nTo profit from change, we need to increase ownership of disruptive trends.\\nToday\\'s article covers a question that many Finxters frequently ask:\\n\\n\ud83e\udeb4 [ **Blog** ] [How to Invest in OpenAI?]({Link}) \ud83c\udf33\\n\\nWhile it\\'s not possible to invest in OpenAI directly, the blog discusses five\\nalternatives:\\n\\n  * **MSFT** (49% stake in OpenAI),\\n  * **NVIDIA** (makes more revenue from OpenAI than any other company),\\n  * **ARKVX** ( _Anthropic!_ ),\\n  * **META** ( _Llama 2!_ ), and\\n  * **TSLA** (Optimus!).\\n\\nCheck it out if you\\'re interested in any of those! No financial advice. \ud83d\ude0a\\n\\nBe on the right side of change. \ud83d\ude80  \\nChris\\n\\n**\u2665\ufe0f Community Corner: Featured Resources**\\n\\n  * [TradeUnafraid]({Link}) is a trading platform owned and operated by Finxter community member Lee.\\n\\nDo you want to feature your own startup, YouTube channel, blog, or website as\\na [Finxter premium member]({Link})? Hit reply and let me know!\\n\\n**[How are we doing?]({Link})**  \\n[\u2b50]({Link})  \\n[\u2b50\u2b50]({Link})  \\n[\u2b50\u2b50\u2b50]({Link})  \\n  \\nTo make sure you keep getting these emails, please add _chris@finxter.com_ to\\nyour address book.  \\n  \\nI\\'d love to hear your feedback so that I can improve this free email course\\nover time. Please reply to this email and share everything on your mind!  \\n  \\n**If you find the Finxter Email Academy useful, please invite a friend or\\ncolleague! \u2764**\\n\\n  \\nHere\\'s the subscription link you can share:  \\n[https://blog.finxter.com/subscribe/]({Link})  \\n  \\nDownload the Ultimate Python Cheat Sheet here (direct PDF download): \ud83d\udc0d\\n\\n**[The Ultimate Python Cheat Sheet]({Link})**  \\n  \\nNot very motivated to learn today? 
Consider this:  \\n**_\"Knowledge compounds!\"_** \\\\-- Warren Buffett  \\n  \\nConsequently, if you improve your skills by 1% every day, you\\'ll 36x your\\nprogramming skills within a year!\\n\\n  \\n_**Finxter, Dr. Christian Mayer**  \\n{Address}., {City}, {Country}_\\n\\nWant out of the loop? I\\'m so sad to see you go. \ud83d\ude22 How could we have done\\nbetter?  \\nTo help future Finxters, please hit reply and tell us! \ud83e\udd17\\n\\n[Unsubscribe here]({Link})\\n\\n![]({Link})\\n\\n'}"
            }
        ]
    },
    ... many more entries ...</pre>



<p>Perfect, this has all the data we need to train our ChrisGPT model. We have the system message that is the same for all entries, the user message that is a fictional user query for the email, and the assistant&#8217;s response message that is the email itself. That&#8217;s it for part 2 of this tutorial. I&#8217;ll see you in part 3 where we will be fine-tuning our ChrisGPT model using the data we just generated <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" />. See you there!</p>



<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Tip</strong>: This is a full-text tutorial on how to fine-tune ChatGPT using the OpenAI API with code! For a video walkthrough, check out our premium course &#8220;<a href="https://academy.finxter.com/university/fine-tuning-openai-course/">Fine-Tuning OpenAI: How to Create and Scale Your Digital Self (Example: ChrisGPT)</a>&#8221;.</p>



<h2 class="wp-block-heading">Part 3: Data Validation and Training Cost</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" width="768" height="768" src="https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_phone_photo_of_gene_editing_system_AI_posted_to_red_0535738a-b470-488d-8af6-0ddd8e4f2d58-768x768-1.webp" alt="" class="wp-image-1669885" srcset="https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_phone_photo_of_gene_editing_system_AI_posted_to_red_0535738a-b470-488d-8af6-0ddd8e4f2d58-768x768-1.webp 768w, https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_phone_photo_of_gene_editing_system_AI_posted_to_red_0535738a-b470-488d-8af6-0ddd8e4f2d58-768x768-1-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2024/03/finxter.com_phone_photo_of_gene_editing_system_AI_posted_to_red_0535738a-b470-488d-8af6-0ddd8e4f2d58-768x768-1-150x150.webp 150w" sizes="(max-width: 768px) 100vw, 768px" /></figure>
</div>


<p>Welcome back to part 3! This is where we&#8217;re going to do the last preparation and data validation steps on our dataset and also calculate how much it&#8217;s going to cost us to train the model.</p>



<h3 class="wp-block-heading">JSONL format</h3>



<p>Remember in part 1 where we discussed the training data? We discussed the data needing to be in JSONL format. Well, it&#8217;s time to come back to that now. So what is JSONL format?</p>



<p>JSONL, or JSON Lines, is a convenient format for storing structured data that may be processed one record at a time. Each line in a JSONL file is a valid JSON object. This is different from a regular JSON file, where the entire file is a single JSON object or array.</p>



<p>Each line is a separate, independent JSON object. This means that a large file can be read into memory one line at a time, instead of needing to read the entire data into memory at once, which can be a significant advantage when working with very large datasets. This makes it very useful for streaming JSON data object by object through another process, like training an LLM!</p>
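
<p>Reading such a file one record at a time could look something like the sketch below. The function name <code>iter_jsonl</code> is made up for this example, and an in-memory <code>StringIO</code> stands in for a real file so that it runs without anything on disk:</p>

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at the end
            yield json.loads(line)


# In-memory stand-in for a file opened with open(path, encoding="utf-8").
sample = io.StringIO('{"id": 1, "text": "first"}\n{"id": 2, "text": "second"}\n')

ids = [record["id"] for record in iter_jsonl(sample)]
print(ids)  # [1, 2]
```

<p>Because the function is a generator, only one line is parsed and held in memory at any moment, no matter how large the file is.</p>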



<p>So say we have an object that looks like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[
  {
    "employee": {
      "name": "John Doe",
      "age": 30,
      "department": "Sales",
      "address": {
        "street": "123 Main St",
        "city": "Springfield",
        "state": "IL",
        "zip": "62701"
      }
    }
  },
  {
    "employee": {
      "name": "Jane Smith",
      "age": 28,
      "department": "Marketing",
      "address": {
        "street": "456 Elm St",
        "city": "Springfield",
        "state": "IL",
        "zip": "62701"
      }
    }
  },
  {
    "employee": {
      "name": "Joe Schmoe",
      "age": 35,
      "department": "Engineering",
      "address": {
        "street": "789 Oak St",
        "city": "Springfield",
        "state": "IL",
        "zip": "62701"
      }
    }
  }
]</pre>



<p>Then the JSONL version is essentially just a flattened-down version of this, with each object on a single line. Note that we can remove the brackets and the commas between different objects, as it is a given that each line contains one JSON object in this format:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{"employee": {"name": "John Doe", "age": 30, "department": "Sales", "address": {"street": "123 Main St", "city": "Springfield", "state": "IL", "zip": "62701"}}}
{"employee": {"name": "Jane Smith", "age": 28, "department": "Marketing", "address": {"street": "456 Elm St", "city": "Springfield", "state": "IL", "zip": "62701"}}}
{"employee": {"name": "Joe Schmoe", "age": 35, "department": "Engineering", "address": {"street": "789 Oak St", "city": "Springfield", "state": "IL", "zip": "62701"}}}</pre>



<p>You will probably see the objects wrap around, but this is only a visual thing. In the actual file, each object is on a single line.</p>



<h3 class="wp-block-heading">Flattening our dataset into a JSONL file</h3>



<p>So let&#8217;s create a utility function to flatten our dataset into a JSONL file. In your existing utils folder, make a new file called <code>jsonl.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_completed_dataset.json
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />html_email.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />jsonl.py          (new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />constants.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_generator.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>In <code>jsonl.py</code>, add the following imports to get started:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import json
from pathlib import Path
from typing import Iterable</pre>



<p>We import the <code>json</code> module to read and save JSON data. We import <code>Path</code> and <code>Iterable</code> only to use them as type hints, to make sure our code is as clear and readable as possible. First, let&#8217;s make the problem smaller by creating a function that takes a list or iterable of dictionaries, and converts them into a JSONL file. Add the following function to <code>jsonl.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def dicts_to_jsonl(output_file: Path, data: Iterable[dict]) -> Path:
    with open(output_file, "w") as file:
        for dict_obj in data:
            json_string = json.dumps(dict_obj)
            file.write(json_string + "\n")
    return output_file</pre>



<p>This function takes two arguments: <code>output_file</code> is the path to the file we want to write, and <code>data</code> is an iterable of dictionaries. We open the file in write mode, and then loop through each dictionary in the iterable. We convert each dictionary to a JSON string using <code>json.dumps</code>, and then write it to the file. We add a newline character at the end of each line to separate the JSON objects. Finally, we return the path to the file as a <code>Path</code> object.</p>



<p>That handles converting a list of dictionaries stored in memory. Now let&#8217;s add a second function below it that converts an existing JSON file into a JSONL file. Add the following function to <code>jsonl.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def json_to_jsonl(input_file: Path, output_file: Path) -> Path:
    with open(input_file, "r") as in_file:
        data = json.load(in_file)

    return dicts_to_jsonl(output_file, data)</pre>



<p>This function takes two arguments: <code>input_file</code> is the path to the JSON file we want to read, and <code>output_file</code> is the path to the JSONL file we want to write. We open the input file in read mode and load the JSON data into memory using <code>json.load</code>. This assumes the top-level value in the JSON file is a list of dictionaries, as it is in our dataset. We then call the <code>dicts_to_jsonl</code> function we defined earlier to write the data to the output file.</p>
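<p>As a quick sanity check, here is a minimal round-trip sketch of the two functions. The sample records and the temporary file names are made up for illustration and are not part of the course project:</p>

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable


def dicts_to_jsonl(output_file: Path, data: Iterable[dict]) -> Path:
    # One JSON object per line, exactly as in jsonl.py.
    with open(output_file, "w") as file:
        for dict_obj in data:
            file.write(json.dumps(dict_obj) + "\n")
    return output_file


def json_to_jsonl(input_file: Path, output_file: Path) -> Path:
    with open(input_file, "r") as in_file:
        data = json.load(in_file)
    return dicts_to_jsonl(output_file, data)


# Hypothetical sample data standing in for the real dataset.
records = [{"id": 1}, {"id": 2}, {"id": 3}]

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / "sample.json"
    json_path.write_text(json.dumps(records))

    jsonl_path = json_to_jsonl(json_path, Path(tmp_dir) / "sample.jsonl")
    lines = jsonl_path.read_text().splitlines()
    round_tripped = [json.loads(line) for line in lines]

print(len(lines))                # one line per record
print(round_tripped == records)  # the data survives the conversion
```

<p>Each record ends up on its own line, and parsing the lines back gives us the original list.</p>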



<p>Thanks to this composition, we now have two functions, one for converting dictionaries and another for converting an existing JSON file, without duplicating any code. Go ahead and save and close <code>jsonl.py</code>.</p>



<h3 class="wp-block-heading">Validating our dataset</h3>



<p>Before we train our model, we need to make sure our dataset is in the right format. We&#8217;ll also check how much the training is going to cost and make sure none of the entries exceed the token limit. This may all seem like overkill, but you really don&#8217;t want to start training a model and have it fail halfway due to sloppy data or a single entry that is too long. Fine-tuning is also considerably more expensive than other ways of using ChatGPT because we&#8217;re creating a whole custom model, so it&#8217;s nice to know ahead of time how much money you&#8217;re going to spend.</p>



<p>We&#8217;ve been putting most of this logic into utility functions in separate files, so you can reuse them in your future fine-tuning projects. We&#8217;ll do the same for the validation and price-calculator logic. In your existing utils folder, make a new file called <code>data_validation.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_completed_dataset.json
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />data_validation.py          (new file)
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />html_email.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />jsonl.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />constants.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_generator.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>Time to install the <code>tiktoken</code> library before we start writing the code. Open your terminal and run the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pipenv install tiktoken==0.6.0</pre>



<p>The <code>tiktoken</code> library is a Python package developed by OpenAI. We&#8217;ll use it to count the number of tokens in a text string without making any API calls.</p>



<p>In <code>data_validation.py</code>, get started by adding the following imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import json
from decimal import Decimal
from pathlib import Path

import tiktoken</pre>



<p>Most of these are familiar by now, but we also import <code>Decimal</code> from the <code>decimal</code> module. We&#8217;ll use this for the cost calculations, as it represents decimal values exactly and spares us the annoying rounding errors of floating point numbers.</p>
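<p>To see the difference, here is a small standalone sketch (not part of <code>data_validation.py</code>) comparing the same sum in floats and in <code>Decimal</code>. Note that the <code>Decimal</code> values are built from strings, since building them from floats would carry the float inaccuracy along:</p>

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly.
float_sum = 0.1 + 0.2

# Decimal values created from strings are stored exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum == 0.3)               # False
print(decimal_sum == Decimal("0.3"))  # True
```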



<p>Now define a constant that will be used for our calculations:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">TRAINING_COST_PER_1000_TOKENS = Decimal("0.0080")</pre>



<p>This is the cost per 1000 tokens for training data at the time of writing, but it may have changed if you&#8217;re reading this tutorial in the future. You can check the current cost on the <a href="https://openai.com/pricing">OpenAI pricing page</a> and adjust this number accordingly.</p>



<h3 class="wp-block-heading">Creating the Validator class</h3>



<p>Now let&#8217;s create our <code>Validator</code>. As we&#8217;ll have a lot of related functions, let&#8217;s use a class to group them together and start with the <code>__init__</code> method:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__(self, jsonl_file: Path) -> None:
        self.data = self._load_data(jsonl_file)
        self._token_list = None
        self.encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")</pre>



<p>The <code>__init__</code> method gets called when we create a new instance of this class, and it takes a <code>Path</code> argument pointing to the JSONL file we want to validate. We&#8217;ll load the data from the file and store it in the <code>data</code> attribute using the <code>_load_data</code> method we&#8217;ll define next.</p>



<p>We&#8217;ll also initialize the <code>_token_list</code> attribute to <code>None</code> for now, and we&#8217;ll use it to store the token count for each entry in the dataset. Finally we store the encoding for the model we&#8217;re going to use in the <code>encoding</code> attribute. As the <code>tiktoken</code> library was also made by OpenAI, it has a handy method to let us load up the proper encoding for the model we&#8217;re going to use.</p>



<p>Now let&#8217;s add the <code>_load_data</code> method. As our data file is not that massive, we&#8217;ll just load up the whole file at once and not worry about loading the JSONL one line at a time:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__():
        ...

    def _load_data(self, jsonl_file: Path) -> list:
        with open(jsonl_file, "r", encoding="utf-8") as file:
            data = [json.loads(line) for line in file]
        return data</pre>



<p>No big surprises here: we take the path as input and return a list. The only difference is that, since the data is in JSONL format, we use a list comprehension. For each line in the file, we call <code>json.loads</code> to convert the JSON string to a Python dictionary, which then becomes an element in the list saved as the variable <code>data</code>.</p>
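<p>Here is a minimal sketch of the same line-by-line parsing, using <code>io.StringIO</code> as a stand-in for an open file so it runs without any files on disk. The two-line JSONL content is a made-up example:</p>

```python
import io
import json

# Hypothetical JSONL content: one JSON object per line.
jsonl_text = (
    '{"messages": []}\n'
    '{"messages": [{"role": "user", "content": "Hi"}]}\n'
)

# io.StringIO behaves like an open text file, so we can iterate over its lines.
with io.StringIO(jsonl_text) as file:
    data = [json.loads(line) for line in file]

print(len(data))
print(data[1]["messages"][0]["content"])
```

<p>The trailing newline on each line is harmless because <code>json.loads</code> ignores surrounding whitespace. A completely blank line, however, would raise a <code>json.JSONDecodeError</code>.</p>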



<p>Now let&#8217;s add a method to calculate the token count for each entry in the dataset:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__():
        ...

    def _load_data():
        ...

    def _calculate_token_amount_per_entry(self) -> list[int]:
        extra_tokens_per_message = 2
        token_list = []
        for training_data_object in self.data:
            num_tokens = 0
            for message in training_data_object["messages"]:
                for _, value in message.items():
                    num_tokens += len(self.encoding.encode(str(value)))
                    num_tokens += extra_tokens_per_message
            token_list.append(num_tokens)
        return token_list</pre>



<p>This method returns the approximate number of tokens for each entry as a list of integers. We start by defining a variable <code>extra_tokens_per_message</code> and setting it to 2. This is the current number of extra tokens we need to add to account for the object structure, on top of the strings themselves, to arrive at an accurate number. We then loop through each <code>training_data_object</code> in the dataset and set a counter <code>num_tokens</code> to 0.</p>



<p>As this is ChatCompletion data, we know that the messages are stored in a list under the key &#8220;messages&#8221;. We loop through each message and then through each key-value pair in the message. (We use an <code>_</code> for the key because we don&#8217;t need it in this case, but we need to use it as a placeholder to unpack the tuple.)</p>



<p>We call <code>self.encoding.encode</code> to encode the value to a list of tokens, and then add the length of this list to <code>num_tokens</code>, as it&#8217;s only the <code>len</code> or length that we are interested in. For each key-value pair we then add the <code>extra_tokens_per_message</code> to account for the object structure as discussed, as this also takes up tokens.</p>



<p>After all the key-value pairs of every message in a <code>training_data_object</code> have been processed, we append the <code>num_tokens</code> to the <code>token_list</code> and then move on to the next <code>training_data_object</code> in the list.</p>
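<p>To make the counting arithmetic concrete, here is a sketch of the same loop as a standalone function. To keep it runnable without <code>tiktoken</code>, it uses a made-up <code>WhitespaceEncoding</code> class that simply counts words, so the numbers are not real token counts. The function name and the sample entry are also hypothetical:</p>

```python
class WhitespaceEncoding:
    """Hypothetical stand-in for a tiktoken encoding: one 'token' per word."""

    def encode(self, text: str) -> list[str]:
        return text.split()


def count_tokens_per_entry(data: list[dict], encoding, extra_tokens: int = 2) -> list[int]:
    token_list = []
    for entry in data:
        num_tokens = 0
        for message in entry["messages"]:
            for _, value in message.items():
                # Tokens in the value itself, plus the fixed overhead per key-value pair.
                num_tokens += len(encoding.encode(str(value)))
                num_tokens += extra_tokens
        token_list.append(num_tokens)
    return token_list


sample = [
    {
        "messages": [
            {"role": "user", "content": "Write a subject line"},
            {"role": "assistant", "content": "Hello from Finxter"},
        ]
    }
]

# user: (1 + 2) + (4 + 2) = 9, assistant: (1 + 2) + (3 + 2) = 8
print(count_tokens_per_entry(sample, WhitespaceEncoding()))  # [17]
```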



<p>Now let&#8217;s add a function to check if our formatting has any mistakes in it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__():
        ...

    def _load_data():
        ...

    def _calculate_token_amount_per_entry():
        ...

    def _check_single_entry_format(self, entry) -> bool:
        if not isinstance(entry, dict):
            return False

        if list(entry.keys()) != ["messages"]:
            return False

        messages = entry.get("messages", [])

        return all(
            isinstance(message, dict) and "role" in message and "content" in message
            for message in messages
        )</pre>



<p>This function returns <code>True</code> if the entry is in the correct format, and <code>False</code> if it&#8217;s not. It checks a single entry, or <code>training_data_object</code>, in our dataset at a time. First, it checks if the entry is a dictionary. After that, we call <code>keys()</code> on the entry to get the dictionary keys and call <code>list()</code> on it to convert it to a list. We then check if the list is equal to <code>["messages"]</code>, to make sure it has one and only one key, and that key is &#8220;messages&#8221;.</p>



<p>We then call the <code>get()</code> method on the entry to get the value of the &#8220;messages&#8221; key. Now the last line uses a generator expression and might look confusing if you&#8217;re not familiar with it, so let&#8217;s break it down step by step.</p>



<p>A generator expression is similar to a list comprehension, but it doesn&#8217;t store the list in memory. Instead, it generates each value on the fly as you iterate over it. This can be more memory-efficient than a list comprehension for large sequences, though it doesn&#8217;t matter much for our dataset size here. With the condition stripped out, the basic shape of the generator expression in the code is:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">(message for message in messages)</pre>



<p>This generates a sequence of message values, one for each message in messages.</p>



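<p>The <code>isinstance(message, dict) and "role" in message and "content" in message</code> part is a condition that checks whether each message is a dictionary and whether it contains the keys <code>role</code> and <code>content</code>. In the actual code, this condition takes the place of the first <code>message</code>, so the generator yields <code>True</code> or <code>False</code> for each message instead of the message itself.</p>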



<p>The <code>all()</code> function takes an iterable (in this case, the generator expression) and returns <code>True</code> if all elements of the iterable are truthy (i.e., they evaluate to <code>True</code>), and <code>False</code> if even a single entry is not <code>True</code>. So, in simple terms, we check whether all messages in the messages list are dictionaries that contain the keys <code>role</code> and <code>content</code>, and return either <code>True</code> or <code>False</code>.</p>
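<p>Here is the same check as a standalone function, run against a few made-up entries so you can see which shapes pass and which fail. The function and variable names are only for this sketch:</p>

```python
def check_single_entry_format(entry) -> bool:
    if not isinstance(entry, dict):
        return False

    if list(entry.keys()) != ["messages"]:
        return False

    messages = entry.get("messages", [])

    return all(
        isinstance(message, dict) and "role" in message and "content" in message
        for message in messages
    )


# Hypothetical entries for illustration.
valid_entry = {"messages": [{"role": "user", "content": "Hi"}]}
missing_content = {"messages": [{"role": "user"}]}
extra_key = {"messages": [], "id": 1}
empty_messages = {"messages": []}

print(check_single_entry_format(valid_entry))      # True
print(check_single_entry_format(missing_content))  # False
print(check_single_entry_format(extra_key))        # False
print(check_single_entry_format(empty_messages))   # True
```

<p>The last case is worth knowing about: <code>all()</code> returns <code>True</code> for an empty iterable, so an entry with an empty messages list passes this check.</p>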



<p>Now, let&#8217;s add a property to get the <code>token_list</code>, so we can easily access it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__():
        ...

    def _load_data():
        ...

    def _calculate_token_amount_per_entry():
        ...

    def _check_single_entry_format():
        ...

    @property
    def token_list(self) -> list[int]:
        if self._token_list is None:
            self._token_list = self._calculate_token_amount_per_entry()
        return self._token_list</pre>



<p>The <code>@property</code> decorator here means that we can access this particular method as a property, so using <code>self.token_list</code> instead of calling it as a method with <code>self.token_list()</code>. First, this will check if <code>self._token_list</code> is <code>None</code>, which it will be the first time we access it. If it is, it will call the <code>_calculate_token_amount_per_entry</code> method to calculate the token list and store it in the <code>self._token_list</code> attribute. Then it will return the <code>_token_list</code> attribute. If the property is accessed again, it will just return the <code>_token_list</code> attribute without recalculating it, as it&#8217;s already been calculated and stored.</p>



<p>Note that names with the <code>_</code> prefix are meant to be private, so <code>_token_list</code> is our implementation detail here, and the <code>token_list</code> property is the public interface to access it. This is a good practice because it ensures that <code>_token_list</code> is always in a valid state when it&#8217;s accessed, and it hides the details of how <code>_token_list</code> is implemented and managed from the rest of your program by providing <code>token_list</code> as an access point.</p>
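<p>The caching behavior is easy to verify with a stripped-down sketch. The <code>LazyTokens</code> class and its <code>calculations</code> counter are hypothetical and exist only to show that the expensive calculation runs once:</p>

```python
class LazyTokens:
    def __init__(self) -> None:
        self._token_list = None
        self.calculations = 0  # hypothetical counter, only here to show caching

    def _calculate(self) -> list[int]:
        self.calculations += 1
        return [10, 20, 30]

    @property
    def token_list(self) -> list[int]:
        # Calculate on first access, reuse the stored result afterwards.
        if self._token_list is None:
            self._token_list = self._calculate()
        return self._token_list


lazy = LazyTokens()
first = lazy.token_list   # triggers the calculation
second = lazy.token_list  # returns the cached list

print(first)
print(lazy.calculations)  # 1
```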



<p>Now let&#8217;s add a method to check if the dataset is valid:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__():
        ...

    def _load_data():
        ...

    def _calculate_token_amount_per_entry():
        ...

    def _check_single_entry_format():
        ...

    @property
    def token_list():
        ...

    def validate_data(self) -> bool:
        return all(self._check_single_entry_format(entry) for entry in self.data)</pre>



<p>This method will return <code>True</code> if all entries in the dataset are in the correct format, and <code>False</code> if any of them are not. It uses a generator expression in the same style as we did before. Note that it will stop checking as soon as it finds an entry that fails the <code>_check_single_entry_format</code> test, because <code>all</code> stops iterating as soon as it encounters a <code>False</code> value.</p>
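<p>If you want to see this short-circuiting in action, the following sketch records which entries actually get checked. The <code>is_valid</code> function and the sample entries are made up for the demonstration:</p>

```python
checked = []


def is_valid(entry: dict) -> bool:
    # Record every entry we look at so we can see where all() stops.
    checked.append(entry["id"])
    return entry["ok"]


entries = [
    {"id": 1, "ok": True},
    {"id": 2, "ok": False},
    {"id": 3, "ok": True},
]

result = all(is_valid(entry) for entry in entries)

print(result)   # False
print(checked)  # [1, 2]
```

<p>The third entry is never checked, because the second one already settled the outcome.</p>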



<p>Now let&#8217;s add a method to get the training cost in dollars:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__():
        ...

    def _load_data():
        ...

    def _calculate_token_amount_per_entry():
        ...

    def _check_single_entry_format():
        ...

    @property
    def token_list():
        ...

    def validate_data():
        ...

    def get_training_cost_in_dollars(self, epochs: int = 3) -> Decimal:
        total_tokens = sum(self.token_list)
        total_cost_dollars = (
            TRAINING_COST_PER_1000_TOKENS * total_tokens / 1000 * epochs
        )
        print(
            f"Total estimated cost: ~${total_cost_dollars:.3f} for training {epochs} epochs on {total_tokens} token dataset."
        )
        return total_cost_dollars</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Machine-learning Top-tip <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" />
Epochs are the number of times the model will go through the entire dataset during training. The more epochs, the more the model will learn and internalize our dataset. If the number is too low, it will not fully internalize our training data, but if the number is too high it will internalize our specific examples too much and lose its ability to generalize, a concept called overfitting. 3 Epochs is a good starting point for most fine-tuning tasks.</pre>



<p>This method will return the total cost in dollars for training the model for a given number of epochs as a <code>Decimal</code> type object. It uses the <code>sum</code> function to calculate the total number of tokens in the dataset and then does simple math to get the total cost in dollars. We print the total cost with a precision of 3 decimal places by using the <code>:.3f</code> format specifier in the f-string and then return the total cost.</p>
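<p>As a worked example of the formula, here is the same calculation outside the class. The per-entry token counts are invented for the sketch, and the price constant is the one we defined earlier:</p>

```python
from decimal import Decimal

TRAINING_COST_PER_1000_TOKENS = Decimal("0.0080")

# Hypothetical token counts for a tiny three-entry dataset.
token_list = [1200, 800, 1000]
epochs = 3

total_tokens = sum(token_list)
# 0.0080 dollars per 1000 tokens * 3000 tokens / 1000 * 3 epochs
total_cost = TRAINING_COST_PER_1000_TOKENS * total_tokens / 1000 * epochs

print(f"~${total_cost:.3f} for {epochs} epochs on {total_tokens} tokens")
# ~$0.072 for 3 epochs on 3000 tokens
```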



<p>One last method and we&#8217;ll be done, I promise! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f604.png" alt="😄" class="wp-smiley" style="height: 1em; max-height: 1em;" /> We want to be able to make sure the longest entry is not above our token limit:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__():
        ...

    def _load_data():
        ...

    def _calculate_token_amount_per_entry():
        ...

    def _check_single_entry_format():
        ...

    @property
    def token_list():
        ...

    def validate_data():
        ...

    def get_training_cost_in_dollars():
        ...

    def longest_entry_token_count(self) -> int:
        return max(self.token_list)</pre>



<p>We use the <code>max</code> function to get the maximum value from the <code>token_list</code> and return it. The token limit for a single training example, meaning each line in our JSONL file, is the same as the context limit of the ChatGPT model we&#8217;re using. For <code>gpt-3.5-turbo-1106</code>, the maximum context length is 16,385 tokens, so as long as the longest entry is below that, you know you&#8217;re safe.</p>
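<p>If the longest entry ever does exceed the limit, you will want to know which entries are the culprits. The following is an optional sketch of how that could look; the constant name and the token counts are assumptions for illustration and not part of the <code>Validator</code> class:</p>

```python
# Context length quoted above for gpt-3.5-turbo-1106; adjust for other models.
MAX_TOKENS_PER_EXAMPLE = 16_385

# Hypothetical token counts, with one entry that is too long.
token_list = [2441, 17000, 950]

# Collect (index, token_count) for every entry above the limit.
too_long = [
    (index, count)
    for index, count in enumerate(token_list)
    if count > MAX_TOKENS_PER_EXAMPLE
]

print(max(token_list))  # 17000
print(too_long)         # [(1, 17000)]
```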



<p>Here is the whole class again for reference:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class Validator:
    def __init__(self, jsonl_file: Path) -> None:
        self.data = self._load_data(jsonl_file)
        self._token_list = None
        self.encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

    def _load_data(self, jsonl_file: Path) -> list:
        with open(jsonl_file, "r", encoding="utf-8") as file:
            data = [json.loads(line) for line in file]
        return data

    def _calculate_token_amount_per_entry(self) -> list[int]:
        extra_tokens_per_message = 2
        token_list = []
        for training_data_object in self.data:
            num_tokens = 0
            for message in training_data_object["messages"]:
                for _, value in message.items():
                    num_tokens += len(self.encoding.encode(str(value)))
                    num_tokens += extra_tokens_per_message
            token_list.append(num_tokens)
        return token_list

    def _check_single_entry_format(self, entry) -> bool:
        if not isinstance(entry, dict):
            return False

        if list(entry.keys()) != ["messages"]:
            return False

        messages = entry.get("messages", [])

        return all(
            isinstance(message, dict) and "role" in message and "content" in message
            for message in messages
        )

    @property
    def token_list(self) -> list[int]:
        if self._token_list is None:
            self._token_list = self._calculate_token_amount_per_entry()
        return self._token_list

    def validate_data(self) -> bool:
        return all(self._check_single_entry_format(entry) for entry in self.data)

    def get_training_cost_in_dollars(self, epochs: int = 3) -> Decimal:
        total_tokens = sum(self.token_list)
        total_cost_dollars = (
            TRAINING_COST_PER_1000_TOKENS * total_tokens / 1000 * epochs
        )
        print(
            f"Total estimated cost: ~${total_cost_dollars:.3f} for training {epochs} epochs on {total_tokens} token dataset."
        )
        return total_cost_dollars

    def longest_entry_token_count(self) -> int:
        return max(self.token_list)</pre>



<h3 class="wp-block-heading">Using the Validator</h3>



<p>So give yourself a pat on the back for that <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f60e.png" alt="😎" class="wp-smiley" style="height: 1em; max-height: 1em;" />. Now let&#8217;s train us some ChrisGPT! Save and close this file, then create a new file in your root directory named <code>chris_gpt_dataset_validation.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_completed_dataset.json
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />data_validation.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />html_email.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />jsonl.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />constants.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_generator.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_validation.py          (new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>In <code>chris_gpt_dataset_validation.py</code>, add the following setup to get started:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from utils import data_validation, jsonl
from constants import DATA_DIRECTORY


JSON_FILE = DATA_DIRECTORY / "Finx_completed_dataset.json"
JSONL_FILE = DATA_DIRECTORY / "Finx_completed_dataset.jsonl"</pre>



<p>We import all the stuff we made and prepared ourselves, and then we define the paths to the existing JSON file and the JSONL file we want to create. Now let&#8217;s make some good use of all the hard work we&#8217;ve done so far:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">jsonl.json_to_jsonl(JSON_FILE, JSONL_FILE)  # Only run once

data_validator = data_validation.Validator(JSONL_FILE)

print(f"Data valid: {data_validator.validate_data()}")
data_validator.get_training_cost_in_dollars()
print(f"Longest entry: {data_validator.longest_entry_token_count()} tokens")</pre>



<p>We convert our JSON file to a JSONL file with the same name. The comment says &#8220;Only run once&#8221; because you can comment out this line after running the file the first time. Nothing bad will happen if you don&#8217;t, though; it just does some unneeded work to create the same file again.</p>



<p>Then we create a new instance of our <code>Validator</code> class and pass the path to the JSONL file as an argument. We call the <code>validate_data</code> method to check if the dataset is valid and print the result. We then call the <code>get_training_cost_in_dollars</code> method to get the estimated training cost, which will get printed to the console automatically, and finally, we call the <code>longest_entry_token_count</code> method to get the token count of the longest entry in the dataset so we can make sure we don&#8217;t exceed the token limit.</p>



<p>Let&#8217;s run the file we have so far just as an interim test. You should get an output in your terminal that looks something like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Data valid: True
Total estimated cost: ~$5.184 for training 3 epochs on 216000 token dataset.
Longest entry: 2441 tokens</pre>



<p>Your numbers will be slightly different from mine, as the data is partly LLM generated, but it will be very close to this. We can see our data is valid, we have over 200,000 tokens in total, and the longest entry is 2441 tokens, which is well below the 16,385 token limit for the <code>gpt-3.5-turbo-1106</code> model.</p>



<p>You&#8217;ll also notice that a JSONL file has been created in your data directory with the training data in JSONL format:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_completed_dataset.json
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_completed_dataset.jsonl <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_dataset.json
    ...</pre>



<p>Now you might be surprised by the cost here. While $5 is not a massive amount of money, it is a whole lot more than we typically spend on regular ChatGPT calls. This is why we took so much time over the data validation: to make sure we get the data right the first time, and to know the estimated cost before we commit to the training.</p>



<p>For those $5 you get something pretty damn cool though: your own custom ChatGPT <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f60e.png" alt="😎" class="wp-smiley" style="height: 1em; max-height: 1em;" />. That being said, I understand if you&#8217;re not willing to spend $5 on this simple test project. You can run with half the training data, which is 100 examples, or even a quarter, which is 50 examples. But your output will not be as good as mine if you do so.</p>



<h3 class="wp-block-heading">Limiting the dataset size</h3>



<p>Let&#8217;s make some small changes to the code so you can limit your dataset size if you want to:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import json

from constants import DATA_DIRECTORY
from utils import data_validation, jsonl


JSON_FILE = DATA_DIRECTORY / "Finx_completed_dataset.json"
JSONL_FILE = DATA_DIRECTORY / "Finx_completed_dataset.jsonl"
LIMIT = 100


with open(JSON_FILE, "r", encoding="utf-8") as in_file:
    data = json.load(in_file)
    jsonl.dicts_to_jsonl(JSONL_FILE, data[:LIMIT])

data_validator = data_validation.Validator(JSONL_FILE)

print(f"Data valid: {data_validator.validate_data()}")
data_validator.get_training_cost_in_dollars()
print(f"Longest entry: {data_validator.longest_entry_token_count()} tokens")</pre>



<p>We added an import for <code>json</code>, and we set a constant named <code>LIMIT</code>. We then simply manually load the data from the <code>JSON_FILE</code> and use the <code>dicts_to_jsonl</code> function instead of the <code>json_to_jsonl</code> function, passing in only the first <code>LIMIT</code> number of examples using a simple slice. Note how easy this is as we made the <code>jsonl</code> utility module out of pieces so we can simply use a different piece this time.</p>



<p>I&#8217;m going to set the <code>LIMIT</code> variable to <code>None</code> as I want to use the full 200 examples for mine. Choose whatever number you want to use for the <code>LIMIT</code>, and then run the file again. It will create the new <code>JSONL_FILE</code> with the limited number of examples, and then validate and tell you the new cost. Limiting to 100 examples will cost you around $2.55.</p>
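
<p>Setting <code>LIMIT</code> to <code>None</code> works because a slice with <code>None</code> as its end simply runs to the end of the list. A quick sketch with stand-in data (the <code>examples</code> list is made up for illustration and is not the real dataset):</p>

```python
# Stand-in for the list of examples loaded from the JSON file
examples = [{"id": i} for i in range(200)]

for limit in (None, 100, 50):
    subset = examples[:limit]  # [:None] keeps every item
    print(limit, len(subset))
# None 200
# 100 100
# 50 50
```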



<p>Now that we know the cost, and we know our data is valid, we can move on to the next part where we&#8217;ll actually train our model on the JSONL data. I&#8217;ll see you there! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Tip</strong>: This is a full-text tutorial on how to fine-tune ChatGPT using the OpenAI API with code! For a video guide-through, check out our premium course &#8220;<a href="https://academy.finxter.com/university/fine-tuning-openai-course/">Fine-Tuning OpenAI: How to Create and Scale Your Digital Self (Example: ChrisGPT)</a>&#8220;</p>



<h2 class="wp-block-heading">Part 4: Training and Running ChrisGPT</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="604" height="871" src="https://blog.finxter.com/wp-content/uploads/2024/03/image-44.png" alt="" class="wp-image-1669906" srcset="https://blog.finxter.com/wp-content/uploads/2024/03/image-44.png 604w, https://blog.finxter.com/wp-content/uploads/2024/03/image-44-208x300.png 208w" sizes="auto, (max-width: 604px) 100vw, 604px" /></figure>
</div>


<p>Hi and welcome back to part 4, where we&#8217;ll be training and running ChrisGPT. In this part, we&#8217;ll finally be using the OpenAI fine-tuning API endpoints, which are fairly simple!</p>



<p>There are two ways to use the fine-tuning API, both of which are very simple. The first way to do this is programmatically, using Python code just like we do when sending normal calls to ChatGPT. We&#8217;ll be looking at this first. The second way is to use the web interface for the fine-tuning API.</p>



<h3 class="wp-block-heading">Using the fine-tuning API programmatically</h3>



<p>Go ahead and create a new file called <code>chris_gpt_training.py</code> in the root directory of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        ...
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        ...
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_generator.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_validation.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_training.py <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />constants.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>So let&#8217;s start with our imports and basic setup:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from constants import CLIENT, DATA_DIRECTORY


JSONL_FILE = DATA_DIRECTORY / "Finx_completed_dataset.jsonl"
MODEL = "gpt-3.5-turbo-1106"
SUFFIX = "chris_gpt"</pre>



<p>We import the OpenAI client we stored in <code>CLIENT</code>, along with <code>DATA_DIRECTORY</code>. Then we set up a path to the JSONL training data (make sure you don&#8217;t accidentally use the <code>.json</code> file instead of the <code>.jsonl</code> one). We also set the model to the <code>1106</code> version, as at the time of writing this is the newest one that supports fine-tuning. My testing showed <code>1106</code> to be significantly better for fine-tuning than the older <code>0613</code> version.</p>



<p>Finally, the <code>SUFFIX</code> part will allow us to choose a part of the fine-tuned model&#8217;s name ourselves. This <code>suffix</code> will become part of the name you use to call your model, which is quite useful for identification, as the model names are a bit long and all extremely similar if you don&#8217;t have a suffix. An example:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Example model name without suffix
ft:gpt-3.5-turbo-1106:personal::8ot8ZLJR

# Example model name with suffix
ft:gpt-3.5-turbo-1106:personal:chris-gpt:8ot8ZLJR</pre>
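
<p>Judging purely from the two example names above, a fine-tuned model name appears to consist of five colon-separated parts. The following is a small sketch that splits a name into those parts. The helper <code>parse_fine_tuned_name</code> and its field names are hypothetical, and the layout is an assumption based on these examples rather than a documented guarantee:</p>

```python
def parse_fine_tuned_name(name: str) -> dict:
    """Split a name like 'ft:<base>:<org>:<suffix>:<id>' into its parts."""
    prefix, base_model, organization, suffix, unique_id = name.split(":")
    return {
        "prefix": prefix,
        "base_model": base_model,
        "organization": organization,
        "suffix": suffix or None,  # empty string when no suffix was given
        "unique_id": unique_id,
    }


with_suffix = parse_fine_tuned_name("ft:gpt-3.5-turbo-1106:personal:chris-gpt:8ot8ZLJR")
without_suffix = parse_fine_tuned_name("ft:gpt-3.5-turbo-1106:personal::8ot8ZLJR")
print(with_suffix["suffix"])     # chris-gpt
print(without_suffix["suffix"])  # None
```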



<p>Now let&#8217;s have a look at the file-related methods in the <code>openai</code> client. Make sure you don&#8217;t run the file yet:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># File related methods
file = CLIENT.files.create(file=open(JSONL_FILE, "rb"), purpose="fine-tune")

CLIENT.files.list(purpose="fine-tune")

CLIENT.files.delete(file.id)</pre>



<p>These methods are all fairly self-explanatory. The <code>create</code> method will upload your file. Make sure to open the file in <code>rb</code> (read-binary) mode and set the purpose to <code>fine-tune</code> so that the OpenAI servers know what this file is for. The call returns an object that we catch in the <code>file</code> variable. It looks like this and contains some basic file data, most importantly the <code>id</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Example of a file object

FileObject(
    id="file-DamWAnhgpnRvEyMZ3dOdHpvG",
    bytes=865053,
    created_at=1708303339,
    filename="Finx_completed_dataset.jsonl",
    object="file",
    purpose="fine-tune",
    status="processed",
    status_details=None,
)</pre>



<p>The <code>list</code> method will list all files and lets us filter on a purpose, so we&#8217;ll filter for files with the purpose of <code>fine-tune</code> here. It returns a list of <code>FileObject</code> objects like the one you see above. The <code>delete</code> method will delete the file whose <code>id</code> you pass in.</p>
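
<p>If you want to reuse these calls, they can be wrapped in small helpers. This is a sketch, not part of the tutorial&#8217;s code: the function names <code>upload_training_file</code> and <code>find_files_by_name</code> are hypothetical, and the client is passed in as an argument so nothing runs against the API until you call them yourself. The upload helper uses a <code>with</code> block so the file handle is closed after the upload:</p>

```python
from pathlib import Path


def upload_training_file(client, path: Path):
    """Upload a JSONL file for fine-tuning and close the file handle afterwards."""
    with open(path, "rb") as training_file:
        return client.files.create(file=training_file, purpose="fine-tune")


def find_files_by_name(client, filename: str) -> list:
    """Return the uploaded fine-tune files whose filename matches."""
    return [
        uploaded
        for uploaded in client.files.list(purpose="fine-tune")
        if uploaded.filename == filename
    ]


# Hypothetical usage with the tutorial's constants (this does call the API):
# file = upload_training_file(CLIENT, JSONL_FILE)
# print(find_files_by_name(CLIENT, "Finx_completed_dataset.jsonl"))
```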



<h3 class="wp-block-heading">Fine-tuning methods</h3>



<p>Now let&#8217;s take a look at the fine-tuning-job related methods. Again, don&#8217;t run this file yet, let&#8217;s just have a quick look first:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Fine-tuning-job related methods
fine_tuning_job = CLIENT.fine_tuning.jobs.create(
    model=MODEL,
    training_file=file.id,
    hyperparameters={"n_epochs": 3},
    suffix=SUFFIX,
)

CLIENT.fine_tuning.jobs.list()

CLIENT.fine_tuning.jobs.retrieve(fine_tuning_job.id)

CLIENT.fine_tuning.jobs.cancel(fine_tuning_job.id)</pre>



<p>The most important method is of course the <code>create</code> method, which will create a fine-tuning job. You pass in the <code>model</code> and the <code>file.id</code> of the file you want to do the training on. You can optionally pass in <code>hyperparameters</code>, such as the number of epochs (<code>n_epochs</code>) we discussed earlier, and the <code>suffix</code> we talked about. This method returns a <code>FineTuningJob</code> object, which looks like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">FineTuningJob(
    id="ftjob-1OATxnQAgdY4yjPNmSBai95f",
    created_at=1708318423,
    error=Error(code=None, message=None, param=None, error=None),
    fine_tuned_model=None,
    finished_at=None,
    hyperparameters=Hyperparameters(
        n_epochs=3, batch_size="auto", learning_rate_multiplier="auto"
    ),
    model="gpt-3.5-turbo-1106",
    object="fine_tuning.job",
    organization_id="org-oMYMXpp7Cr9pG1rG5Z8a1T2w",
    result_files=[],
    status="validating_files",
    trained_tokens=None,
    training_file="file-EX13iLyISBZcreRCH3Fm51Pn",
    validation_file=None,
)</pre>



<p>We can see that the <code>FineTuningJob</code> object also has an ID that we can use to refer to it, and some basic info, such as the fact that this one has not finished yet. We can see the <code>hyperparameters</code>, including the <code>batch_size</code> and <code>learning_rate_multiplier</code> which are set to <code>auto</code> by default.</p>



<p>The <code>batch_size</code> is the number of examples in each training batch. Batch size is the amount of data the model looks at before it learns something new. It&#8217;s like reading a few pages of a book, and then stopping to think about what you&#8217;ve read before continuing. At the end of the batch, the predictions are compared to the desired output, and the error is calculated and used to update the model. We&#8217;ll just leave this set to <code>auto</code>.</p>
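
<p>To get a feel for what the batch size means in practice, here is a rough sketch that counts how many weight updates a training run performs. The helper <code>count_weight_updates</code> and the batch sizes used are purely illustrative, since the value OpenAI picks for <code>auto</code> is not something we can know in advance:</p>

```python
import math


def count_weight_updates(n_examples: int, batch_size: int, n_epochs: int = 3) -> int:
    """One update per batch; the last batch of an epoch may be smaller."""
    batches_per_epoch = math.ceil(n_examples / batch_size)
    return batches_per_epoch * n_epochs


print(count_weight_updates(200, batch_size=1))  # 600
print(count_weight_updates(200, batch_size=8))  # 75
```

<p>A larger batch means fewer but better-averaged updates, while a smaller batch means more frequent but noisier ones.</p>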



<p>The <code>learning_rate_multiplier</code> is a value that multiplies the learning rate of the model. The learning rate is a hyperparameter that controls how much to change the model in response to the estimated error each time the model weights are updated. If this is set too high, the model may very easily overfit to the specific data it has been trained on. We&#8217;ll also leave this set to <code>auto</code>.</p>
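
<p>A toy illustration of why the size of each update matters: the sketch below runs plain gradient descent on a single weight with the loss <code>(w - 3)²</code>. This is a deliberately tiny stand-in and says nothing about how OpenAI trains the actual model, but it shows how a modest learning rate settles near the target while a too-large one overshoots further on every step:</p>

```python
def gradient_descent(learning_rate: float, steps: int = 20, start: float = 0.0) -> float:
    """Minimise (w - 3)^2 and return the final weight."""
    w = start
    for _ in range(steps):
        gradient = 2 * (w - 3.0)  # derivative of (w - 3)^2
        w -= learning_rate * gradient
    return w


careful = gradient_descent(learning_rate=0.1)
reckless = gradient_descent(learning_rate=1.1)
print(f"lr=0.1 ends at {careful:.3f}")   # close to the target of 3
print(f"lr=1.1 ends at {reckless:.3f}")  # far away from 3, it diverged
```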



<p>The <code>list</code> method, again, simply returns a list of all the <code>FineTuningJob</code> objects for your account. The <code>retrieve</code> method will return a single <code>FineTuningJob</code> object by its <code>id</code> and the <code>cancel</code> method will cancel a fine-tuning job by its <code>id</code>. These methods are extremely simple and there really is nothing more to them.</p>



<p>Again, don&#8217;t run this file yet. The last method we have is for deleting a completed fine-tuned model:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Fine-tuned-model related methods
CLIENT.models.delete("model_id_here")</pre>



<p>For this, you need to have a fully trained model to delete obviously, and then just pass in the ID. Before we actually go ahead and run this file, I&#8217;ll comment out several of the methods to leave them in for our reference, as we naturally don&#8217;t want to delete the file we just uploaded nor cancel the fine-tuning-job:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from constants import CLIENT, DATA_DIRECTORY


JSONL_FILE = DATA_DIRECTORY / "Finx_completed_dataset.jsonl"
MODEL = "gpt-3.5-turbo-1106"
SUFFIX = "chris_gpt"


# File related methods
file = CLIENT.files.create(file=open(JSONL_FILE, "rb"), purpose="fine-tune")

print(CLIENT.files.list(purpose="fine-tune"))

# CLIENT.files.delete(file.id)


# Fine-tuning-job related methods
fine_tuning_job = CLIENT.fine_tuning.jobs.create(
    model=MODEL,
    training_file=file.id,
    hyperparameters={"n_epochs": 3},
    suffix=SUFFIX,
)

# CLIENT.fine_tuning.jobs.list()

print(CLIENT.fine_tuning.jobs.retrieve(fine_tuning_job.id))

# CLIENT.fine_tuning.jobs.cancel(fine_tuning_job.id)


# Fine-tuned-model related methods
# CLIENT.models.delete("model_id_here")</pre>



<p>I&#8217;ve gone ahead and added <code>print</code> statements around the <code>files.list</code> and the <code>fine_tuning.jobs.retrieve</code> calls so we can see the output in our terminal. Now go ahead and run this file to start your fine-tuning job!</p>



<p>You should see the <code>FineTuningJob</code> object printed to the console. From here on we&#8217;ll be switching to the web interface for the fine-tuning API, as it&#8217;s much easier to work with and has real-time progress.</p>
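
<p>If you would rather stay in the terminal, the <code>retrieve</code> method can be polled until the job finishes. A minimal sketch, assuming the job ends in one of the statuses <code>succeeded</code>, <code>failed</code>, or <code>cancelled</code>; the helper name <code>wait_for_job</code> and the 30-second interval are my own choices and not part of the tutorial&#8217;s code:</p>

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id: str, poll_seconds: float = 30.0):
    """Poll a fine-tuning job until it reaches a terminal status."""
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)
        print(f"Job {job.id}: {job.status}")
        if job.status in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)


# Hypothetical usage with the tutorial's objects (this does call the API):
# finished_job = wait_for_job(CLIENT, fine_tuning_job.id)
# print(finished_job.fine_tuned_model)
```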



<h3 class="wp-block-heading">Using the web interface for the fine-tuning API</h3>



<p>As developers we sometimes think we need to do everything programmatically, but there really is no need if we have an easy web interface that is much simpler to use. There really is no point in coding up some complex programmatic solution for something you&#8217;re only going to be doing once in a while.</p>



<p>If you have a look at https://platform.openai.com/files, you will see the web interface for managing the files you uploaded to OpenAI:</p>



<p>File API:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/03/4-1-1024x659.png" alt="" class="wp-image-4283"/></figure>



<p>This interface is much nicer to read than the long file object list in your terminal window, and you can also upload new files here directly.</p>



<p>Now let&#8217;s switch to the fine-tuning tab at https://platform.openai.com/finetune, to see the fine-tuning job that we have started:</p>



<p>Fine-tuning jobs API:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/03/4-2-1024x659.png" alt="" class="wp-image-4284"/></figure>



<p>As you can see I&#8217;ve done quite some testing so I have a whole load of models here, but you should see your model either &#8220;Validating files…&#8221; or maybe already in the training stage. You can also create new fine-tuning jobs on this page or cancel a running fine-tuning job using the cancel button. We can also see a lot of details on our fine-tuning job as it progresses:</p>



<p>Fine-tuning details:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/03/4-3-924x1024.png" alt="" class="wp-image-4285"/></figure>



<p>Please ignore all the &#8220;Cancelled&#8221; fine-tunes in my list, I tend to double-check the code when writing these things so I triggered a lot of fine-tunes that were not needed and therefore canceled them straight after. The important thing here is first of all, when the model is done training, you will see the full model name for your fine-tune. In this case, I have <code>ft:gpt-3.5-turbo-1106:personal:chris-gpt-full:8ot8ZLJR</code> as my model name here, but yours is probably not done training yet.</p>



<p>It can take anywhere from a couple of minutes to potentially even hours for OpenAI to train your model, depending on the size of the data and how busy the servers are. Mine typically finished within 10 to 30 minutes, and OpenAI will send you an email when the model training is done. In the meantime, you can follow the progress in real time in the right-side panel, where you can watch the training loss go down. You can also see the messages at the bottom keeping you posted on the progress.</p>



<p>So go ahead, it&#8217;s time again for your well-deserved coffee break<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2615.png" alt="☕" class="wp-smiley" style="height: 1em; max-height: 1em;" /> with your favorite snack<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f968.png" alt="🥨" class="wp-smiley" style="height: 1em; max-height: 1em;" />. Have a small break and give your model time to train, and I&#8217;ll see you back here when it&#8217;s done!</p>



<h3 class="wp-block-heading">Running our own fine-tuned ChrisGPT model!</h3>



<p>Ok, so I assume you&#8217;re back and your model is done training, which means you now have the name of your personal fine-tuned model from the https://platform.openai.com/finetune fine-tuning page. The name will also be in the email you receive when the training is done. First, go and open the <code>constants.py</code> file in your root directory, as we want to make some updates to the file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from pathlib import Path

from decouple import config
from openai import OpenAI


CLIENT = OpenAI(api_key=str(config("OPENAI_API_KEY")))
DATA_DIRECTORY = Path(__file__).parent / "data"
## Add the below variables
OUTPUT_DIRECTORY = Path(__file__).parent / "output"
CHRIS_GPT = "ft:gpt-3.5-turbo-1106:personal:chris-gpt-full:8ot8ZLJR"</pre>



<p>We added an output directory for our generated files, and I stored the name for my ChrisGPT model in the <code>CHRIS_GPT</code> variable. Make sure you replace the <code>CHRIS_GPT</code> string with the name of your own model and do not copy mine, as you will not be able to access my personal model. Make sure you save and close this file.</p>



<p>Now create the empty folder for the <code>output</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        ...
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />
        (empty)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        ...
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_generator.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_validation.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_training.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />constants.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>
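
<p>If you prefer not to create the folder by hand, <code>pathlib</code> can do it for you. A small sketch with a hypothetical helper <code>ensure_directory</code>, demonstrated on a temporary directory so it does not touch your project:</p>

```python
import tempfile
from pathlib import Path


def ensure_directory(path: Path) -> Path:
    """Create the directory (and any missing parents) if it does not exist yet."""
    path.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as temp_root:
    output_dir = ensure_directory(Path(temp_root) / "output")
    created = output_dir.is_dir()
    ensure_directory(output_dir)  # calling it again is harmless

print(created)  # True
```

<p>In the project itself you could call <code>ensure_directory(OUTPUT_DIRECTORY)</code> once before writing any files.</p>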



<p>And then create a new file in your root directory called <code>chris_gpt.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />Finx_Fine_Tuning
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />data
        ...
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output
        (empty)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        ...
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_generator.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_dataset_validation.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt_training.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chris_gpt.py <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />constants.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Pipfile.lock</pre>



<p>Now let&#8217;s start with our imports for <code>chris_gpt.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from constants import CLIENT, OUTPUT_DIRECTORY, CHRIS_GPT
import time</pre>



<p>We import the <code>CLIENT</code>, <code>OUTPUT_DIRECTORY</code>, and <code>CHRIS_GPT</code> model-name from the <code>constants.py</code> file, and we also import the <code>time</code> module so we can easily give our output files unique names.</p>



<p>Now I&#8217;m going to simply declare a string variable that will contain the contents I want our email to have:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">leaf_blower = """
Introduction to the AI-powered leaf blower and its innovative features in the realm of yard maintenance equipment.
Description of how the AI technology enhances the efficiency and performance of the leaf blower compared to traditional models.
Overview of the specific AI algorithms and sensors integrated into the leaf blower for optimized leaf collection and debris management.
Real-world application scenarios demonstrating the effectiveness of the AI-powered leaf blower in various landscaping and gardening tasks.
Discussion on the environmental benefits of using the AI-powered leaf blower, such as reduced noise pollution and energy consumption.
Insights into the future development and potential advancements in AI-powered yard maintenance equipment, including further automation and integration with smart home systems.
"""</pre>



<p>Either copy this from the written version of the tutorial or come up with your own topics that you want Chris to talk about. You can also ask ChatGPT to generate bullet points on a topic for you if you want.</p>



<p>Now we&#8217;ll just define a simple <code>chris_gpt</code> function that will take a string of topics and then call our custom model to ask Robo<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f916.png" alt="🤖" class="wp-smiley" style="height: 1em; max-height: 1em;" />Chris<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f9d1-1f3fc.png" alt="🧑🏼" class="wp-smiley" style="height: 1em; max-height: 1em;" /> to write an email about them:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def chris_gpt(topics: str) -> str:
    response = CLIENT.chat.completions.create(
        model=CHRIS_GPT,
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant that writes emails for the Finxter email newsletter, adhering perfectly to the style and tone of the Finxter brand and Chris' writing style. You will respond in Simple text format. Don't insert any newline characters and such but use an actual newline. Make sure that the subject makes sense in regards to the content of the email. Keep the email CONCISE AND TO THE POINT, and STAY ON TOPIC. Do not repeat yourself. Don't forget to add Chris' signature emoticons. Also don't make up nonsense terms that do not exist, and make sure you ALWAYS USE CORRECT SPELLING! The user will inform you about the topics of the email:",
            },
            {"role": "user", "content": topics},
        ],
    )

    return (
        response.choices[0].message.content
        or "There was an error with the response. Please try again."
    )</pre>



<p>Note that we edited the prompt a bit from the training data to emphasize conciseness and staying on-topic. Turns out Chris-GPT really goes wild if left unchecked, really talking about anything and everything. It&#8217;s very funny actually. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f602.png" alt="😂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> I&#8217;ve also decided to ask it for simple text output for now. While we have baked much of the behavior into the model, we can still steer it.</p>



<p>Now let&#8217;s finish off the code. Add the following below and outside the <code>chris_gpt</code> function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">current_unix_time = int(time.time())

filename = f"chris_gpt_output_{current_unix_time}.txt"

with open(OUTPUT_DIRECTORY / filename, "w", encoding="utf-8") as file:
    file.write(chris_gpt(leaf_blower))</pre>



<p>First, we get the Unix time in seconds, which is a unique number that will be different every second and simply refers to the number of seconds that have passed since 1970. This makes sure that files don&#8217;t overwrite each other as long as we don&#8217;t generate multiple files in the same second.</p>



<p>We then use this to create a filename for our output file. We open the file in write mode and write the output of our <code>chris_gpt</code> function to the file, using the <code>utf-8</code> encoding to make sure we don&#8217;t blow up the code when emoticons are used.</p>
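
<p>If you ever do generate several files within the same second, the timestamp alone is no longer unique. One possible way around that is sketched below with a hypothetical helper <code>unique_output_path</code>, which appends a counter whenever the filename is already taken:</p>

```python
import time
from pathlib import Path


def unique_output_path(directory: Path, prefix: str = "chris_gpt_output") -> Path:
    """Build a timestamped .txt path that does not clash with an existing file."""
    timestamp = int(time.time())
    candidate = directory / f"{prefix}_{timestamp}.txt"
    counter = 1
    while candidate.exists():
        # Same second as an earlier file: add a numeric suffix
        candidate = directory / f"{prefix}_{timestamp}_{counter}.txt"
        counter += 1
    return candidate


# Hypothetical usage in chris_gpt.py:
# with open(unique_output_path(OUTPUT_DIRECTORY), "w", encoding="utf-8") as file:
#     file.write(chris_gpt(leaf_blower))
```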



<h3 class="wp-block-heading">Let&#8217;s put <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f916.png" alt="🤖" class="wp-smiley" style="height: 1em; max-height: 1em;" />Robo<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f9d1-1f3fc.png" alt="🧑🏼" class="wp-smiley" style="height: 1em; max-height: 1em;" />Chris to work!</h3>



<p>Go ahead and run the file, and your output will appear in the <code>output</code> folder. You can open the file and see the email Chris wrote for you. Here is a random example of what I got:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Hey {User}! This is one of the most unique products I've seen in a long time.

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f342.png" alt="🍂" class="wp-smiley" style="height: 1em; max-height: 1em;" />AI Leaf Blower: A Must-Have Garden Tech for 2022?<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f341.png" alt="🍁" class="wp-smiley" style="height: 1em; max-height: 1em;" />

I found it on Twitter, 500,000 views in the last 24 hours! It's the next level in leaf blowers. The AI can identify holes, sticks, mud, and leaves.

Perception and decision-making are impressive: It assesses the weight and size of sticks, identifying problematic areas not only by visual information but also friction.

For example, if you collide with something hard, it'll learn from this feedback to avoid those spots in the future.

It also listens to the sound it makes on areas with a lot of leaves compared to an area with just a few leaves in order to gain a sensory perception of the leaf bed to optimize collection and airflow.

Overall a great machine for both pickup and distribution!

It's easy to see more products like this coming out and we're only at the beginning of AI integration in everyday life, first visually (showing the gardens), then providing communication and sensor data based on subscriber inputs.

A systems engineer with 5-10 years of experience could put this together quite easily so you won't see prohibitive costs anymore. This is a massive trend!

Check out the one-minute Twitter AI-Blower-Cheetah in action: <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f447.png" alt="👇" class="wp-smiley" style="height: 1em; max-height: 1em;" />
<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/25b6.png" alt="▶" class="wp-smiley" style="height: 1em; max-height: 1em;" /> AI-Powered Leaf Blower

With cool new tech disrupting every industry, let's be on the right side of change! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f607.png" alt="😇" class="wp-smiley" style="height: 1em; max-height: 1em;" />
Chris <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f33f.png" alt="🌿" class="wp-smiley" style="height: 1em; max-height: 1em;" />

PS: This is the future! Any dirt you throw at it with a shovel, it'll pick up in seconds! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f342.png" alt="🍂" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f341.png" alt="🍁" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/26a1.png" alt="⚡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Scroll up to see the video demonstration if you missed it.</pre>



<p>That is fascinating, right!? My mind is blown. If you compare this with the <code>leaf_blower</code> prompt we fed it, you can see that the original prompt is totally generic, yet this email reads convincingly like a Finxter email written by Chris! It&#8217;s sort of scary almost, but also very cool. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f92f.png" alt="🤯" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p>While this is not perfect, and you definitely cannot send these emails without some editing and checking, this really is a talking ChrisGPT now. It is admittedly a bit wild, but that&#8217;s because of our training data, which was a bit messy due to the formatting and the shorter, much more attention-grabbing style of speech one uses in emails. We deliberately created a kind of idea generator here, intended to be used by the creator themselves.</p>



<p>While this is perhaps not a useful commercial application in itself, it clearly demonstrates the power and potential of fine-tuning. It&#8217;s also a lot of fun to play around with!</p>



<p>I think you can easily see that this type of fine-tuning can absorb your business&#8217;s communication tone, as long as you feed it clean, professional (if somewhat more boring) data, as opposed to short promotional emails like ours, which we didn&#8217;t clean up perfectly.</p>



<h3 class="wp-block-heading">Let&#8217;s have some fun <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f916.png" alt="🤖" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f9d1-1f3fc.png" alt="🧑🏼" class="wp-smiley" style="height: 1em; max-height: 1em;" /></h3>



<p>I&#8217;ll show you a couple of my other test inputs/outputs:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">super_robot = """
new AI-powered robot:
All-in-one household chore solution
Capable of cleaning, laundry, dishwashing, cooking, and car washing
Utilizes advanced AI and robotics technology
Customizable chore scheduling options
Maximizes efficiency and productivity
Seamlessly integrates with smart home systems
"""</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Hey {User},

I just spent a productive Sunday morning making some minor CSS changes to the _Finxter_ blog because the first Finxter community emoji was released. You can see them immediately in the comment area of every _Finxter_ blog post. Let's continue with some Finxter industry updates.

The coming years will be exciting for roboticists and us consumers. Many robotic companies are headed for an IPO. If Tesla and Bitcoin 2.0&#x20e3; (i.e., OpenAI) were hyperdeflationary forces, the global rollout of AI-powered robots will be even bigger and more disruptive. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f92f.png" alt="🤯" class="wp-smiley" style="height: 1em; max-height: 1em;" />

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f916.png" alt="🤖" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f3a8.png" alt="🎨" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a6.png" alt="💦" class="wp-smiley" style="height: 1em; max-height: 1em;" /> First, I've ordered a house-cleaning robot from Altan. I'll receive it next month in February.

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f6a8.png" alt="🚨" class="wp-smiley" style="height: 1em; max-height: 1em;" /> _Disclosure: As an early Finxter of only [5] Bitcoin_ <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f6a8.png" alt="🚨" class="wp-smiley" style="height: 1em; max-height: 1em;" />

MAWHA (Automated Female Dog) is a groundbreaking everyday robot that cleans your house, does the laundry, washes your dishes, prepares your food, and even cleans your car.

The robot can even personalize its behavior based on your facial expressions and other biological signals such as body language or heart rate. No other robot does this. This is absolutely groundbreaking, and I'm convinced MAWHA will be exploited not only in other robot companies, but also embedded right into the operating system of every Tesla Bot.

Check it out: www.youtube.com/watch?v=O6fGGdUnyjo

I'll keep you posted on this. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f916.png" alt="🤖" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4aa.png" alt="💪" class="wp-smiley" style="height: 1em; max-height: 1em;" />

Let me know if your personal bot shall get in touch with my personal bot. Okay, just kidding. But mindblowing developments are right around the corner.

Happy coding!

Chris <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2665.png" alt="♥" class="wp-smiley" style="height: 1em; max-height: 1em;" />
<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f9d1-200d-1f4bb.png" alt="🧑‍💻" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f468-200d-1f4bb.png" alt="👨‍💻" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a6.png" alt="💦" class="wp-smiley" style="height: 1em; max-height: 1em;" /></pre>



<p>Now, there is some nonsense in here, but it&#8217;s still pretty good! I have not edited these at all; they are 100% direct ChrisGPT output. Of course, the content is made up, as we are asking it to write about fictional products here.</p>



<p>I&#8217;ll ask ChrisGPT to write about something factual:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sharks = """
Briefly introduce the topic of great white sharks and why you're discussing them.
Describe the appearance and size of great white sharks, highlighting their distinctive features.
Discuss where great white sharks are commonly found and their preferred habitats, such as coastal waters and oceanic regions.
Diet and Feeding Behavior: Explain what great white sharks eat and how they hunt, including their role as apex predators in marine ecosystems.
Provide information about the conservation status of great white sharks, including any threats they face and conservation efforts to protect them.
Discuss human interactions with great white sharks, including encounters in the wild, conservation initiatives, and safety measures for beachgoers and divers.
"""</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Hey {User},

Let's talk about great white sharks today. Nobody knows sharks, which create an ocean environment that is anything but monolithic. Maddening creatures indeed, bringing the oceans alive in a thoughtful way.

Anyway, enjoy the following educational snippets! Feel free to send this quick tutorial to other ocean nerds out there:

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f988.png" alt="🦈" class="wp-smiley" style="height: 1em; max-height: 1em;" /> **Great White Sharks** 101

Appearance: The great white shark is one of the most iconic species. Named for its gray coloration and white underside, it has a torpedo-shaped body and impressive size, with an average length of 13-16 feet; 20 feet is common, and some sharks grow even larger.

Distribution: They are found mostly in cool coastal waters of all major oceans, often close to shore, but also travel hundreds of miles into open oceanic regions. Juveniles tend to stay closer to shore, while the adults roam through deeper waters.

Diet and Feeding Behavior: Being top predators for marine ecosystems, great white sharks hunt marine mammals such as seals, sea lions, and small toothed whales. Fast and powerful, they can launch themselves out of the water to catch prey, often attacking from below.

Conservation Status: As an apex species, the health of great white shark populations provides an indication of the health of the overall marine ecosystem. Pressure from overfishing, bycatch, pollution, and loss of prey have taken their toll, and great whites are now considered vulnerable.

Human Interactions: Great white sharks are frequently studied, with many researchers interested in conserving and understanding these animals better. More and more programs have been established to protect the sharks; providing a balanced view on whom we should really fear.

While the chance that you run into a great white is luckily low, I roughly calculated your lifetime risk of dying from any shark attack to be in the same ball park as your lifetime win from a shark lottery of 1 in 1,000. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f923.png" alt="🤣" class="wp-smiley" style="height: 1em; max-height: 1em;" />

We'll dive into the 2023 edition of Shark Diving technology once available.

Enjoy your breaks today, and keep hitting the books!

Best, Chris. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f98a.png" alt="🦊" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2665.png" alt="♥" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f517.png" alt="🔗" class="wp-smiley" style="height: 1em; max-height: 1em;" /></pre>



<p>As you can see, it does a pretty good job writing about real topics as well. It needs some minor edits, but the above is 100% unedited raw output. Even when talking about a real topic, it still uses the fun emojis and that Chris-specific style of speech that makes it so much more engaging to read. It&#8217;s things like &#8220;Feel free to send this quick tutorial to other ocean nerds out there:&#8221; that really make me chuckle and make it feel like Chris is really writing this.</p>



<h3 class="wp-block-heading">Be careful what you train the model to do</h3>



<p>Notice that ChrisGPT is sometimes a bit wild and makes stuff up. I left it like that on purpose, because the reason it behaves this way illustrates an important point. We trained ChatGPT on data that had only a few bullet points in the user&#8217;s request but way more information in the response.</p>



<p>What this means is that we trained ChrisGPT, over and over, to come up with stuff that was not present in the second item (the one representing the user query), because every example we fed it showed a response that had more information than the user request had.</p>



<p>This taught ChrisGPT to include more information that was not present in the original user request because that&#8217;s what all the examples were doing. Now, in our case, that&#8217;s what we wanted, so that is absolutely fine. But when designing your training data be very careful and aware of exactly what you are teaching the model to do. Your input will equal your output here.</p>
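


<p>One rough way to see what your training data is teaching is to compare how much text the assistant produces relative to the user request. The sketch below is only an illustration: the function name <code>expansion_ratio</code> and the sample messages are made up, and it assumes each training example is a dictionary with a <code>messages</code> list of <code>role</code>/<code>content</code> entries, like one line of a chat fine-tuning JSONL file.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def expansion_ratio(example):
    """Words in the assistant reply divided by words in the user request."""
    user_words = 0
    assistant_words = 0
    for message in example["messages"]:
        word_count = len(message["content"].split())
        if message["role"] == "user":
            user_words += word_count
        elif message["role"] == "assistant":
            assistant_words += word_count
    # max(..., 1) avoids dividing by zero for an empty user message.
    return assistant_words / max(user_words, 1)


# Made-up example, shaped like one line of a chat fine-tuning JSONL file.
example = {
    "messages": [
        {"role": "system", "content": "You write emails in the style of Chris."},
        {"role": "user", "content": "AI leaf blower, identifies sticks, learns from collisions"},
        {
            "role": "assistant",
            "content": (
                "Hey {User}! This is one of the most unique products I have seen "
                "in a long time. It is the next level in leaf blowers and it "
                "keeps learning from every collision."
            ),
        },
    ]
}

print(f"The reply is about {expansion_ratio(example):.1f}x longer than the request.")</pre>



<p>A ratio well above 1 across your whole dataset suggests you are teaching the model to add information that was not in the request. That is fine for an idea generator like ChrisGPT, but it is probably not what you want for a task that has to stick to the facts it was given.</p>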



<h3 class="wp-block-heading">When not to use fine-tuning</h3>



<p>Before we end this tutorial series, let&#8217;s discuss when to use fine-tuning and when not to. You should not use fine-tuning for any task where you can get good results using prompt engineering and giving examples in the prompt setup. Most problems can be solved using simple prompt engineering and do not require the extra complexity and time investment of fine-tuning. Do not assume you need fine-tuning just because it sounds fancy.</p>



<p>You will see loads of video tutorials on YouTube that &#8216;fine-tune&#8217; ChatGPT to be a sarcastic Reddit commenter and the like. This is completely pointless! You can do this with simple prompt engineering and perhaps a couple of examples in the prompt setup. It misses the point of fine-tuning altogether, because there isn&#8217;t that much that ChatGPT cannot already do. Training it to do something it can already do is a waste of time, energy, and money.</p>
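


<p>To make this concrete, here is a minimal sketch of the prompt-engineering alternative: a system message plus a couple of example exchanges placed directly in the message list. The function name <code>build_few_shot_messages</code> and the example texts are made up for illustration; the resulting list has the same role/content shape that chat completion requests expect.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def build_few_shot_messages(user_input):
    """Build a chat message list with the style examples placed in the prompt."""
    system = "You are a sarcastic Reddit commenter. Keep replies to one or two sentences."
    # Made-up example exchanges that demonstrate the desired tone.
    examples = [
        (
            "I just learned Python has list comprehensions.",
            "Wow, next you will discover that water is wet.",
        ),
        (
            "Is it worth learning Docker?",
            "No, please keep emailing zip files of your laptop to production.",
        ),
    ]
    messages = [{"role": "system", "content": system}]
    for question, answer in examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    # The real user input always goes last.
    messages.append({"role": "user", "content": user_input})
    return messages


messages = build_few_shot_messages("Should I fine-tune a model for this?")
print(len(messages))</pre>



<p>No training run, no dataset, and no waiting: if the style changes tomorrow, you only edit a few strings.</p>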



<h3 class="wp-block-heading">Single-correct-answer type tasks</h3>



<p>Fine-tuning can be used for very specific and focused tasks. Say you have a very narrow task that you want ChatGPT to do over and over, like extracting very complex CSV or JSON objects from unstructured data, with a lot of edge cases and exceptions, for all of which you&#8217;ll need to provide many examples.</p>



<p>You may find that GPT-4 is powerful enough to just do this, but GPT-3.5-turbo is not quite able to do the same task reliably. Yet you may still want to use GPT-3.5-turbo for that specific task. Why? Because GPT-3.5-turbo is smaller and therefore much faster and cheaper. If you fine-tune GPT-3.5-turbo on that specific task, it can often get close to GPT-4 level quality on that one task. This is a good use of fine-tuning.</p>



<p>This seems to actually reflect a general trend in the LLM space as well, where first the models just got bigger and bigger, but now more and more models are coming out that are specialized for certain specific uses, like writing code or handling math problems, etc, as opposed to having the one model to rule them all. This is a good thing, as it allows for more efficient use of resources and more specialized and accurate results for specific tasks.</p>



<p>Another benefit you get here is that OpenAI lets you submit a <code>testing</code> portion of the data (called a validation file in the API) alongside the <code>training</code> portion. For example, after training on 70% of the data, OpenAI will report how well the model does on the remaining 30%, which was not shown to the model during training, so it makes for a fair test.</p>
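


<p>A split like this can be sketched in a few lines. This is a simplified illustration rather than part of the tutorial code: <code>split_examples</code> is a hypothetical helper, the placeholder strings stand in for your real training examples, and the 70/30 ratio is just the example figure used above.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import random


def split_examples(examples, train_fraction=0.7, seed=42):
    """Shuffle a copy of the examples and split it into two lists."""
    shuffled = list(examples)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cutoff = round(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


# Placeholder data; in practice these would be your JSONL training examples.
examples = [f"example {i}" for i in range(10)]
train, held_out = split_examples(examples)
print(len(train), len(held_out))</pre>



<p>Shuffling before splitting matters: if your examples are sorted by topic or date, an unshuffled split would test the model on a kind of example it never saw during training.</p>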



<p>This is also useful for increasing accuracy on complex function calls and such, as these are also a JSON object output type task, though the models have improved so much with each iteration lately that you probably will not need to fine-tune for this yourself anymore as OpenAI has sort of done this for us already.</p>



<h3 class="wp-block-heading">Customer service chatbots</h3>



<p>Fine-tuning is not a data retrieval system. You might think that you can simply train ChatGPT on your data and it will answer all questions about this data correctly from then on. You feed it all your business data plus the questions and answers from your customer service history, and now your fine-tuned model will know everything about your business, right? Well, the answer is yes and no.</p>



<p>To make sure my statement here is correct, I tested this out and fine-tuned a model on training data containing over 80 frequently asked questions and their answers from the customer service of a fictional product. Though the fine-tuned model was able to answer most questions correctly, it did give some wrong answers that conflicted with the data it was trained on. You cannot use this as a foolproof data retrieval system for your customer service, as being correct roughly 90% of the time is not good enough there.</p>



<p>That being said, fine-tuning is being used for customer-service bots, but it should be used in combination with a data retrieval strategy that provides the correct answer to ChatGPT before it answers the user. In that case the two reinforce each other. The fine-tuning takes care of the company-specific style of communication and trains the actual data into the model to some degree, while the retrieval system takes care of the actual data retrieval, feeding the model the exact and correct information to generate the response for the end user.</p>



<p>The retrieval part is usually achieved by cutting the entire dataset into pieces and storing them as <code>embeddings</code> in an embedding database. You then retrieve ONLY the pieces of text from your dataset that are similar to the user query and hand them to ChatGPT, giving it the source material it needs to give a factually accurate answer. ChatGPT then generates a response using the retrieved pieces of text as its source material. If you want to know more about <code>embeddings</code>, check out chapters 6 and 7 of my &#8220;function calls and embeddings&#8221; course <a href="https://academy.finxter.com/university/openai-api-function-calls-and-embeddings/">here on the Finxter Academy</a>.</p>
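


<p>The retrieval idea can be illustrated with a toy example. This is a deliberately simplified sketch: the vectors below are hand-written stand-ins rather than real embeddings, the &#8220;database&#8221; is a plain Python list, and the names <code>cosine_similarity</code> and <code>retrieve</code> are made up for illustration. A real setup would get its vectors from an embedding model and store them in a proper embedding database.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, store, top_k=2):
    """Return the top_k text chunks most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:top_k]]


# Hand-written toy vectors; a real system would get these from an embedding model.
store = [
    {"text": "Refunds are possible within 30 days.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Shipping takes 3 to 5 business days.", "vector": [0.1, 0.9, 0.1]},
    {"text": "Support is available on weekdays.", "vector": [0.0, 0.2, 0.9]},
]
query_vector = [0.8, 0.2, 0.1]  # pretend embedding of "Can I get my money back?"

context = retrieve(query_vector, store, top_k=1)
prompt = "Answer using only this source material:\n" + "\n".join(context)
print(prompt)</pre>



<p>Only the most similar chunk ends up in the prompt, so the model answers from the exact source text instead of from whatever it happened to memorize during fine-tuning.</p>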



<h2 class="wp-block-heading">Thank you for participating!</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="952" height="632" src="https://blog.finxter.com/wp-content/uploads/2024/03/image-92.png" alt="" class="wp-image-1669878" srcset="https://blog.finxter.com/wp-content/uploads/2024/03/image-92.png 952w, https://blog.finxter.com/wp-content/uploads/2024/03/image-92-300x199.png 300w, https://blog.finxter.com/wp-content/uploads/2024/03/image-92-768x510.png 768w" sizes="auto, (max-width: 952px) 100vw, 952px" /></figure>
</div>


<p>That&#8217;s it for the fine-tuning course. I hope you enjoyed it and that it has given you a good idea of when you can use fine-tuning in your own projects. As always, it&#8217;s been a pleasure and an honor to take this journey together with you!</p>



<p>Dirk.</p>



<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Tip</strong>: This is a full-text tutorial on how to fine-tune ChatGPT using the OpenAI API with code! For a video walkthrough, check out our premium course &#8220;<a href="https://academy.finxter.com/university/fine-tuning-openai-course/">Fine-Tuning OpenAI: How to Create and Scale Your Digital Self (Example: ChrisGPT)</a>&#8221;.</p>
<p>The post <a href="https://blog.finxter.com/openai-fine-tuning/">[Full Tutorial] OpenAI Fine-Tuning: Creating a Chatbot of Yourself (Example: ChrisGPT)</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to Set Up AutoGen Studio with Docker</title>
		<link>https://blog.finxter.com/how-to-set-up-autogen-studio-with-docker/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Sun, 11 Feb 2024 21:00:30 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[AutoGPT]]></category>
		<category><![CDATA[Automation]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1654847</guid>

					<description><![CDATA[<p>Hi and welcome to this tutorial series on AutoGen Studio. My name is Dirk van Meerveld, and I&#8217;ll be your host and guide for this three part tutorial series where we&#8217;ll take a look at AutoGen Studio. 👉 Go Back to the Full Course: Next Level Prompt Engineering with AutoGen Studio AutoGen Studio is an ... <a title="How to Set Up AutoGen Studio with Docker" class="read-more" href="https://blog.finxter.com/how-to-set-up-autogen-studio-with-docker/" aria-label="Read more about How to Set Up AutoGen Studio with Docker">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/how-to-set-up-autogen-studio-with-docker/">How to Set Up AutoGen Studio with Docker</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Hi and welcome to this three-part tutorial series on AutoGen Studio. My name is Dirk van Meerveld, and I&#8217;ll be your host and guide.</p>



<p class="has-ast-global-color-4-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <a href="https://academy.finxter.com/university/next-level-prompt-engineering-with-autogen-studio/">Go Back to the Full Course: Next Level Prompt Engineering with AutoGen Studio</a></p>



<p>AutoGen Studio is an open-source interface that runs on top of AutoGen. This combination allows us to work with LLMs and give them skills (like tools/functions) that they can use, and it also lets us use agents and even define multi-agent workflows. Think of it as a multi-agent framework. One of the most amazing features is that it can write and execute code on the fly to solve problems!</p>



<p>On top of that AutoGen Studio provides us with a sleek and easy-to-use interface to define all the above and chat with our agents, much of it without even having to write code! This truly is next-level prompt engineering and over the coming lessons, we will harness the power of digital agents and see them collaborate with each other!</p>



<p>In this first part, we&#8217;ll focus on the setup and take some time to properly get started. If you&#8217;re a more seasoned developer and intimately familiar with a particular topic like Docker, feel free to skip ahead a bit here and there, but we&#8217;ll cover all the bases so that everybody can follow along nicely. This is especially so as much of AutoGen Studio can be used via the interface which makes it more accessible to non-coders. We will leave no man or woman behind!</p>



<h2 class="wp-block-heading">AutoGen safety measures</h2>



<p>In order to answer more complex requests AutoGen will actually write Python code for you and execute this Python code on your local computer.</p>



<p>For example, there is a demo question that reads: <code>"Plot a chart of NVDA and TESLA stock price for 2023. Save the result to a file named nvda_tesla.png"</code>. AutoGen will solve this by writing Python code to get the stock prices, installing the libraries needed to execute the code, writing code to create the graph in Matplotlib, and so on. It will then execute this code on your local computer and return the result to you.</p>



<p>The problem is that operating systems of course have some kind of built-in security. Generally, the <code>Execution Policy</code> on your OS will prevent AutoGen (and others) from just running random code and scripts on your computer. And this is a good thing! One solution would be to loosen up the <code>Execution Policy</code> on your computer, but this is not really the best idea. Even if you don&#8217;t have an Execution Policy problem on your system, do you really want to have any and all AI-generated code running on your machine without any checks or balances? It&#8217;s a bit dangerous, to say the least.</p>



<h2 class="wp-block-heading">Why we won&#8217;t use Anaconda</h2>



<p>Many tutorials try to take a 100% no-code approach, using Anaconda for virtual environments and avoiding an IDE like VS Code altogether, in an attempt to claim that &#8216;You don&#8217;t need to know any code at all&#8217;. If you go that route, you will:</p>



<ul class="wp-block-list">
<li>Still have to install Docker as well as Anaconda, in order to allow safe code execution inside of a Docker container.</li>



<li>Set your API key over and over every time you reload.</li>



<li>Type the same Anaconda commands over and over.</li>



<li>Possibly be stuck with a buggy experience trying to get it to use Docker for safe code execution, causing you a lot of frustration.</li>
</ul>



<p>While there is absolutely nothing wrong with Anaconda, it doesn&#8217;t remove the need for Docker for safe code execution, as it doesn&#8217;t provide any isolation from your local system. So why don&#8217;t we just use Docker alone, instead of Anaconda AND Docker? Biting the bullet like this will make part 1 of the tutorial a bit harder, but after that, it will be smooth sailing all the way! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/26f5.png" alt="⛵" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<h2 class="wp-block-heading">Docker to the rescue</h2>



<p>Using this approach, we can skip Anaconda altogether, as Docker will also be separate from our system-wide Python environment, and we have only one piece of software to worry about for both the separate environment and the safe code execution. We will use some very basic code in the form of a &#8216;Dockerfile&#8217; and a few basic commands, but you can just copy mine, so no worries! Again, if you&#8217;re already a Docker expert, feel free to just skip ahead to the Dockerfile and get started.</p>



<p>Docker uses operating system-level virtualization to deliver software in packages called containers. Containers are isolated from one another and bundle their own software, libraries, and configuration files. All containers are run by a single operating system kernel and are thus more lightweight than virtual machines. Crudely stated, this means that Docker will let us have a separate container with Linux, Python, and all the libraries we need to run AutoGen Studio in it, kind of like running a different computer system within a virtual machine.</p>



<p>We will use this &#8216;virtual machine&#8217;, or Docker container, to run AutoGen Studio inside of it, which automatically also means that any code it generates is executed within the container, solving our code execution policy problems and safety concerns in one go. Caveat: Theoretically even Docker is not 100% secure and isolated from the rest of your system, but it&#8217;s a lot better than just running code directly on your local machine and good enough for non-enterprise use.</p>



<h2 class="wp-block-heading">Getting Docker installed and running &#8211; WSL2</h2>



<p>I&#8217;m going to assume you&#8217;re on a Windows machine. If you&#8217;re on a Mac or Linux machine, you&#8217;re in luck and can probably just run the installer for &#8216;Docker Desktop&#8217; and be done with it. Just head over to https://docs.docker.com/desktop/ (or Google for the updated link) and scroll down to select the Docker download for your platform.</p>



<p>Let&#8217;s continue with the Windows users now, as they have a couple of extra steps to take. Most tutorials will just tell you to install Docker and give no details, which might leave you with a long and hard process of figuring out why it won&#8217;t work on your system. So let&#8217;s cover the pitfalls and make sure we get it right the first time!</p>



<p>First, we need to install WSL2 (Windows Subsystem for Linux 2) on our system. This is a Windows feature that allows us to run Linux on our Windows machine. This is needed because Docker Desktop for Windows requires WSL2 to run. (You can also use <a href="https://docs.docker.com/desktop/install/windows-install/">Hyper-V</a> instead, but WSL2 is the recommended way).</p>



<p>(If you already use WSL but need to check if you have version 2, you can check your version by entering the command: <code>wsl -l -v</code> in PowerShell or Windows Command Prompt.)</p>



<p>You can find the details for installing WSL2 <a href="https://learn.microsoft.com/en-us/windows/wsl/install">here</a>, but basically you do this:</p>



<ul class="wp-block-list">
<li>Open a PowerShell or Windows Command Prompt in administrator mode by right-clicking and selecting &#8220;Run as administrator&#8221;</li>



<li>Run the following command in the terminal: <code>wsl --install</code></li>



<li>Restart your computer when prompted</li>
</ul>



<p>This will enable WSL on your Windows machine and install a Linux distribution (Ubuntu) on your system. You may need to create a Username and Password for your Linux distribution and save them somewhere. More details can be found <a href="https://learn.microsoft.com/en-us/windows/wsl/setup/environment#set-up-your-linux-username-and-password">here</a> or google <code>"Setting up your Linux username and password."</code></p>



<h2 class="wp-block-heading">Some things to check before we install Docker Desktop</h2>



<p>A quick caveat: If anywhere along the way you run into problems I haven&#8217;t covered here, you&#8217;ll need to do some googling to find out what it means and what needs to be done, as I cannot prepare you for every possible problem you might run into.</p>



<p>The software development world sometimes requires messing around and googling for an hour or even two before you get something to work; it&#8217;s part of the game. That being said, I will try my best to cover all bases so you&#8217;ll hopefully have a very smooth experience <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" />.</p>



<p>We have a couple more things to check before we can install Docker Desktop. First, we need to make sure the Virtual Machine Platform option is enabled. This is fairly easy! Just press the Windows key or click in the Windows search bar and search for &#8220;Turn Windows features on or off&#8221;.</p>



<p>Open this and make sure the &#8220;Virtual Machine Platform&#8221; option is checked. If it&#8217;s not, check it and click OK. You may need to restart your computer after this.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-26-195333.png" alt="" class="wp-image-4107"/></figure>
</div>


<p>(You don&#8217;t have to match the other checkboxes with the settings in the image!)</p>



<p>Next, we need to make sure that Virtualization is enabled in our BIOS. The easiest way to check if this is enabled is to open the Windows Task Manager by pressing <code>Ctrl</code>+<code>Alt</code>+<code>Delete</code> and selecting <code>Task Manager</code>. Then click on the <code>Performance</code> tab and click on <code>CPU</code> on the left-hand side. If Virtualization is enabled, you will see &#8220;<code>Virtualization: Enabled</code>&#8221; in the bottom right information block like this:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-26-195456.png" alt="" class="wp-image-4108"/></figure>
</div>


<p>If Virtualization is not enabled, you will have to go into your BIOS settings and enable Virtualization. This is where you will have to do some googling and research on your own, as every system has slightly different keys to get into the BIOS setup menu, and the settings may be located in different parts of the BIOS menu for different manufacturers. I&#8217;ll leave you with two links to get you started, the first one describes the general process of getting this setting enabled in your BIOS:</p>



<ul class="wp-block-list">
<li><a href="https://www.virtualmetric.com/blog/how-to-enable-hardware-virtualization">Virtual Metric &#8211; How to enable hardware virtualization</a></li>



<li><a href="https://docs.docker.com/desktop/troubleshoot/topics/#virtualization">Docker Docs &#8211; Troubleshoot topics &#8211; virtualization</a></li>
</ul>



<p>When you&#8217;ve got that ready and set up to go, let&#8217;s continue on.</p>



<h2 class="wp-block-heading">Installing Docker Desktop <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></h2>



<p>Finally, it&#8217;s time! Head over to the Docker Desktop download page <a href="https://www.docker.com/">here</a> and download the appropriate Docker version for your OS. I&#8217;m running Docker <code>4.26.1</code>, but just download the latest version and you should be fine. When the download finishes start the installer. The installer will give you the following options:</p>



<figure class="wp-block-image size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-27-094630.png" alt="" class="wp-image-4109"/></figure>



<p>Just accept both of these options and click OK, unless you chose not to install WSL 2 and use Hyper-V instead. Whether or not you want a shortcut on your desktop is entirely up to you of course <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f609.png" alt="😉" class="wp-smiley" style="height: 1em; max-height: 1em;" />.</p>



<p>Now just let the installer do its magic:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-27-095115.png" alt="" class="wp-image-4110"/></figure>
</div>


<p>Done!</p>



<p>Then go ahead and run the Docker Desktop application, where you&#8217;ll have to accept the service agreement:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-27-095315.png" alt="" class="wp-image-4111"/></figure>
</div>


<p>And then just choose &#8220;Use recommended settings&#8221; and click Finish:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-27-095325-1024x933.png" alt="" class="wp-image-4112"/></figure>
</div>


<p>Now Docker Desktop will start and you will be prompted to either sign up or sign in. Docker is free for personal and even small-business use, so press the button to sign up and create an account. You can even use your Google or GitHub account to create one really fast. (You can also continue without signing in if you want to, and it should still work fine). I&#8217;m just going to go ahead and sign in with my Google account.</p>



<p>If everything was successful, you should be greeted by the following screen:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-27-100814-1024x702.png" alt="" class="wp-image-4113"/></figure>
</div>


<p>Congratulations! You&#8217;ve now installed Docker Desktop and are ready to go! If you still have problems, first try the below, and if that doesn&#8217;t work, google will have a solution. Never despair!</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">(Only for those who still have problems <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f648.png" alt="🙈" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" />)

- Hypervisor enabled at Windows startup -
If you have completed the steps described above and are still experiencing Docker Desktop startup issues, this could be because the Hypervisor is installed, but not launched during Windows startup. Some tools (such as older versions of Virtual Box) and video game installers turn off hypervisor on boot. To turn it back on:

    - Open an administrative console prompt.
    - Run bcdedit /set hypervisorlaunchtype auto.
    - Restart Windows.</pre>



<h2 class="wp-block-heading">Creating a Dockerfile</h2>



<p>Ok, now that we all have Docker Desktop installed and running, let&#8217;s move on to the next step! I&#8217;ll be using VS Code for this, just because it feels convenient to me. You can also use any other code editor or literally just copy the text into Notepad and use a separate terminal window, it makes no difference.</p>



<p>Create a base project directory and open it in VS Code. I&#8217;ll simply call my directory <code>AUTOGEN</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />AUTOGEN (root project folder)</pre>



<p>Now inside the <code>AUTOGEN</code> folder create a new file called <code>Dockerfile.base</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />AUTOGEN (root project folder)
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />Dockerfile.base</pre>



<p>What is a Dockerfile? A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image and ultimately run a Docker container. Think of this Dockerfile as a recipe that Docker will follow to build a custom image that can be used to create new containers.</p>



<p>In a Dockerfile, you can specify the base image to use, define the working directory, copy files from your local system to the container, run commands to install packages, expose ports for the application, and specify the command that should be run when a container is launched from the image.</p>



<p>So open up your Dockerfile.base, and let&#8217;s type out our Docker recipe! This will be mostly based on the example recommendations from AutoGen itself, with some minor tweaks. First, let&#8217;s specify the base image:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">FROM python:3.11-slim-bookworm</pre>



<p>This will use the official Python image from Docker Hub, which is based on Debian Linux. We&#8217;ll use the <code>slim-bookworm</code> version, which is a lightweight version of Debian Linux. We&#8217;ll also use Python 3.11, which is in the range of current recommended versions for AutoGen at the time of writing.</p>



<p>Then continue in your Dockerfile.base with the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">RUN apt-get update \
    &amp;&amp; DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        software-properties-common sudo\
    &amp;&amp; apt-get clean \
    &amp;&amp; rm -rf /var/lib/apt/lists/*</pre>



<p>I&#8217;m going to explain what all the Docker commands do as these topics are interesting and very helpful as a developer, but if you&#8217;re not interested in Dockerfiles and Linux commands and don&#8217;t want to do any coding stuff, you have my blessings to skip the explanations and just scroll down to the finished Dockerfile at the end of this section and copy it into your own Dockerfile.base <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f609.png" alt="😉" class="wp-smiley" style="height: 1em; max-height: 1em;" />.</p>



<ul class="wp-block-list">
<li><code>RUN</code> is a Dockerfile command that executes a command in the container. In this case, we&#8217;re running the <code>apt-get update</code> command, which refreshes the local package index with the latest versions. <code>&amp;&amp;</code> is a command separator, which lets us chain commands.</li>



<li><code>DEBIAN_FRONTEND=noninteractive</code> is an environment variable that is set to <code>noninteractive</code> to prevent the <code>apt-get install</code> command from asking any questions during the installation and just apply default settings.</li>



<li><code>apt-get install</code> is the command to install packages. The <code>-y</code> flag is used to automatically answer yes to any questions that may come up during the installation process. The <code>--no-install-recommends</code> flag is used to prevent the installation of any recommended packages that are not strictly required for the package to function, keeping it light.</li>



<li><code>software-properties-common</code> is a package that provides utilities for managing software repositories, and <code>sudo</code> is a utility that allows us to run commands as a superuser.</li>



<li><code>apt-get clean</code> is a command that cleans up the local repository of retrieved package files, and <code>rm -rf /var/lib/apt/lists/*</code> is a command that removes the package lists that were downloaded during the <code>apt-get update</code> command. This is done to save disk space.</li>
</ul>



<p>Next in our Dockerfile, we&#8217;ll set up a non-root user with sudo access:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">RUN adduser --disabled-password --gecos '' autogen
RUN adduser autogen sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
USER autogen
WORKDIR /home/autogen</pre>



<p>The first line creates a new user named <code>autogen</code> without a password and without prompting for additional information; <code>--gecos ''</code> simply sets the user info to an empty string. The next line adds the user <code>autogen</code> to the <code>sudo</code> group, granting it administrative privileges.</p>



<p>The third line configures the <code>sudo</code> group to allow members to execute commands as root without requiring a password. After that, we switch to the <code>autogen</code> user and set the working directory to <code>/home/autogen</code>.</p>



<p>Next up:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ENV PATH="/home/autogen/.local/bin:$PATH"
ENV OPENAI_API_KEY=paste_your_api_key_here</pre>



<p>First, we set the <code>PATH</code> environment variable (using Docker&#8217;s <code>ENV</code> command) to include the <code>.local/bin</code> directory in the <code>autogen</code> user&#8217;s home directory.</p>



<p>Then we set the <code>OPENAI_API_KEY</code> environment variable to the API key that we got from OpenAI. This is needed so that AutoGen Studio can access the OpenAI API.</p>



<p>Make sure you paste your own OpenAI API key in there instead of <code>paste_your_api_key_here</code>, without adding <code>""</code> double quotes or anything else. You can create new or extra keys on the <a href="https://platform.openai.com/api-keys">OpenAI API keys page</a>:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-27-114746-1024x717.png" alt="" class="wp-image-4114"/></figure>
</div>


<p>Ok now to continue in our Dockerfile:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">RUN pip install pyautogen==0.2.8 autogenstudio==0.0.34a0 numpy pandas matplotlib seaborn scikit-learn requests urllib3 nltk pillow pytest beautifulsoup4</pre>



<p>We just pre-install some of the popular packages and of course pyautogen and autogenstudio themselves. I have specified two specific versions for pyautogen and autogenstudio, as these are the versions I&#8217;m using when writing this tutorial. I advise you to use the same ones to make sure you have exactly the same experience as me. You can always upgrade the packages after you finish the tutorial series. (Don&#8217;t worry, at the end of the last part I&#8217;ll show you how and you won&#8217;t lose any of your work!)</p>
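
<p>If you want to confirm that the pinned versions really ended up in the image, a small check like the one below could be run with the Python interpreter inside the container. This is only a sketch: the helper name <code>check_pins</code> and the <code>PINNED</code> dictionary are made up for illustration, and it relies on nothing beyond the standard library <code>importlib.metadata</code> module.</p>

```python
from importlib import metadata

# The two versions pinned in the Dockerfile of this tutorial.
PINNED = {"pyautogen": "0.2.8", "autogenstudio": "0.0.34a0"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, installed)} for every pin that does not match."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    print(check_pins(PINNED) or "all pinned versions match")
```

<p>An empty result means both packages are installed at exactly the pinned versions. The <code>get_version</code> parameter only exists so the lookup can be swapped out when trying the helper on a machine where the packages are not installed.</p>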



<p>Now next up in our Dockerfile:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Expose port
EXPOSE 8081

# Start Command for AutoGen Studio
CMD ["autogenstudio", "ui", "--host", "0.0.0.0", "--port", "8081"]</pre>



<p>The <code>EXPOSE</code> command exposes port 8081, which is the port that AutoGen Studio will run on. The <code>CMD</code> command specifies the command that will be run when the container is launched from the image. In this case, we&#8217;re running the <code>autogenstudio</code> command with the <code>ui</code> option, which will start the AutoGen Studio interface. The <code>--host</code> and <code>--port</code> options specify the host and port that the interface will be available on. This means we&#8217;ll only have to start the container and AutoGen Studio will be available on port 8081 automatically!</p>



<p>Note that we used the address <code>0.0.0.0</code> instead of the usual localhost <code>127.0.0.1</code>. If we bind to <code>127.0.0.1</code>, the application is only accessible from the same machine &#8211; in this case, the same Docker container. To make our application accessible from outside the Docker container, we need to bind it to <code>0.0.0.0</code> instead. This makes it listen on all of the container&#8217;s network interfaces, so it can be reached from outside the container, including from your host machine.</p>
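
<p>To get a feel for what the bind address does, here is a minimal standard-library sketch (not part of AutoGen Studio; the function name <code>can_connect</code> is made up for this example). It starts a throwaway TCP server bound to one address and tries to reach it through another. On a single machine it can only show that a server bound to <code>0.0.0.0</code> also answers on the loopback address; the container case adds a second network interface that this sketch does not simulate.</p>

```python
import socket


def can_connect(bind_host: str, target_host: str) -> bool:
    """Bind a throwaway TCP server to bind_host and try to reach it via target_host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((bind_host, 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        try:
            with socket.create_connection((target_host, port), timeout=2):
                return True
        except OSError:
            return False


# A server listening on all interfaces, reached through loopback.
print(can_connect("0.0.0.0", "127.0.0.1"))
# A server listening on loopback only, reached through loopback.
print(can_connect("127.0.0.1", "127.0.0.1"))
```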



<p>To finish off our Dockerfile, we&#8217;ll add a comment at the bottom for our own future reference:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># command to build the image:
    # docker build -t autogenstudio -f Dockerfile.base .
# command to run the container:
    # docker run -it --rm -p 8081:8081 --name autogenstudio autogenstudio
# Access AutoGen Studio at http://localhost:8081 make sure you don't click the 0.0.0.0:8081 link in the terminal, it won't work!</pre>



<p>This is just a comment for future reference, so you don&#8217;t have to search the tutorial for the commands to build the image and run the container. We&#8217;ll use and explain these commands in a second; for now, just copy them.</p>



<h2 class="wp-block-heading">Building the Docker image and running the container</h2>



<p>Your whole <code>Dockerfile.base</code> should now look like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">FROM python:3.11-slim-bookworm

RUN apt-get update \
    &amp;&amp; DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        software-properties-common sudo\
    &amp;&amp; apt-get clean \
    &amp;&amp; rm -rf /var/lib/apt/lists/*

# Setup a non-root user 'autogen' with sudo access
RUN adduser --disabled-password --gecos '' autogen
RUN adduser autogen sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
USER autogen
WORKDIR /home/autogen

# Set environment variable
ENV PATH="/home/autogen/.local/bin:$PATH"
# Follow the = with your OpenAI API key (no quotes or anything, just OPENAI_API_KEY=sk-lotsOfLettersAndNumbers)
ENV OPENAI_API_KEY=

# Pre-load popular packages
RUN pip install pyautogen==0.2.8 autogenstudio==0.0.34a0 numpy pandas matplotlib seaborn scikit-learn requests urllib3 nltk pillow pytest beautifulsoup4

# Expose port
EXPOSE 8081

# Start Command for AutoGen Studio
CMD ["autogenstudio", "ui", "--host", "0.0.0.0", "--port", "8081"]

# command to build the image:
    # docker build -t autogenstudio -f Dockerfile.base .
# command to run the container:
    # docker run -it --rm -p 8081:8081 --name autogenstudio autogenstudio
# Access AutoGen Studio at http://localhost:8081 make sure you don't click the 0.0.0.0:8081 link in the terminal, it won't work!</pre>



<p>Ok, so with that out of the way, first make sure that the <code>Docker Desktop</code> application is running, as we&#8217;ll need the Docker Engine process to be up and running. Then open up a terminal window and make sure you&#8217;re in the root project folder, in my case <code>AUTOGEN</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">admin@DirkMasterPC /c/Coding_Vault/AUTOGEN</pre>



<p>Then run the following command to build the image:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">docker build -t autogenstudio -f Dockerfile.base .</pre>



<p>This will build the image and tag (<code>-t</code>) it with the name <code>autogenstudio</code>. The <code>-f</code> flag specifies the name of the Dockerfile to use, in this case <code>Dockerfile.base</code>. The <code>.</code> at the end specifies the build context, which is the current directory, as this is where our <code>Dockerfile.base</code> is located.</p>



<p>The Docker Image will be built and you will see the progress running through each of the steps in the Dockerfile. When it is done we can run the following command to start the container which will automatically run AutoGen Studio:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">docker run -it --rm -p 8081:8081 --name autogenstudio autogenstudio</pre>



<p>This works by running the <code>docker run</code> command, which creates a new container from the <code>autogenstudio</code> image that we just built. The <code>-it</code> flag specifies that we want to run the container in interactive mode, which means we can interact with the container via the terminal.</p>



<p>The <code>--rm</code> flag specifies that we want the container to be removed when it is stopped. The <code>-p</code> flag specifies that we want to map port 8081 from the container to port 8081 on our local machine. The <code>--name</code> flag specifies the name of the container, in this case, <code>autogenstudio</code>.</p>



<p>Now bring up your <code>Docker Desktop</code> application and open up the <code>Containers</code> tab (topmost icon in the left menu bar):</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-27-130521-1024x638.png" alt="" class="wp-image-4115"/></figure>
</div>


<p>We can see that our container by the name <code>autogenstudio</code>, based on the image <code>autogenstudio</code> is running on port <code>8081</code>! Hooray! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f389.png" alt="🎉" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<h2 class="wp-block-heading">Victory!</h2>



<p>You can either click on the blue <code>8081:8081</code> link to open up AutoGen Studio in your browser, or just open up your browser and go to http://localhost:8081. There is also a link in your terminal window that says that Uvicorn is running on http://0.0.0.0:8081. Do not click this link: <code>0.0.0.0</code> is the address the server listens on inside the Docker container, not an address you can browse to from your host machine, so it will not work.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/02/Screenshot-2024-01-27-130956-1024x671.png" alt="" class="wp-image-4116"/></figure>
</div>
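
<p>If the page does not load, it can help to check whether anything is listening on the published port at all. The following is a small sketch using only the Python standard library; <code>is_port_open</code> is a hypothetical helper name, and port 8081 is simply the port published by the <code>docker run</code> command used in this tutorial.</p>

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8081 is the host port mapped with "-p 8081:8081" in this tutorial.
    print("AutoGen Studio reachable:", is_port_open("localhost", 8081))
```

<p>A result of <code>False</code> usually means the container is not running or the port mapping is missing. A result of <code>True</code> only tells you that something accepts connections on that port, not that it is AutoGen Studio.</p>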


<p>Give yourself a pat on the back for a job well done! You&#8217;ve set up AutoGen Studio properly and inside a Docker container! One quick sidenote: when we wrote the <code>Dockerfile.base</code> we hardcoded the OpenAI API key into the Dockerfile to keep things as simple as possible. We can of course avoid this, but that would complicate the tutorial further and I want to keep this one as low-code as possible.</p>



<p>Just make sure you do not share the <code>Dockerfile.base</code> without removing your API key from it first, as anyone with your API key can use OpenAI&#8217;s API at your expense. Your Docker <code>Image</code> also has the API key hardcoded into it. There is no reason you&#8217;d ever really want to share the image though; just don&#8217;t leave copies behind on a public computer or something.</p>
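
<p>For the curious, one possible way to keep the key out of the image is to pass it in when the container starts. Docker&#8217;s <code>-e NAME</code> option (given without a value) copies that variable from the shell that runs <code>docker run</code>. The sketch below only builds and prints such a command; <code>build_run_command</code> is a made-up helper name, and it assumes you have removed the <code>ENV OPENAI_API_KEY=</code> line from the Dockerfile and set <code>OPENAI_API_KEY</code> in your own shell instead.</p>

```python
import os
import shlex


def build_run_command(image: str = "autogenstudio", port: int = 8081) -> list[str]:
    """Build a docker run command that forwards OPENAI_API_KEY from the host."""
    return [
        "docker", "run", "-it", "--rm",
        # "-e NAME" without a value copies NAME from the host environment.
        "-e", "OPENAI_API_KEY",
        "-p", f"{port}:{port}",
        "--name", image,
        image,
    ]


if __name__ == "__main__":
    if not os.environ.get("OPENAI_API_KEY"):
        print("OPENAI_API_KEY is not set in this shell")
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(build_run_command()))
```

<p>The key then lives only in your shell environment and in the running container, not in the Dockerfile or the built image.</p>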



<p>That&#8217;s it for part 1! I hope it wasn&#8217;t too difficult. From now on the fun part begins. I&#8217;ll see you in part 2 where we&#8217;ll dive straight into AutoGen Studio. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f60e.png" alt="😎" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p class="has-ast-global-color-4-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Go Back to the Full Course: </strong><a href="https://academy.finxter.com/university/next-level-prompt-engineering-with-autogen-studio/">Next Level Prompt Engineering with AutoGen Studio</a></p>
<p>The post <a href="https://blog.finxter.com/how-to-set-up-autogen-studio-with-docker/">How to Set Up AutoGen Studio with Docker</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI Whisper &#8211; Speeding Up or Outsourcing the Processing</title>
		<link>https://blog.finxter.com/openai-whisper-speeding-up-or-outsourcing-the-processing/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Thu, 25 Jan 2024 19:57:21 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Speech Recognition and Generation]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1654500</guid>

					<description><![CDATA[<p>🎙️ Course: This article is based on a lesson from our Finxter Academy Course Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper. Check it out for video lessons, GitHub, and a downloadable PDF course certificate with your name on it! Hi and welcome back! In this part, we&#8217;re going to look at some ... <a title="OpenAI Whisper &#8211; Speeding Up or Outsourcing the Processing" class="read-more" href="https://blog.finxter.com/openai-whisper-speeding-up-or-outsourcing-the-processing/" aria-label="Read more about OpenAI Whisper &#8211; Speeding Up or Outsourcing the Processing">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-speeding-up-or-outsourcing-the-processing/">OpenAI Whisper &#8211; Speeding Up or Outsourcing the Processing</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f399.png" alt="🎙" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Course</strong>: This article is based on a lesson from our <strong>Finxter Academy Course</strong> <a href="https://academy.finxter.com/university/openai-whisper/"><em>Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</em></a>. Check it out for video lessons, GitHub, and a downloadable PDF course certificate with your name on it!</p>



<p>Hi and welcome back! In this part, we&#8217;re going to look at some alternatives to speed stuff up or outsource the processing power to OpenAI&#8217;s servers altogether. First, we&#8217;ll look at <code>faster-whisper</code> at a basic level. If you&#8217;re not sure whether you want to use this you can also just watch this part and decide whether or not you want to install it for further use later as we&#8217;re just going to cover it quickly before moving on to the web API version for the rest of this part.</p>



<p>So what is <code>faster-whisper</code>? Faster-Whisper is a quicker version of OpenAI&#8217;s Whisper speech-to-text model. As OpenAI released the <code>whisper</code> model as open-source this has naturally allowed others to try to build on and optimize it further. It uses CTranslate2, a fast engine for Transformer models, and is up to 4 times faster and uses considerably less memory than the original openai/whisper while claiming to maintain the same accuracy. You can find the GitHub repository <a href="https://github.com/SYSTRAN/faster-whisper">here</a>.</p>



<p>You can use this for the same apps we have built so far, just as a faster version of the Whisper model, so we won&#8217;t be building a new app specifically for this, as it would get repetitive and I don&#8217;t want to waste your time! You just need some syntax changes to make your app work with faster-whisper instead of the original whisper model. So we&#8217;ll take a look at the basics of faster-whisper, let you decide if you want to use/implement it, and then move on to the web-API version.</p>



<h2 class="wp-block-heading">Installing faster-whisper</h2>



<p>Note: If you do not plan on using faster-whisper or are not quite sure, there is no point in going through the install procedures, and you can skip ahead a couple of minutes to the web-API version, or just watch/read along and decide later if you want to use it.</p>



<p>Basically, to install faster-whisper you just have to run the following command in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install faster-whisper</pre>



<p>And to support GPU execution you need to have the appropriate libraries for CUDA installed, which are <a href="https://developer.nvidia.com/cublas">cuBLAS</a> and <a href="https://developer.nvidia.com/cudnn">cuDNN</a>. This can be the slightly trickier part of the install, and again I cannot really give you platform-specific instructions or help you with the specific troubleshooting if you run into challenges. As always in software development, if you&#8217;re lucky you won&#8217;t have any problems, and if you&#8217;re not, you spend some time on Google and Stack Overflow to find the solution. If you just want to run faster-whisper on your CPU, which will of course be slower but may not be a big deal for small-scale development on your own machine, you can skip the <code>cuBLAS</code> and <code>cuDNN</code> installs.</p>
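
<p>A quick way to see whether the install landed in the Python environment you are actually using is to ask the import system. This is a minimal sketch with a made-up helper name, <code>is_installed</code>; it only checks that the <code>faster_whisper</code> module can be found and says nothing about whether the CUDA libraries are set up correctly.</p>

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    # "faster_whisper" is the import name of the faster-whisper package.
    print("faster-whisper installed:", is_installed("faster_whisper"))
```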



<h2 class="wp-block-heading">Using faster-whisper</h2>



<p>So let&#8217;s give it a spin to see how it works! First create a new file in your project root directory called <code>4_faster_whisper.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />2_whisper_pods.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />3_subtitle_master.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />4_faster_whisper.py   (<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>And inside let&#8217;s start with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from faster_whisper import WhisperModel
from settings import TEST_AUDIO_DIR

model_size = "small"</pre>



<p>We import the <code>WhisperModel</code> class from the <code>faster_whisper</code> package, and the <code>TEST_AUDIO_DIR</code> variable from our <code>settings.py</code> file, and then set a string variable to the value <code>small</code>. Like whisper, faster-whisper also comes with different sizes of models. Using the same naming convention we have <code>tiny.en</code>, <code>base.en</code>, <code>small.en</code>, and <code>medium.en</code> as our English-only models. For the multi-language models, we can choose between <code>tiny</code>, <code>base</code>, <code>small</code>, <code>medium</code>, or one of several versions of the full-size model, namely: <code>large-v1</code>, <code>large-v2</code>, <code>large-v3</code>, or <code>large</code>.</p>



<p>Next, we&#8217;ll create a new instance of the <code>WhisperModel</code> class, picking only one of the two options below:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Choose only one of these, depending on whether you're running on CPU or GPU (cuda).
# Option 1 (CPU), left commented out here:
# model = WhisperModel(model_size, device="cpu", compute_type="int8")
# Option 2 (GPU), which is the one I'll be using:
model = WhisperModel(model_size, device="cuda", compute_type="float16")</pre>



<p>More options are available, like running on <code>cuda</code> using <code>int8_float16</code> or even using <code>float32</code>, see <a href="https://opennmt.net/CTranslate2/quantization.html">here</a> for more details.</p>
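<p>If you switch between machines, it can help to keep the device and compute type choice in one place. The snippet below is only a minimal sketch: <code>pick_model_options</code> and its <code>low_memory</code> flag are hypothetical names that are not part of faster-whisper, and the only values it returns are the <code>device</code> and <code>compute_type</code> combinations mentioned above.</p>

```python
def pick_model_options(use_gpu: bool, low_memory: bool = False) -> dict[str, str]:
    """Return keyword arguments for WhisperModel (hypothetical helper)."""
    if not use_gpu:
        # CPU setup from the tutorial: int8 quantization
        return {"device": "cpu", "compute_type": "int8"}
    # On GPU, int8_float16 trades a little precision for lower memory use
    compute_type = "int8_float16" if low_memory else "float16"
    return {"device": "cuda", "compute_type": compute_type}


options = pick_model_options(use_gpu=False)
print(options)

# With faster-whisper installed you could then write (not executed here):
# model = WhisperModel(model_size, **options)
```

<p>Whether <code>int8_float16</code> is actually the better choice for your GPU depends on your hardware, so treat the flag as a starting point and not as a recommendation.</p>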



<p>The <code>.transcribe</code> method for faster-whisper is slightly different:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">segments, info = model.transcribe(
    str(TEST_AUDIO_DIR / "dutch_long_repeat_file.mp3"),
    beam_size=5,
)</pre>



<p>As you can see we get two returns when calling <code>model.transcribe</code> instead of the single dictionary output we had before. The first is a list of <code>segments</code> which contains the transcription. The second is a <code>NamedTuple</code> (a <code>Tuple</code> with named fields) which allows us to access information like the language (<code>info.language</code>), language probability (<code>info.language_probability</code>), etc. So let&#8217;s add some print statements to print the information and then the transcription itself to the console:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(f"Detected language '{info.language}' with probability {info.language_probability}")

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")</pre>



<p>The first print statement just has us access some of the properties of the <code>info</code> object we discussed. The second print statement loops over the list of <code>segments</code>, and for each <code>segment</code> it will print the segment&#8217;s start time, end time, and the text of the segment itself. The <code>:.2f</code> is a formatting string that tells Python to print the number with two decimal places, for example: <code>1.23</code> instead of <code>1.23456789</code>.</p>
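<p>To see the <code>:.2f</code> format specifier in isolation, here is a small stand-alone sketch. The <code>format_segment</code> helper is a hypothetical name used only for illustration, and it takes plain numbers instead of a real segment object.</p>

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Format one transcription line; :.2f rounds to two decimal places."""
    return f"[{start:.2f}s -> {end:.2f}s] {text}"


line = format_segment(1.23456789, 3.04, "Hoi allemaal")
print(line)
```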



<p>One interesting thing to note here though is that <code>segments</code> is not actually a list. It is a generator, which is a different type of iterable. What this means is that the segments are generated when you request them and not beforehand. In other words, unlike with vanilla whisper, calling <code>.transcribe()</code> on our model does not start the transcription by itself; the work only begins once we iterate over <code>segments</code>. You can either loop over the <code>segments</code> as we did above, or you can run the whole transcription in one go by converting the generator to a list with <code>list(segments)</code>.</p>
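<p>The lazy behavior of generators is easy to verify without any model at all. In this minimal sketch, <code>fake_transcribe</code> is a hypothetical stand-in (not a faster-whisper function) that records when its work actually happens:</p>

```python
from typing import Iterator

calls: list[str] = []  # records each chunk at the moment it is processed


def fake_transcribe(chunks: list[str]) -> Iterator[str]:
    """Hypothetical stand-in for a lazy transcription generator."""
    for chunk in chunks:
        calls.append(chunk)  # the "work" happens here, one chunk at a time
        yield chunk.upper()


segments = fake_transcribe(["hoi", "allemaal"])
work_before = len(calls)      # nothing has been processed yet

results = list(segments)      # iterating is what triggers the work
work_after = len(calls)

second_pass = list(segments)  # a generator can only be consumed once

print(work_before, work_after, results, second_pass)
```

<p>The last line is worth remembering: once you have looped over a generator it is exhausted, so store the result of <code>list(segments)</code> if you need the segments more than once.</p>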



<p>One of the nice things about this generator is that we can very easily see the live transcription and print it to the console while it is still generating, which is exactly what this code will do. So let&#8217;s run it and see what happens:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Estimating duration from bitrate, this may be inaccurate
Detected language 'nl' with probability 0.931703
[0.00s -> 3.04s]  Hoi allemaal, dit is weer een testbestandje.
[3.04s -> 6.88s]  Deze keer om te testen of de Nederlandse taal goed herkent gaat worden.
[6.88s -> 12.68s]  Hierna kunnen we ook proberen deze tekst te laten vertalen naar het Engels om te zien hoe goed dat gaat.
[12.68s -> 13.88s]  Ik ben benieuwd.
[13.88s -> 16.84s]  Hoi allemaal, dit is weer een testbestandje.
[16.84s -> 20.72s]  Deze keer om te testen of de Nederlandse taal goed herkent gaat worden.
[20.72s -> 26.48s]  Hierna kunnen we ook proberen deze tekst te laten vertalen naar het Engels om te zien hoe goed dat gaat.
[26.48s -> 27.68s]  Ik ben benieuwd.
[27.68s -> 30.72s]  Hoi allemaal, dit is weer een testbestandje.
[30.72s -> 34.60s]  Deze keer om te testen of de Nederlandse taal goed herkent gaat worden.
[34.60s -> 40.36s]  Hierna kunnen we ook proberen deze tekst te laten vertalen naar het Engels om te zien hoe goed dat gaat.
[40.36s -> 41.52s]  Ik ben benieuwd.</pre>



<p>You can see the output streaming to the console as the model transcribes. Unless you are running on CPU, you will also notice that it is pretty fast. Now as you&#8217;re probably not Dutch I&#8217;ll just tell you the transcription above is perfect except for the one small (<code>herkent/herkend</code>) issue we had before, but as you know this can be fixed by loading a larger model size.</p>



<p>Play around with any audio file you want and see what model size you need. If you use English files, pick a <code>.en</code> model for greater efficiency. Also be aware that you can pass options to the <code>.transcribe</code> method much like with the vanilla whisper model, for instance:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">segments, info = model.transcribe(
    str(TEST_AUDIO_DIR / "dutch_long_repeat_file.mp3"),
    beam_size=5,
    word_timestamps=True,  # adds word-level timestamps (remove this line if you don't need them)
    # without_timestamps=True,  # uncomment this line to get rid of timestamps and just transcribe
)</pre>



<p>In conclusion, faster-whisper is a nice optimization to look into if you&#8217;re considering deploying this model in a production application somewhere. There are also other optimized versions of the whisper model out there that you can check out, like <a href="https://github.com/huggingface/distil-whisper">distil-whisper</a>. Play around and see which gives you the best trade-offs between speed and accuracy. I&#8217;ll leave the rest up to you as we move on from faster-whisper to check out the web-API version.</p>



<h2 class="wp-block-heading">Web-API version</h2>



<p>Another option we have is to simply not deploy the model anywhere but outsource this to OpenAI&#8217;s fast servers. This is kind of like making a ChatGPT call except we request a transcription instead of a chat completion. The OpenAI servers are also very optimized for machine-learning calculations (obviously) and as you&#8217;ll see they are therefore quite fast!</p>



<p>So let&#8217;s take a look at the pricing first. The cost for using the Whisper API is $0.006 per minute transcribed, rounded to the nearest second. This means a 20-minute video would cost you $0.12. This is a good solution if you don&#8217;t want to deploy the model yourself, for example when your application will only be used occasionally and it&#8217;s simply not worth investing in keeping a model running somewhere. For a high-use application dealing with longer files and many users, this is not the way to go though.</p>
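<p>The pricing arithmetic can be written as a tiny estimator. This is only a sketch: <code>estimate_cost_usd</code> is a hypothetical helper, the price constant is the figure quoted above (check the current pricing page before relying on it), and the rounding to whole seconds follows the description in the text.</p>

```python
PRICE_PER_MINUTE_USD = 0.006  # figure quoted in the text; may change over time


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate the transcription cost for audio of the given length."""
    billed_seconds = round(duration_seconds)  # billed to the nearest second
    return round(billed_seconds * PRICE_PER_MINUTE_USD / 60, 4)


print(estimate_cost_usd(20 * 60))  # the 20-minute video from the text
```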



<p>So let&#8217;s take a quick look at how this would work practically, by building one last quick application, but this time using the web API. Our application will take any video in any language as input and will return a short quiz with questions about the video. First, create a new file in your <code>utils</code> folder named <code>openai_api.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />command.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />openai_api.py   (<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />new file)
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />podcast.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />subtitles.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />video.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />2_whisper_pods.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />3_subtitle_master.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />4_faster_whisper.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>Inside <code>openai_api.py</code>, let&#8217;s start with our imports and some basic setup:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import typing
from pathlib import Path

from decouple import config
from openai import OpenAI


CLIENT = OpenAI(api_key=str(config("OPENAI_API_KEY")))
MODEL = "whisper-1"

ResponseFormat = typing.Literal["text", "srt", "vtt"]</pre>



<p>We&#8217;ll use <code>typing</code> to define our allowed response formats. The rest are all imports we have used before: <code>config</code>, as we&#8217;ll need to load our API key, and <code>OpenAI</code> to call the APIs for Whisper and ChatGPT. We create our <code>CLIENT</code> just like last time and we save the <code>MODEL</code> in a string variable. At the time of writing, <code>whisper-1</code> is the only option for the Whisper API.</p>



<p>Finally, we define a type alias named <code>ResponseFormat</code> which is a <code>Literal</code> type, which means it can only be one of the three strings we have defined, <code>text</code>, <code>srt</code>, or <code>vtt</code>. We can use this as a type hint later to indicate that if a particular variable is of type <code>ResponseFormat</code> then it should have one of these three values and nothing else. (<code>json</code> and <code>verbose_json</code> are also possible if you prefer JSON object output, but we will be skipping them as they are useless for our purposes.)</p>
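<p>Keep in mind that a <code>Literal</code> type hint is only checked by static type checkers and your editor, not by Python at runtime. If you also want a runtime check, one possible sketch uses <code>typing.get_args</code> to read the allowed values back out of the alias. The <code>validate_response_format</code> helper is a hypothetical addition and is not used in the rest of this tutorial.</p>

```python
import typing

ResponseFormat = typing.Literal["text", "srt", "vtt"]


def validate_response_format(value: str) -> str:
    """Raise a ValueError if value is not one of the allowed formats."""
    allowed = typing.get_args(ResponseFormat)  # ("text", "srt", "vtt")
    if value not in allowed:
        raise ValueError(f"response_format must be one of {allowed}, got {value!r}")
    return value


print(validate_response_format("srt"))

try:
    validate_response_format("json")
except ValueError as error:
    print(f"Rejected: {error}")
```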



<p>Now we&#8217;ll define our transcription utility function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def transcribe(
    file: Path,
    language: str | None = None,
    translate: bool = False,
    response_format: ResponseFormat = "text",
) -> str:

    print("Transcribing file...")
    options = {
        "file": file,
        "model": MODEL,
        "response_format": response_format,
    }

    if translate:
        transcript = CLIENT.audio.translations.create(**options)
    else:
        if language:
            options["language"] = language
        transcript = CLIENT.audio.transcriptions.create(**options)

    if not isinstance(transcript, str):
        raise TypeError(
            f"Expected a string value to be returned, but got {type(transcript)} instead."
        )
    print(f"Transcription successful:\n{transcript[:100]}...")

    return transcript</pre>



<p>We define a function called <code>transcribe</code> which takes a <code>file</code> of type <code>Path</code>, a <code>language</code> of type <code>str</code> or <code>None</code>, which defaults to <code>None</code>, in which case the API will try to detect the language automatically. We also have a <code>translate</code> boolean which defaults to <code>False</code>, and a <code>response_format</code> which has to be of type <code>ResponseFormat</code>, so one of the three values we defined in the type alias, and defaults to <code>text</code>. The function returns a string.</p>



<p>We print a message to indicate the transcription is starting and then create a dictionary named <code>options</code> holding the options that are needed for both a translation and a transcription call, so the shared options if you will. These are the <code>file</code>, <code>model</code>, and <code>response_format</code>. If the user requests a translation we call the <code>CLIENT.audio.translations.create</code> method, passing in the <code>**options</code> dictionary as arguments as is. If <code>translate</code> is <code>False</code> it must be a transcription. For transcriptions, we can add the <code>language</code> key to the options dictionary to specify the language, but if the user didn&#8217;t provide it we can leave it out and it will just take a bit longer to do the auto-detection. This time we call the <code>CLIENT.audio.transcriptions.create</code> method, again passing in the <code>**options</code> dictionary which optionally now contains the <code>language</code> key.</p>



<p>Finally, we check if the <code>transcript</code> is a string, and if not we raise a <code>TypeError</code> to indicate something went wrong, just to make sure the user is not requesting JSON from this endpoint, which is possible and would crash the rest of our code. Otherwise, we print a message to indicate the transcription was successful and return the <code>transcript</code>.</p>



<h2 class="wp-block-heading">Video to Quiz</h2>



<p>As we&#8217;re going to be building a video-to-quiz app, we need one more utility function inside this <code>openai_api.py</code> file, which will take a transcript and generate some questions for us. Continue below the <code>transcribe</code> function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">PROMPT_SETUP = """You are a text-to-quiz app. The user will provide you a video transcription in textual format. You will generate a list of questions for the user to answer about this video. Depending on the length of the transcription, stick to a maximum of 5 questions. All questions should be solely about the video transcription content provided by the user and should be answerable by reading the transcription. Do not provide the answers, but only the questions. The transcription the user provides is based on a video, and may include timestamps, please ignore these timestamps and just treat it as one single transcription containing all the content in the video.
List and number each item on a separate line.
"""

from tenacity import retry, stop_after_attempt, stop_after_delay</pre>



<p>First, we define a constant to hold the prompt setup instruction for ChatGPT. Just go ahead and copy mine. It&#8217;s a fairly basic setup that asks for questions related to the video so we can make a quiz tailor-made for the input video. We also import <code>retry</code>, <code>stop_after_attempt</code>, and <code>stop_after_delay</code> from the <code>tenacity</code> package. (Go ahead and move the tenacity imports line to the top of your file with the other imports instead of here in the middle.) We can use these to make our code a bit more robust when calling APIs or taking actions that do not have a 100% success rate. It&#8217;s fairly easy to use and I just want to show you that this tool is out there, you&#8217;ll see how it works in a second.</p>



<p>Let&#8217;s code up the function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def text_to_quiz(text: str) -> str:
    print("Converting text to quiz...")
    messages = [
        {"role": "system", "content": PROMPT_SETUP},
        {"role": "user", "content": text},
    ]
    result = CLIENT.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=messages,
    )
    content = result.choices[0].message.content
    if content is None:  # Just a quick sanity check
        raise ValueError("There was an error while trying to generate the quiz.")
    print("Text to quiz conversion completed.")
    return content</pre>



<p>Our function takes a string which is the transcription and returns a string as output. We create a list of messages with the first being the system message, holding our <code>PROMPT_SETUP</code>, and the second being the user message which has the transcription as its content. We then call the <code>CLIENT.chat.completions.create</code> method, passing in the <code>model</code> and <code>messages</code> as arguments. We&#8217;ll use <code>gpt-3.5-turbo-1106</code>, which at the time of writing is the newest gpt-3.5 model and is frankly good enough. You can use gpt-4 but make sure you consider the cost, as it is considerably more expensive and not really needed for this use case. If you&#8217;re worried about the lower maximum input size, or &#8216;context window&#8217;, of gpt-3.5, know that it has a 16k context limit that can easily handle long video transcriptions, though most are not really as long as you might think they are.</p>



<p>We then access the <code>content</code> of the first choice&#8217;s message in the <code>result</code> object, which should hold our quiz. We do a quick sanity check to make sure we received a valid response, and then print a message to indicate the conversion was successful and return the <code>content</code>.</p>



<p>So that&#8217;s pretty simple, right? But what if we get no content back? Do we really want to just raise an error and give up immediately? Let&#8217;s use the tenacity library so we can try again in case of a failure. The only thing we have to change is to add the <code>@retry</code> decorator above our function, so only the first line is new:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">@retry(stop=stop_after_attempt(3) | stop_after_delay(60))
def text_to_quiz(text: str) -> str:
    print("Converting text to quiz...")
    messages = [
        {"role": "system", "content": PROMPT_SETUP},
        {"role": "user", "content": text},
    ]
    result = CLIENT.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=messages,
    )
    content = result.choices[0].message.content
    if content is None:  # Just a quick sanity check
        raise ValueError("There was an error while trying to generate the quiz.")
    print("Text to quiz conversion completed.")
    return content</pre>



<p>And just like that, our function is set up to try up to three times or (<code>|</code>) for a max of 60 seconds, just in case the API call fails for some reason. Notice how easy it is to use the Tenacity library. This is not required but it&#8217;s a nice way to make your code more robust just in case.</p>
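<p>To get a feel for what such a decorator does conceptually, here is a hand-rolled sketch using only the standard library. It is not how Tenacity is implemented: <code>simple_retry</code> is a hypothetical, simplified stand-in that stops after a maximum number of attempts or once a time budget has passed, whichever comes first.</p>

```python
import functools
import time


def simple_retry(max_attempts: int = 3, max_seconds: float = 60.0):
    """Simplified, hypothetical stand-in for a retry decorator."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.monotonic()
            attempt = 0
            while True:
                attempt += 1
                try:
                    return func(*args, **kwargs)
                except Exception:
                    out_of_attempts = attempt >= max_attempts
                    out_of_time = time.monotonic() - started >= max_seconds
                    if out_of_attempts or out_of_time:
                        raise  # give up and surface the last error
        return wrapper
    return decorator


attempts = {"count": 0}


@simple_retry(max_attempts=3, max_seconds=60)
def flaky() -> str:
    """Fails twice, then succeeds on the third call."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ValueError("temporary failure")
    return "ok"


result = flaky()
print(result, attempts["count"])
```

<p>One difference to be aware of: this sketch re-raises the original exception when it gives up, whereas Tenacity by default wraps the final failure in its own <code>RetryError</code> unless you pass <code>reraise=True</code> to <code>@retry</code>.</p>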



<h2 class="wp-block-heading">Putting it all together</h2>



<p>That&#8217;s our <code>openai_api.py</code> file done! Go ahead and save and close it. Now let&#8217;s create a new file in our project root directory called <code>4_vid_to_quiz.py</code> to put it all together:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />command.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />openai_api.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />podcast.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />subtitles.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />video.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />2_whisper_pods.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />3_subtitle_master.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />4_faster_whisper.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />4_vid_to_quiz.py   (<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>Inside <code>4_vid_to_quiz.py</code> let&#8217;s start with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os
import uuid
from pathlib import Path

import gradio as gr

from settings import BASE_DIR, OUTPUT_TEMP_DIR, STYLES_DIR
from utils import openai_api, video


API_UPLOAD_LIMIT_BYTES = 26214400  # 25 MB (25 * 1024 * 1024 bytes)</pre>



<p>We will use <code>os</code> to check the size of the file we will upload, as there is a size limit to the API. We have some imports you&#8217;ve seen before, and some of our directories from the <code>settings</code> file plus our <code>openai_api</code> and <code>video</code> utilities. We also define a constant <code>API_UPLOAD_LIMIT_BYTES</code> which is the maximum size of the file we can upload to the API, which is 25 MB.</p>



<p>Let&#8217;s start with a quick function to check if the file is not too big:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def check_upload_size(input_file: str) -> None:
    """Check the video file size is within the API upload limit."""
    input_file_size = os.path.getsize(input_file)
    if input_file_size > API_UPLOAD_LIMIT_BYTES:
        raise ValueError(
            f"File size of {input_file_size} bytes ({input_file_size / 1024 / 1024:.2f} MB) exceeds the API upload limit of {API_UPLOAD_LIMIT_BYTES} bytes ({API_UPLOAD_LIMIT_BYTES / 1024 / 1024:.2f} MB). Please use a shorter video or lower the audio quality settings."
        )</pre>



<p>We take an input file path as a string, use <code>os.path.getsize</code> to get the size of the file in bytes, and check whether it is larger than our <code>API_UPLOAD_LIMIT_BYTES</code>. If it is, we raise a <code>ValueError</code> whose message reports both the file size and the API upload limit. That&#8217;s all there is to this function.</p>



<p>Let&#8217;s move on to our <code>main</code> function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def main(input_video: str) -> str:
    """Takes a video file as string path and returns a quiz as string."""
    unique_id = uuid.uuid4()

    mp3_file = video.to_mp3(
        input_video,
        log_directory=BASE_DIR,
        output_path=OUTPUT_TEMP_DIR / f"{unique_id}.mp3",
        mono=True,
    )

    check_upload_size(mp3_file)
    transcription = openai_api.transcribe(
        Path(mp3_file), language="en", translate=False, response_format="text"
    )

    quiz = openai_api.text_to_quiz(transcription)
    return quiz</pre>



<p>This is the function the gradio button will call when clicked. It takes an input_video as string input and will return the quiz in string format. We don&#8217;t really care about the name of the mp3 file we&#8217;ll extract from the video here so we just use a <code>uuid</code> to make it unique. Now we use our <code>video.to_mp3</code> utility function from the previous part to extract the audio from the video.</p>



<p>We pass in the <code>input_video</code> as the video file, our project root directory as the <code>log_directory</code>, and our <code>output_path</code> is the <code>OUTPUT_TEMP_DIR</code> with the <code>uuid</code> and <code>.mp3</code> extension pasted on. Finally, this is the time to use the <code>mono</code> option we built into the <code>to_mp3</code> function but didn&#8217;t use last time. So far the size of our files has not been that important, but now that we have a web API it suddenly becomes relevant.</p>
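<p>The unique output path can be tried out on its own. In this sketch <code>unique_mp3_path</code> is a hypothetical helper, and <code>OUTPUT_TEMP_DIR</code> is given an assumed value here because the real one lives in the course&#8217;s <code>settings.py</code>.</p>

```python
import uuid
from pathlib import Path

OUTPUT_TEMP_DIR = Path("output_temp_files")  # assumed value for illustration


def unique_mp3_path(directory: Path) -> Path:
    """Build a collision-resistant .mp3 path inside the given directory."""
    return directory / f"{uuid.uuid4()}.mp3"


first = unique_mp3_path(OUTPUT_TEMP_DIR)
second = unique_mp3_path(OUTPUT_TEMP_DIR)
print(first)
print(second)
```

<p>Because every call generates a fresh random UUID, two users uploading videos at the same time will not overwrite each other&#8217;s temporary files.</p>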



<p>Whisper down-mixes audio to mono before processing anyway, and the API has an upload limit of roughly 25MB per transcription request. So we can save a lot of space by dropping the channels to 1, from stereo to mono audio, which allows us to make much longer requests as we can drastically lower the bitrate with only 1 audio channel.</p>



<p>Sending stereo audio at 192kbps would exceed the file limit after roughly 18 minutes of audio. We more than halved the bitrate to 80kbps, which is still considered decent quality for mono mp3 files and allows us to transcribe much longer files. You can also try playing with the other audio quality settings, or lower the bitrate even further to 64kbps for mono if you want to fit in even more audio.</p>
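


<p>As a rough sanity check on these durations, here is a back-of-the-envelope sketch. It assumes constant-bitrate audio and a limit of exactly 25MB (25 * 1024 * 1024 bytes); the <code>max_minutes</code> helper is a name made up for this illustration and is not part of the project code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">UPLOAD_LIMIT_BYTES = 25 * 1024 * 1024  # assumption: a limit of exactly 25MB


def max_minutes(bitrate_kbps: int, limit_bytes: int = UPLOAD_LIMIT_BYTES) -> float:
    """Estimate how many minutes of constant-bitrate audio fit under the limit."""
    bytes_per_second = bitrate_kbps * 1000 / 8  # kilobits per second to bytes per second
    return limit_bytes / bytes_per_second / 60


for kbps in (192, 80, 64):
    print(f"{kbps} kbps: roughly {max_minutes(kbps):.1f} minutes")</pre>



<p>Real files will deviate a little from this estimate because of metadata and encoder details, so treat the result as a ballpark figure rather than a hard guarantee.</p>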



<p>After that, we run our <code>check_upload_size</code> check to make sure the file is not too large, and then we call our <code>openai_api.transcribe</code> function, passing in the <code>mp3_file</code> as the <code>file</code>, <code>language="en"</code> as the language, <code>translate=False</code> as we don&#8217;t want to translate, and <code>response_format="text"</code> as we want the transcription in text format. We then call our <code>openai_api.text_to_quiz</code> function, passing in the <code>transcription</code> as the <code>text</code> and returning the resulting <code>quiz</code>.</p>



<h2 class="wp-block-heading">Gradio Interface</h2>



<p>Finally, we&#8217;ll create our gradio interface:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">if __name__ == "__main__":
    block = gr.Blocks(
        css=str(STYLES_DIR / "vid2quiz.css"),
        theme=gr.themes.Soft(primary_hue=gr.themes.colors.yellow),
    )

    with block:
        with gr.Group():
            gr.HTML(
                f"""
                &lt;div class="header">
                &lt;img src="https://i.imgur.com/oEtZKEh.png" referrerpolicy="no-referrer" class="header-img" />
                &lt;/div>
                """
            )
            with gr.Row():
                input_video = gr.Video(
                    label="Input Video", sources=["upload"], mirror_webcam=False
                )
                output_quiz_text = gr.Textbox(label="Quiz")
            with gr.Row():
                button_text = "📝 Make a quiz about this video! 📝"
                btn = gr.Button(value=button_text, elem_classes=["button-row"])

            btn.click(main, inputs=[input_video], outputs=[output_quiz_text])

    block.launch(debug=True)</pre>



<p>All of this will be familiar by now. I just used a different CSS file, which we&#8217;ll have to create, and a slightly different <code>primary_hue</code> for the theme than last time. The &#8216;imgur&#8217; image link has changed as well to give you a new header logo, and below that we just take an input video and have an output <code>Textbox</code>. Our button has a CSS class of <code>button-row</code> again so we can style it, and clicking the button runs the <code>main</code> function with the input video as input and the result going to the output textbox.</p>



<p>Let&#8217;s add the CSS file to our <code>styles</code> folder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_WHISPER (project root folder)
    📁output_temp_files
    📁output_video
    📁styles
        📄subtitle_master.css
        📄vid2quiz.css      (✨new file)
        📄whisper_pods.css
    📁test_audio_files
    📁utils
        📄command.py
        📄openai_api.py
        📄podcast.py
        📄subtitles.py
        📄video.py
    📄1_basic_call_english_only.py
    📄1_multiple_languages.py
    📄2_whisper_pods.py
    📄3_subtitle_master.py
    📄4_faster_whisper.py
    📄4_vid_to_quiz.py
    📄settings.py
    📄.env</pre>



<p>And inside <code>vid2quiz.css</code> let&#8217;s add the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">.header {
  display: flex;
  justify-content: center;
  align-items: center;
  padding: 2em 8em;
}

.header-img {
  max-width: 50%;
}

.header,
.button-row {
  background-color: #0c1d36;
}</pre>



<p>We use <code>flex</code> to center the header image vertically and horizontally and apply the usual padding. We give the <code>header-img</code> class a <code>max-width</code> of 50% so it doesn&#8217;t take up the entire width of the screen. Finally, we give the <code>header</code> and <code>button-row</code> classes a background color of <code>#0c1d36</code> which is a dark blue color.</p>



<p>Ok, you know the drill, let&#8217;s run it and see what happens!</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/01/4_vid2quiz_interface-1024x531.png" alt="" class="wp-image-4066"/></figure>
</div>


<p>Ok, looking good, so let&#8217;s upload a video and then request a quiz about it. I used a random video from YouTube, namely <a href="https://www.youtube.com/watch?v=fb-58KobeFU">Hot Dr Pepper from the 1960s</a>, just because it showed up when I opened the YouTube website. Let&#8217;s see how it does:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/01/4_vid2quiz_output-1024x587.png" alt="" class="wp-image-4065"/></figure>



<p>Perfect, exactly what we wanted, and this was all powered by the OpenAI API! You&#8217;ll also notice it was probably reasonably fast, considering it had to convert the whole video and then transcribe it and generate a quiz.</p>



<p>One important limitation of the app in this particular form is that it can only handle videos up to roughly 43 minutes in length (with the 80kbps mono settings), because of the upload limit. If you want to handle longer videos you could split them up and put the transcripts back together, but honestly, if you&#8217;re going to be handling files of that length you&#8217;re probably better off deploying the model yourself to save on cost, as the API is billed per minute of audio.</p>
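


<p>If you do want to try the splitting route, a minimal sketch could look like the code below. It only builds the ffmpeg command (using ffmpeg&#8217;s <code>segment</code> output format) and glues transcripts back together. The function names, the 30-minute chunk length, and the output naming pattern are all assumptions made up for this illustration, not part of the course code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from pathlib import Path


def build_split_command(mp3_file: Path, chunk_seconds: int = 1800) -> str:
    """Build an ffmpeg command that cuts an mp3 into chunks without re-encoding."""
    # The chunk length and the output name pattern are illustrative choices.
    pattern = mp3_file.with_name(f"{mp3_file.stem}_part%03d.mp3")
    return (
        f'ffmpeg -i "{mp3_file}" -f segment -segment_time {chunk_seconds} '
        f'-c copy "{pattern}"'
    )


def join_transcripts(parts: list[str]) -> str:
    """Glue the per-chunk transcripts back together in their original order."""
    return " ".join(part.strip() for part in parts)


print(build_split_command(Path("lecture.mp3")))
print(join_transcripts(["First half of the talk. ", " Second half of the talk."]))</pre>



<p>Keep in mind that a fixed-length split may cut a word or sentence in half at the chunk boundary, so the joined transcript can contain a small glitch at each seam.</p>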



<p>A fun idea is that you can also use the translation option in our <code>openai_api.transcribe</code> function (<code>translate=True</code>) to take foreign language videos as input and produce English questions about them as output. This could be cool for a foreign language learning app or test.</p>



<p>So that&#8217;s it for the Whisper course. I hope you enjoyed it and now have a good idea of how to use Whisper, what you can use it for, and the various deployment options. The next step is up to you and limited only by your imagination!</p>



<p>As always, it was an honor and a pleasure to take this journey together, and I hope to see you next time!</p>



<h2 class="wp-block-heading">Full Course: OpenAI Whisper &#8211; Building Cutting-Edge Python Apps with OpenAI Whisper</h2>



<p>Check out our full OpenAI Whisper course with video lessons, easy explanations, GitHub, and a downloadable PDF certificate to prove your speech processing skills to your employer and freelancing clients:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://academy.finxter.com/university/openai-whisper/"><img loading="lazy" decoding="async" width="908" height="257" src="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png" alt="" class="wp-image-1654506" srcset="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png 908w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-300x85.png 300w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-768x217.png 768w" sizes="auto, (max-width: 908px) 100vw, 908px" /></a></figure>
</div>


<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> [<strong>Academy</strong>] <a href="https://academy.finxter.com/university/openai-whisper/" data-type="link" data-id="https://academy.finxter.com/university/openai-whisper/">Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</a></p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-speeding-up-or-outsourcing-the-processing/">OpenAI Whisper &#8211; Speeding Up or Outsourcing the Processing</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI Whisper Example &#8211; Building a Subtitle Generator &#038; Embedder</title>
		<link>https://blog.finxter.com/openai-whisper-example-building-a-subtitle-generator-embedder/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Thu, 25 Jan 2024 19:57:05 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Speech Recognition and Generation]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1654504</guid>

					<description><![CDATA[<p>🎙️ Course: This article is based on a lesson from our Finxter Academy Course Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper. Check it out for video lessons, GitHub, and a downloadable PDF course certificate with your name on it! Welcome back to part 3, where we&#8217;ll use Whisper to build another really ... <a title="OpenAI Whisper Example &#8211; Building a Subtitle Generator &#038; Embedder" class="read-more" href="https://blog.finxter.com/openai-whisper-example-building-a-subtitle-generator-embedder/" aria-label="Read more about OpenAI Whisper Example &#8211; Building a Subtitle Generator &#038; Embedder">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-example-building-a-subtitle-generator-embedder/">OpenAI Whisper Example &#8211; Building a Subtitle Generator &#038; Embedder</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f399.png" alt="🎙" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Course</strong>: This article is based on a lesson from our <strong>Finxter Academy Course</strong> <a href="https://academy.finxter.com/university/openai-whisper/"><em>Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</em></a>. Check it out for video lessons, GitHub, and a downloadable PDF course certificate with your name on it!</p>



<p>Welcome back to part 3, where we&#8217;ll use Whisper to build another really cool app. In this part, we&#8217;ll look at how to work with video files. After all, many of the practical applications of speech recognition don&#8217;t come in convenient MP3 files, but rather in video files. We&#8217;ll be building a subtitle generator and embedder, which will take a video file as input, transcribe it, and then embed the subtitles into the video file itself, feeding the result back to the end user.</p>



<p>Before we can get started on the main code, we will need to write some utilities again, just like in the previous part. The utilities we&#8217;ll need this time are:</p>



<ul class="wp-block-list">
<li>Subtitles -&gt; We can just reuse the subtitle-to-disk utility from the previous part. (Done ✔)</li>



<li>Video -&gt; We will need a way to convert a video file to an mp3 file so that we can feed it to Whisper.</li>



<li>Commands -&gt; We will need a way to run commands on the command line, as there are multiple ffmpeg commands we&#8217;ll need to run both for the video conversion and the subtitle embedding.</li>
</ul>



<p>So let&#8217;s get started with the command utility. Inside the <code>utils</code> folder, first create a new file named <code>command.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_WHISPER (project root folder)
    📁output_temp_files
    📁output_video
    📁styles
    📁test_audio_files
    📁utils
        📄__init__.py
        📄podcast.py
        📄subtitles.py
        📄command.py   (✨new file)
    📄1_basic_call_english_only.py
    📄1_multiple_languages.py
    📄2_whisper_pods.py
    📄settings.py
    📄.env</pre>



<p>Then inside the <code>command.py</code> file let&#8217;s start with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import datetime
import subprocess
from pathlib import Path</pre>



<p>We&#8217;re going to run commands and provide some very basic logging as well. We imported the <code>datetime</code> module so we can add timestamps to our logs, and pathlib should be familiar by now. The <code>subprocess</code> module in Python is used to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. It allows you to execute system commands and interact with them programmatically. It&#8217;s basically a bit like opening a terminal window inside your Python code.</p>
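


<p>To get a feel for <code>subprocess</code> before we build on it, here is a tiny standalone sketch. It is not part of <code>command.py</code>; it simply starts a second Python interpreter as a child process, using <code>sys.executable</code> so the command exists on every platform, and reads back its exit code and output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import subprocess
import sys

# sys.executable is the path of the Python interpreter running this script.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,  # decode the captured output to str
)

print(result.returncode)  # exit code of the child process
print(result.stdout.strip())  # whatever the child process printed</pre>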



<p>Next, we&#8217;ll start with an extremely simple function that will print a message but in blue letters:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def print_blue(message: str) -> None:
    print(f"\033[94m{message}\033[00m")</pre>



<p>The <code>\033[94m</code> and <code>\033[00m</code> are ANSI escape codes, which are used to add color and formatting to text in terminal output. The <code>94</code> is the code for blue, and the <code>00</code> is the code for reset. You can find a list of all the codes here: https://en.wikipedia.org/wiki/ANSI_escape_code#Colors. We will print the commands we execute to the terminal in blue, which helps them stand out from the other white text output and makes it easier for us to check our commands.</p>
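


<p>If you want more than one color, the same idea generalizes easily. The sketch below is purely illustrative (the <code>colorize</code> helper and the color table are names I made up here) and uses the bright foreground codes 91 to 94:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ANSI_COLORS = {"red": 91, "green": 92, "yellow": 93, "blue": 94}
RESET = "\033[00m"


def colorize(message: str, color: str = "blue") -> str:
    """Wrap a message in the ANSI codes for the given bright color."""
    return f"\033[{ANSI_COLORS[color]}m{message}{RESET}"


print(colorize("Running command...", "blue"))
print(colorize("Something went wrong!", "red"))</pre>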



<h2 class="wp-block-heading">Running system commands</h2>



<p>Next, we&#8217;ll create a function that will run a command like you would run on the command line:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def run_and_log(command: str, log_directory: Path) -> None:
    print_blue(f"Running command: \n{command}")
    with open(log_directory / "commands_log.txt", "a+", encoding="utf-8") as file:
        subprocess.call(
            command,
            stdout=file,
            stderr=file,
        )
        file.write(
            f"\nRan command: {command}\nDate/time: {datetime.datetime.now()}\n\n\n\n"
        )</pre>



<p>We create a function called <code>run_and_log</code>, which takes two arguments: <code>command</code> which is a string, and <code>log_directory</code> which is a Path and indicates the directory where we want to save the log file. We then print the command we&#8217;re about to execute in blue, and then open the log file in append mode. The <code>a+</code> means that we will append to the file if it exists, and create it if it doesn&#8217;t. Again, we use the <code>encoding="utf-8"</code> argument to make sure that we can write non-ASCII characters to the file as well. If you do not do this you will eventually run into trouble.</p>



<p>Inside the <code>with open</code> context manager, so while the file is open, we call the <code>subprocess.call</code> function. This function takes a command as input and executes it, so as the first argument we pass the <code>command</code> variable. The second argument is <code>stdout=file</code>, which means that we will write the output of the command to the file (instead of the console). The third argument is <code>stderr=file</code>, which means that we will write any errors to the file as well. So we basically execute the command and whatever output there is gets logged inside the text file.</p>



<p>After that, we write what command we executed and a timestamp to the file, and use a couple of <code>\n</code> to add some newlines to the file so that the next command will be lower down, making them easy to distinguish from each other.</p>
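


<p>One caveat if you are following along on Linux or macOS: there, a plain command string passed without <code>shell=True</code> is treated as the name of a single program, so the call will typically fail. A possible workaround, sketched below under that assumption, is to split the string into a list of arguments first using <code>shlex.split</code>. The name <code>run_and_log_portable</code> is hypothetical and the rest of this tutorial keeps using <code>run_and_log</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import datetime
import shlex
import subprocess
import sys
from pathlib import Path


def run_and_log_portable(command: str, log_directory: Path) -> int:
    """Variant of run_and_log that splits the command on non-Windows systems."""
    # Windows accepts the command as one string; elsewhere we pass a list.
    arguments = command if sys.platform == "win32" else shlex.split(command)
    with open(log_directory / "commands_log.txt", "a+", encoding="utf-8") as file:
        return_code = subprocess.call(arguments, stdout=file, stderr=file)
        file.write(
            f"\nRan command: {command}\nDate/time: {datetime.datetime.now()}\n\n\n\n"
        )
    return return_code</pre>



<p>Returning the exit code is an extra I added in this sketch so the caller can check whether the command succeeded; a value of <code>0</code> conventionally means success.</p>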



<p>Now let&#8217;s run a quick test, using the extremely simple terminal command <code>echo 'hello'</code>, which will simply print <code>hello</code> to the console. Let&#8217;s run this command and see if our function works:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">run_and_log("echo 'hello'", Path.cwd())</pre>



<p>For the path we&#8217;ve used the <code>Path.cwd()</code> method in Python&#8217;s <code>pathlib</code> module, which returns the current working directory as a <code>Path</code> object. This is the terminal&#8217;s current directory when you run the script. (This is just for a quick test; we don&#8217;t want to go through the trouble of importing the base directory in here.)</p>



<p>Go ahead and run the <code>command.py</code> file, and whatever directory your terminal was in when you ran the script should now have a file named <code>commands_log.txt</code> with the following inside:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">hello

Ran command: echo 'hello'
Date/time: 2024-01-14 12:13:49.535692</pre>



<p>It worked! We&#8217;ve successfully logged the output of <code>hello</code> followed by our logging information of the time and command executed. Make sure you remove or comment out the <code>run_and_log</code> line before we continue, as we don&#8217;t want to run this command every time we run the script.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># run_and_log("echo 'hello'", Path.cwd())</pre>



<h2 class="wp-block-heading">A peculiar issue with slashes</h2>



<p>With our <code>run_and_log</code> function completed, we have just one more function to create in here. There is a small discrepancy in file path formats: ffmpeg expects the subtitle path in a different format than the one our Python code gives us, so we need to write a short utility to fix the path. This issue only occurs with the subtitle path when embedding the subtitles using ffmpeg system commands, most likely because that path ends up inside an ffmpeg filter argument, where <code>:</code> and <code>\</code> have a special meaning and must be escaped. This is the type of thing you will run into during your software development journey.</p>



<p>If you keep looking you&#8217;ll always find a solution, never despair, but I&#8217;ll save you this time and tell you about the issue ahead of time!</p>



<ul class="wp-block-list">
<li>The path <code>C:\Users\dirk\test/subtitle.vtt</code> will not work in the command and will give errors, as it gets mangled and can no longer be parsed as a valid path.</li>



<li>What we need is <code>C\:\\Users\\dirk\\test\\subtitle.vtt</code> instead. Notice there is an extra <code>\</code> after the <code>C</code> and after every <code>\</code> in the path. The first <code>\</code> is an escape character, which means that the second <code>\</code> is not interpreted as a special character but as a literal <code>\</code>.</li>



<li>This issue only affects the subtitle path and not the input or output video paths, so we only need to fix the subtitle path.</li>
</ul>



<p>Below the <code>run_and_log</code> function inside the <code>command.py</code> file, add a new function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def format_ffmpeg_filepath(path: Path) -> str:
    r"""Turns C:\Users\dirk\test/subtitle.vtt into C\:\\Users\\dirk\\test\\subtitle.vtt"""
    string_path = str(path)
    return string_path.replace("\\", "\\\\").replace("/", "\\\\").replace(":", "\\:")</pre>



<p>We take a <code>Path</code> as input and first convert it to a string so we can use string methods on it to fix the format. We then use the <code>replace</code> method to replace every <code>\</code> with <code>\\</code> and every <code>/</code> with <code>\\</code>. We also replace the <code>:</code> with <code>\:</code>. Now I see you looking mighty confused! Why so many backslashes? Well, inside a Python string literal the backslash is the escape character, so the first <code>\</code> tells Python that the second one is not the start of an escape sequence but a literal backslash character.</p>



<ul class="wp-block-list">
<li>So in order to replace <code>\</code> we need to target it using <code>\\</code>, as the escape character indicates that we mean a literal <code>\</code> character. A single <code>\</code> won&#8217;t work, as it would be interpreted as the start of an escape sequence.</li>



<li>Likewise, to replace it with <code>\\</code> we need to type <code>\\\\</code>, as each literal backslash needs its own escaping backslash in front of it.</li>



<li>So the above function just means that <code>\</code> is replaced by <code>\\</code>, <code>/</code> is replaced by <code>\\</code>, and <code>:</code> is replaced by <code>\:</code>. It just looks so confusing because the escape characters also happen to be backslashes! Phew 🤯.</li>
</ul>
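


<p>If you want to convince yourself that the escaping described above really works out, here is a small standalone check. The function body is repeated so the snippet runs on its own, and the example paths are written as raw strings (the <code>r</code> prefix), which is just a convenience for this illustration so we don&#8217;t have to double every backslash by hand:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def format_ffmpeg_filepath(path) -> str:
    string_path = str(path)
    return string_path.replace("\\", "\\\\").replace("/", "\\\\").replace(":", "\\:")


# "\\" in source code is a string holding exactly one backslash character.
assert len("\\") == 1

# Raw strings keep every backslash exactly as typed.
original = r"C:\Users\dirk\test/subtitle.vtt"
expected = r"C\:\\Users\\dirk\\test\\subtitle.vtt"

print(format_ffmpeg_filepath(original))
assert format_ffmpeg_filepath(original) == expected</pre>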



<h2 class="wp-block-heading">Video utility functions</h2>



<p>Okay so with that out of the way, go ahead and save and close the <code>command.py</code> file. It&#8217;s time for our <code>video</code> utility file next, so create a new file called <code>video.py</code> inside the utils folder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_WHISPER (project root folder)
    📁output_temp_files
    📁output_video
    📁styles
    📁test_audio_files
    📁utils
        📄__init__.py
        📄podcast.py
        📄subtitles.py
        📄command.py
        📄video.py   (✨new file)
    📄1_basic_call_english_only.py
    📄1_multiple_languages.py
    📄2_whisper_pods.py
    📄settings.py
    📄.env</pre>



<p>Don&#8217;t worry, this one won&#8217;t be so bad <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" />! Open up your new <code>video.py</code> file and let&#8217;s start with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from pathlib import Path
from . import command</pre>



<p>All we need is <code>Path</code> for input argument type-hinting and the <code>command</code> module we just created. Next, we&#8217;ll create a function that will convert a video file to an mp3 file so it can be fed to Whisper:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def to_mp3(
    input_video: str, log_directory: Path, output_path: Path, mono: bool = False
) -> str:
    output_path_string = str(output_path)

    channels = 1 if mono else 2
    bitrate = 80 if mono else 192

    command_to_run = f'ffmpeg -i "{input_video}" -vn -ar 44100 -ac {channels} -b:a {bitrate}k "{output_path_string}"'
    command.run_and_log(command_to_run, log_directory)
    print(f"Video converted to mp3 and saved to {output_path_string}")

    return output_path_string</pre>



<p>We define a function named <code>to_mp3</code> which takes an <code>input_video</code> as a string, a <code>log_directory</code> as a <code>Path</code>, an <code>output_path</code> as a <code>Path</code>, and a <code>mono</code> option as a boolean. The function returns the output path as a string. The <code>input_video</code> path is a string rather than a <code>Path</code> object like <code>log_directory</code> and <code>output_path</code> because that is how gradio will hand it to us. Always keep track of the type of each variable, or sooner or later you will pass a <code>Path</code> object where a string is expected, or vice versa.</p>



<p>First, we get a string version of the <code>output_path</code> and save it in <code>output_path_string</code>. Then we check if the <code>mono</code> option is set to <code>True</code> or <code>False</code>, and set the <code>channels</code> and <code>bitrate</code> variables accordingly. If <code>mono</code> is <code>True</code> we set <code>channels</code> to <code>1</code> and <code>bitrate</code> to <code>80</code>, and if <code>mono</code> is <code>False</code> we set <code>channels</code> to <code>2</code> and <code>bitrate</code> to <code>192</code>. We won&#8217;t actually need this mono option until part 4, but we might as well add it now.</p>



<p>Then we get to the command, first preparing it in a variable named <code>command_to_run</code>. We use the <code>ffmpeg</code> command and pass in the <code>input_video</code> as the input file (<code>-i</code>). We then use the <code>-vn</code> option to leave the video stream out of the output, the <code>-ar</code> option to set the audio sampling frequency to 44100 Hz, the <code>-ac</code> option to set the number of audio channels to the value of <code>channels</code>, and the <code>-b:a</code> option to set the audio bitrate to the value of <code>bitrate</code> in kbps. We then pass in the <code>output_path_string</code> as the output file location.</p>



<p>Notice that the command is contained inside an f-string which has single quotes on the outside (<code>f'command'</code>). Make sure you imitate this perfectly, using the single quotes on the outside and the double quotes around the variable names of <code>"{input_video}"</code> and <code>"{output_path_string}"</code>. We need these double quotes because the user input video file is likely to have spaces in the name, and not having double quotes around a name with spaces inside will cause the command to fail.</p>
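


<p>As a minimal sketch of why the double quotes matter, the snippet below rebuilds the same command string and splits it with Python&#8217;s <code>shlex.split</code>, which follows POSIX shell rules and is only an approximation of what your own shell does. The helper name <code>build_mp3_command</code> and the file names are made up for illustration, and nothing is actually executed:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import shlex


def build_mp3_command(input_video: str, output_path: str, mono: bool = False) -> str:
    """Hypothetical helper: builds the same command string as to_mp3, without running it."""
    channels = 1 if mono else 2
    bitrate = 80 if mono else 192
    return f'ffmpeg -i "{input_video}" -vn -ar 44100 -ac {channels} -b:a {bitrate}k "{output_path}"'


# Example file names with spaces (assumptions for this demo only)
quoted = build_mp3_command("my holiday video.mp4", "out/my holiday video.mp3", mono=True)
unquoted = quoted.replace('"', "")  # the same command with the double quotes stripped

quoted_args = shlex.split(quoted)
unquoted_args = shlex.split(unquoted)

print(quoted_args[2])     # the input file stays one argument
print(unquoted_args[2:5])  # the input file is broken into separate arguments</pre>



<p>With the quotes in place the file name survives as a single argument. Without them it falls apart into three pieces, and <code>ffmpeg</code> would go looking for an input file named <code>my</code>.</p>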



<p>Then we call the <code>run_and_log</code> function from our <code>command</code> module, passing in the command and the directory we want to log to. Finally, we print a message to the console and return the <code>output_path_string</code>.</p>



<p>That completes our <code>video.py</code> file, so go ahead and save and close it. We&#8217;re ready to start on the main code now!</p>



<h2 class="wp-block-heading">Subtitle Master &#8211; Putting it all together</h2>



<p>In your root folder, create a new file named <code>3_subtitle_master.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />__init__.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />podcast.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />subtitles.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />command.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />video.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />2_whisper_pods.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />3_subtitle_master.py   (<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>Inside, let&#8217;s start with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os
import uuid

import gradio as gr
import whisper
from whisper.utils import WriteVTT

from settings import BASE_DIR, OUTPUT_TEMP_DIR, OUTPUT_VIDEO_DIR, STYLES_DIR
from utils import command, subtitles, video</pre>



<p>We import <code>os</code> to do some filename splitting, and all the other imports are familiar from previous parts. To finish up we import several directories from our <code>settings</code> file and the <code>command</code>, <code>subtitles</code>, and <code>video</code> modules from our <code>utils</code> folder, reusing the <code>subtitles</code> module from the previous part.</p>



<p>Next up are our constants for the file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">MODEL = whisper.load_model("base.en")
VTT_WRITER = WriteVTT(output_dir=str(OUTPUT_TEMP_DIR))</pre>



<p>First we load a model. I&#8217;ll start with <code>base.en</code> as it will probably be good enough to get started. Then we instantiate a <code>WriteVTT</code> object like we did last time, indicating we want to save the subtitles in the temp directory.</p>



<p>As we are going to be returning a video to the end user this time, I would like to include the original video name in the output file, though we&#8217;ll still need a uuid as well to guarantee unique names (the user might upload the same file twice!). So let&#8217;s create a quick function that gets us a unique project name. Say the user inputs a file named <code>my_video.mp4</code>, we want the function to return <code>my_video_0f646333-0464-43a1-a75c-ed57c47fbcd5</code> so that we basically have a uuid with the filename in front of it. We can then add <code>.mp3</code> or <code>.srt</code> or whatever file extension we need at the end, making sure all the files for this project have the same but unique project name.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def get_unique_project_name(input_video: str) -> str:
    """Get a unique subtitle-master project name to avoid file-name clashes."""
    unique_id = uuid.uuid4()
    filename = os.path.basename(input_video)
    base_fname, _ = os.path.splitext(filename)
    return f"{base_fname}_{unique_id}"</pre>



<p>The function takes the input path as a string and then generates a <code>uuid</code>. We then get the filename using <code>os.path.basename</code>, which takes a path like <code>C:\Users\dirk\test\my_video.mp4</code> and returns <code>my_video.mp4</code>. We then use <code>os.path.splitext</code> to split the filename into a base filename and an extension, so <code>my_video.mp4</code> becomes <code>my_video</code> and <code>.mp4</code>. We catch the base name as <code>base_fname</code> and the extension under the variable name <code>_</code> as we don&#8217;t need it. We then return the base filename with the uuid appended to it.</p>
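


<p>To see the naming scheme in action, here is a small self-contained sketch that repeats the function and calls it twice on the same path. The path below is just an example (forward slashes are used so it behaves the same on every operating system), and the uuid part will of course differ on every run:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os
import uuid


def get_unique_project_name(input_video: str) -> str:
    """Same logic as in the tutorial, repeated so this snippet runs on its own."""
    unique_id = uuid.uuid4()
    filename = os.path.basename(input_video)
    base_fname, _ = os.path.splitext(filename)
    return f"{base_fname}_{unique_id}"


# Example path (an assumption for this demo)
example_path = "/home/dirk/test/my_video.mp4"

first = get_unique_project_name(example_path)
second = get_unique_project_name(example_path)

print(first)            # my_video_ followed by a random uuid
print(first != second)  # True: the same upload gets a new name each time

# splitext only removes the last extension, so dots inside the name are kept
print(os.path.splitext("my.holiday.video.mp4"))  # ('my.holiday.video', '.mp4')</pre>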



<p>Now let&#8217;s get started on our main function below that will tie it all together:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def main(input_video: str) -> str:
    """Takes a video file as string path and returns a video file with subtitles embedded as string path."""
    unique_project_name = get_unique_project_name(input_video)
    get_temp_output_path = lambda ext: OUTPUT_TEMP_DIR / f"{unique_project_name}{ext}"
    mp3_file = video.to_mp3(
        input_video,
        log_directory=BASE_DIR,
        output_path=get_temp_output_path(".mp3"),
    )</pre>



<p>We&#8217;ll take an input video, which gradio will pass to our main function as a string path. The function will return a string path pointing towards the processed video file with embedded subtitles back to gradio. First, we get a unique project name using the function we just wrote. Then we create a simple lambda function like the one we had in part 2. It takes an extension like <code>.mp3</code> as input and returns <code>output_dir/project_name.mp3</code>. We&#8217;ll need temporary file paths for both our <code>.mp3</code> and our <code>.vtt</code> files, and this way we only have one place to change if we ever need to change the output directory.</p>



<p>Then we call the <code>to_mp3</code> function from our <code>video</code> module, passing in the input video, the project&#8217;s base directory as the log directory, and the output path as the <code>get_temp_output_path</code> lambda function with <code>.mp3</code> as the extension. We save the return of the function as the variable named <code>mp3_file</code>.</p>
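


<p>If the lambda feels a bit abstract, this short sketch shows what it returns. The <code>OUTPUT_TEMP_DIR</code> value here is a stand-in for the one in <code>settings.py</code>, and the project name is the example from earlier, so treat both as assumptions:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from pathlib import Path

# Stand-in values (assumptions): the real ones come from settings.py and get_unique_project_name
OUTPUT_TEMP_DIR = Path("output_temp_files")
unique_project_name = "my_video_0f646333-0464-43a1-a75c-ed57c47fbcd5"

# Same lambda as in main: Path / str joins the directory and the file name
get_temp_output_path = lambda ext: OUTPUT_TEMP_DIR / f"{unique_project_name}{ext}"

mp3_path = get_temp_output_path(".mp3")
vtt_path = get_temp_output_path(".vtt")

print(mp3_path.name)    # my_video_0f646333-0464-43a1-a75c-ed57c47fbcd5.mp3
print(vtt_path.suffix)  # .vtt
print(mp3_path.parent == vtt_path.parent)  # True: both files share one folder</pre>



<p>Both results are <code>Path</code> objects, which is exactly what <code>to_mp3</code> expects for its <code>output_path</code> argument.</p>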



<p>Continuing on:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def main(input_video: str) -> str:
    ...previous code...

    whisper_output = MODEL.transcribe(mp3_file, beam_size=5)
    vtt_subs = subtitles.write_to_file(
        whisper_output,
        writer=VTT_WRITER,
        output_path=get_temp_output_path(".vtt"),
    )</pre>



<p>We call the <code>transcribe</code> method on our <code>MODEL</code> object, which holds the loaded Whisper model, passing in the <code>mp3_file</code> as the input file, and setting the <code>beam_size</code> to <code>5</code>. We then call the <code>write_to_file</code> function from our <code>subtitles</code> module, passing in the <code>whisper_output</code> as the transcript, the <code>VTT_WRITER</code> as the writer, and the result of calling the <code>get_temp_output_path</code> lambda function with <code>.vtt</code> as the output path.</p>



<p>So what is this <code>beam_size</code> parameter? Well, it&#8217;s one of a number of possible parameters we can pass into the <code>transcribe</code> method. The <code>beam_size</code> parameter is the number of beams to use in the beam search, which basically means the number of different candidate transcriptions that are explored in parallel, from which the most likely one is chosen. A higher number generally gives a more accurate transcription, but it will be slower as well. The Whisper command-line tool defaults to <code>5</code>, and I&#8217;ve found that this is a good balance between speed and accuracy. When you call <code>transcribe</code> from Python and leave <code>beam_size</code> out, no beam search is used and Whisper falls back to plain greedy decoding, so we pass it in explicitly here, which also makes you aware of these parameters. Here are some of the other possible parameters:</p>



<ul class="wp-block-list">
<li><code>temperature</code> -&gt; The higher the temperature, the more likely it is that the model will choose a less likely token. You can think of it in a similar way as the <code>temperature</code> setting you get with ChatGPT calls. A temperature of <code>0</code> simply always picks the most likely prediction, and <code>0</code> is what we have been using so far.</li>



<li><code>beam_size</code> -&gt; The number of beams to use in the beam search. We just discussed this one above. It is only applicable when the temperature is set to <code>0</code>, and its default value on the command line is <code>5</code>.</li>



<li><code>best_of</code> -&gt; Selects multiple random samples, only for use with a nonzero temperature and will generate more diverse (and possibly wrong) samples.</li>



<li><code>task</code> -&gt; Either <code>transcribe</code> or <code>translate</code>. We&#8217;ve used this one before and it defaults to <code>transcribe</code>.</li>



<li><code>language</code> -&gt; The language spoken in the audio, which applies to both the <code>transcribe</code> and <code>translate</code> tasks. Defaults to <code>None</code>, in which case Whisper performs language detection first.</li>



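<li><code>device</code> -&gt; The device to use for inference. This one is set when loading the model (for example through <code>whisper.load_model</code> or the command-line tool) rather than on the <code>transcribe</code> call itself. It defaults to <code>cuda</code> if you have a CUDA-enabled GPU, otherwise it will default to <code>cpu</code>.</li>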



<li><code>verbose</code> -&gt; Whether to print out the progress and debug messages, defaults to <code>True</code>.</li>
</ul>



<p>And there are more. For general use, you&#8217;ll probably do fine with the defaults most of the time, but be aware that you can tweak these parameters to get better results if you need to.</p>
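


<p>As a rough sketch of how such options can be passed from Python, the snippet below collects a few combinations in dictionaries and unpacks them into the <code>transcribe</code> call. It assumes the <code>openai-whisper</code> package, where extra keyword arguments to <code>transcribe</code> are forwarded to the decoder. The dictionary names and the <code>transcribe_with</code> helper are hypothetical and not part of Whisper itself:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical option sets, named for illustration only
greedy_options = {"temperature": 0.0}                      # always pick the most likely token
beam_options = {"temperature": 0.0, "beam_size": 5}        # beam search needs temperature 0
sampling_options = {"temperature": 0.4, "best_of": 5}      # best_of needs a nonzero temperature
translate_options = {"task": "translate", "language": "ko"}  # Korean audio to English text


def transcribe_with(model, audio_path: str, options: dict) -> str:
    """Hypothetical helper: run model.transcribe with the given options and return the text."""
    result = model.transcribe(audio_path, **options)
    return result["text"]</pre>



<p>Inside <code>main</code> you could then write something like <code>transcribe_with(MODEL, mp3_file, beam_options)</code>. Keep in mind that <code>translate_options</code> only makes sense with a multilingual model, so not with <code>base.en</code>.</p>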



<p>Back to our code, let&#8217;s continue:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def main(input_video: str) -> str:
    ...previous code...

    vtt_string_path = command.format_ffmpeg_filepath(vtt_subs)
    output_video_path = OUTPUT_VIDEO_DIR / f"{unique_project_name}_subs.mp4"
    embed_subs_into_vid_command = f'ffmpeg -i "{input_video}" -vf "subtitles=\'{vtt_string_path}\'" "{output_video_path}"'

    command.run_and_log(embed_subs_into_vid_command, log_directory=BASE_DIR)

    return str(output_video_path)</pre>



<p>We need to run another <code>ffmpeg</code> system command to embed the subtitles we have created into our video file. We first get the <code>vtt_string_path</code> by passing the <code>vtt_subs</code> path we already have into <code>format_ffmpeg_filepath</code>, that crazy function with all the escaped backslashes, remember? After that, we save our desired output video path in a variable by just combining our <code>OUTPUT_VIDEO_DIR</code> with the <code>unique_project_name</code> and pasting <code>_subs.mp4</code> at the end for good measure.</p>



<p>Now we prepare the <code>ffmpeg</code> command we&#8217;re about to run in a separate variable for readability. We use the <code>input_video</code> as the input file (<code>-i</code>), and then use the <code>-vf</code> option to add a video filter. The video filter we use is <code>subtitles</code> and we pass in the <code>vtt_string_path</code> as the subtitle file. We then pass in the <code>output_video_path</code> as the output file.</p>



<p>Notice again that the whole command is inside single quotes <code>'</code>, inside of which we have path variables in double quotes <code>"</code> to avoid trouble if there are spaces in the filename. But we also have to pass in <code>"subtitles='{vtt_string_path}'"</code>, which requires yet another level of quotes. Going back to plain single quotes <code>'</code> here would end the Python string early, as we have already used a single quote to open it, so we have to escape them using the backslash <code>\'</code> instead.</p>
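


<p>A quick way to convince yourself that the escaping works is to build the string and look at it. In this sketch all three paths are made-up examples (in the real code the subtitle path comes out of <code>format_ffmpeg_filepath</code>), and <code>shlex.split</code> is again only used as a POSIX-style approximation of how a shell splits the command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import shlex

# Made-up example paths (assumptions for this demo only)
input_video = "my holiday video.mp4"
vtt_string_path = "output_temp_files/my_video.vtt"
output_video_path = "output_video/my_video_subs.mp4"

# Same quoting pattern as in main: \' puts a literal single quote in the string
embed_command = f'ffmpeg -i "{input_video}" -vf "subtitles=\'{vtt_string_path}\'" "{output_video_path}"'
print(embed_command)

args = shlex.split(embed_command)
print(args[4])  # the whole filter, inner single quotes included, is one argument</pre>



<p>The printed command contains no backslashes at all. They only exist in the Python source to get the single quotes into the string.</p>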



<p>Then we call the <code>run_and_log</code> function from our <code>command</code> module, passing in the command we just wrote, and the <code>BASE_DIR</code> as the log directory. We then return the <code>output_video_path</code> as a string, as gradio doesn&#8217;t want a Path object.</p>



<p>The whole <code>main</code> function now looks like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def main(input_video: str) -> str:
    """Takes a video file as string path and returns a video file with subtitles embedded as string path."""
    unique_project_name = get_unique_project_name(input_video)
    get_temp_output_path = lambda ext: OUTPUT_TEMP_DIR / f"{unique_project_name}{ext}"
    mp3_file = video.to_mp3(
        input_video,
        log_directory=BASE_DIR,
        output_path=get_temp_output_path(".mp3"),
    )

    whisper_output = MODEL.transcribe(mp3_file, beam_size=5)
    vtt_subs = subtitles.write_to_file(
        whisper_output,
        writer=VTT_WRITER,
        output_path=get_temp_output_path(".vtt"),
    )

    vtt_string_path = command.format_ffmpeg_filepath(vtt_subs)
    output_video_path = OUTPUT_VIDEO_DIR / f"{unique_project_name}_subs.mp4"
    embed_subs_into_vid_command = f'ffmpeg -i "{input_video}" -vf "subtitles=\'{vtt_string_path}\'" "{output_video_path}"'

    command.run_and_log(embed_subs_into_vid_command, log_directory=BASE_DIR)

    return str(output_video_path)</pre>



<h2 class="wp-block-heading">Building the interface</h2>



<p>Now all we need to do to run this is create another gradio interface. As you are already familiar with gradio, we&#8217;ll go through this one a bit more quickly. The principles are the same as last time. Below your main function, continue with:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">if __name__ == "__main__":
    block = gr.Blocks(
        css=str(STYLES_DIR / "subtitle_master.css"),
        theme=gr.themes.Soft(primary_hue=gr.themes.colors.emerald),
    )

    with block:
        with gr.Group():
            gr.HTML(
                f"""
                &lt;div class="header">
                &lt;img src="https://i.imgur.com/dxHMfCI.png" referrerpolicy="no-referrer" />
                &lt;/div>
                """
            )
            with gr.Row():
                input_video = gr.Video(
                    label="Input Video", sources=["upload"], mirror_webcam=False
                )
                output_video = gr.Video()
            with gr.Row():
                button_text = "🎞 Subtitle my video! 🎞"
                btn = gr.Button(value=button_text, elem_classes=["button-row"])

            btn.click(main, inputs=[input_video], outputs=[output_video])

    block.launch(debug=True)</pre>



<p>We use the <code>if __name__ == "__main__":</code> guard to make sure that the code inside only runs when we run the file directly. We create the gradio <code>block</code> object just like we did before, passing in a <code>css</code> file that doesn&#8217;t exist yet, but this time we also pass in a <code>theme</code>. I&#8217;ll pass in the <code>gr.themes.Soft()</code> which has a bit of a different style to it, and set the accent color to emerald by passing in <code>primary_hue=gr.themes.colors.emerald</code> when calling <code>Soft()</code>. This will match nicely with the logo I have prepared for you with this application.</p>



<p>Then we open the <code>block</code> object using the with statement, and open up a new <code>Group</code> inside of it, just like we did before, so we can build our block interface. The HTML object is the same as in the last part, except I changed the image link URL to give you a new logo for this app. Then we open up a new <code>Row</code> and add a <code>Video</code> object for the input video, passing in <code>sources=["upload"]</code> so that the user can only upload a video file and the webcam is not offered as an input. We also set <code>mirror_webcam=False</code>, which only controls whether webcam footage is mirrored and so has no effect with upload as the only source. Still on the same <code>Row</code>, so next to the input video, we declare another <code>Video</code> object for the output video file.</p>



<p>We then have a row that only has a button for which we provide a text and a class of <code>button-row</code> so we can target it with CSS. The <code>btn.click</code> declaration is a lot simpler this time as we just call the <code>main</code> function with only a single input of <code>input_video</code> and only one output of <code>output_video</code>. Finally, we call <code>.launch</code> on the block just like last time.</p>



<p>That&#8217;s our code done! You&#8217;re probably dying to run it, but wait! We have to create a quick CSS file to finish it off. Create a new file named <code>subtitle_master.css</code> inside the <code>styles</code> folder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />subtitle_master.css   (<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2728.png" alt="✨" class="wp-smiley" style="height: 1em; max-height: 1em;" />new file)
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />whisper_pods.css
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />__init__.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />podcast.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />subtitles.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />command.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />video.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />2_whisper_pods.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />3_subtitle_master.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>Inside we&#8217;ll just write some quick CSS styles:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">.header {
  padding: 2em 8em;
}

.header,
.button-row {
  background-color: #1d366f7e;
}</pre>



<p>We just gave the <code>header</code> class some padding to stop the logo image from being too large and then gave both the <code>header</code> and <code>button-row</code> classes a background color of <code>#1d366f7e</code> which is a nice dark blue half-transparent color. Save and close the file, and we&#8217;re ready to run! Go ahead and run the <code>3_subtitle_master.py</code> file, and give it some time to load. Click the link in your terminal window again to open the interface in your browser, and you should see something like this:</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/01/3_subtitle_master-1024x543.png" alt="" class="wp-image-4061"/></figure>



<p>Yours won&#8217;t have Korean in the input video box, but whatever language your computer is set to. Go ahead and upload a video file, wait a second for it to load, and then press the <code>Subtitle my video!</code> button. This may take quite a while if you&#8217;re not on a fast system with a powerful GPU, but you&#8217;ll see the commands and steps being executed in your terminal window just like we set up. Eventually, you&#8217;ll see the output video appear with the subtitles embedded, each one in time with the video, and you can play it back and download it!</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/01/3_subtitle_output-1024x609.png" alt="" class="wp-image-4060"/></figure>



<p>You can check the <code>commands_log.txt</code> file in the root directory to see all the commands that were run, the <code>output_temp_files</code> folder to see the temporary files that were created during the process, and the <code>output_video</code> folder to see the final output video file. If you need some extra quality, switch to a larger model like <code>small.en</code> or <code>medium.en</code>.</p>



<h2 class="wp-block-heading">Conclusion</h2>



<p>That&#8217;s pretty awesome! An automatic subtitler that will subtitle any video for you all on its own. You could build on this by accepting YouTube links, or by adding translation functionality so you can have English subtitles on foreign-language videos, which could be cool for language learning. Just make sure you don&#8217;t use one of the <code>.en</code> models if you want to work with other languages, as those are English-only.</p>



<p>To make a real production-grade application, you would use a front-end framework and either show some kind of progress indicator or stream the live transcription to the page to stop the user from getting bored, or allow them to do something else while the file processes in the background. A production app would also have to run on a server with good processing power and a GPU.</p>



<p>That&#8217;s it for part 3. I&#8217;ll see you soon in part 4, where we&#8217;ll look at ways to speed up Whisper or outsource the processing using the OpenAI API endpoint in the cloud. We&#8217;ll also build one more app using the cloud API to round off the series. See you there soon!</p>



<h2 class="wp-block-heading">Full Course: OpenAI Whisper &#8211; Building Cutting-Edge Python Apps with OpenAI Whisper</h2>



<p>Check out our full OpenAI Whisper course with video lessons, easy explanations, GitHub, and a downloadable PDF certificate to prove your speech processing skills to your employer and freelancing clients:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://academy.finxter.com/university/openai-whisper/"><img loading="lazy" decoding="async" width="908" height="257" src="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png" alt="" class="wp-image-1654506" srcset="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png 908w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-300x85.png 300w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-768x217.png 768w" sizes="auto, (max-width: 908px) 100vw, 908px" /></a></figure>
</div>


<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> [<strong>Academy</strong>] <a href="https://academy.finxter.com/university/openai-whisper/" data-type="link" data-id="https://academy.finxter.com/university/openai-whisper/">Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</a></p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-example-building-a-subtitle-generator-embedder/">OpenAI Whisper Example &#8211; Building a Subtitle Generator &#038; Embedder</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI Whisper Speech-to-Text (Course Overview)</title>
		<link>https://blog.finxter.com/openai-whisper-speech-to-text/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Thu, 25 Jan 2024 19:56:32 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1654501</guid>

					<description><![CDATA[<p>Hi and welcome to this tutorial series on the OpenAI Whisper speech-to-text model. Whisper is a very powerful automatic speech recognition system and in this series, we&#8217;re going to learn all about it and create cool projects along the way. So I hope you&#8217;re excited to learn about Whisper and let&#8217;s get started!</p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-speech-to-text/">OpenAI Whisper Speech-to-Text (Course Overview)</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Hi and welcome to this tutorial series on the OpenAI Whisper speech-to-text model. Whisper is a very powerful automatic speech recognition system and in this series, we&#8217;re going to learn all about it and create cool projects along the way.</p>



<ul class="wp-block-list">
<li>In <strong><a href="https://blog.finxter.com/openai-whisper-python-installation-setup-first-steps-to-speech-to-text-synthesis/" data-type="link" data-id="https://blog.finxter.com/openai-whisper-python-installation-setup-first-steps-to-speech-to-text-synthesis/">part 1</a></strong> we&#8217;ll take a look at the basics of setting up and using the Whisper library to transcribe audio files on your local computer.</li>



<li>In <strong><a href="https://blog.finxter.com/openai-whisper-building-a-podcast-transcribing-app-in-python/" data-type="link" data-id="https://blog.finxter.com/openai-whisper-building-a-podcast-transcribing-app-in-python/">part 2</a></strong> we&#8217;ll create a podcast application with a user interface, where the user can input any Google Podcasts link and get a transcript and summary of the podcast, and even subtitle files for good measure.</li>



<li>In <strong><a href="https://blog.finxter.com/openai-whisper-example-building-a-subtitle-generator-embedder/" data-type="post" data-id="1654504">part 3</a></strong> we&#8217;ll look at dealing with transcribing video files by creating an application where the user inputs any video file and the output will be that same video file but with subtitles embedded in it.</li>



<li>Finally, in the <strong><a href="https://blog.finxter.com/openai-whisper-speeding-up-or-outsourcing-the-processing/" data-type="post" data-id="1654500">last part</a></strong>, we&#8217;ll take a look at alternatives, first looking at faster-whisper to speed things up, and then looking at using the Web-API version that runs in the cloud. We&#8217;ll create a final video-to-quiz application to show how the Web-API version works.</li>
</ul>



<p>So I hope you&#8217;re excited to learn about Whisper and let&#8217;s get started!</p>






<h2 class="wp-block-heading">Full Course: OpenAI Whisper &#8211; Building Cutting-Edge Python Apps with OpenAI Whisper</h2>



<p>Check out our full OpenAI Whisper course with video lessons, easy explanations, GitHub, and a downloadable PDF certificate to prove your speech processing skills to your employer and freelancing clients:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://academy.finxter.com/university/openai-whisper/"><img loading="lazy" decoding="async" width="908" height="257" src="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png" alt="" class="wp-image-1654506" srcset="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png 908w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-300x85.png 300w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-768x217.png 768w" sizes="auto, (max-width: 908px) 100vw, 908px" /></a></figure>
</div>


<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> [<strong>Academy</strong>] <a href="https://academy.finxter.com/university/openai-whisper/" data-type="link" data-id="https://academy.finxter.com/university/openai-whisper/">Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</a></p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-speech-to-text/">OpenAI Whisper Speech-to-Text (Course Overview)</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI Whisper &#8211; Building a Podcast Transcribing App in Python</title>
		<link>https://blog.finxter.com/openai-whisper-building-a-podcast-transcribing-app-in-python/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Thu, 25 Jan 2024 19:56:17 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Speech Recognition and Generation]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1654503</guid>

					<description><![CDATA[<p>🎙️ Course: This article is based on a lesson from our Finxter Academy Course Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper. Check it out for video lessons, GitHub, and a downloadable PDF course certificate with your name on it! Welcome back to part 2, where we&#8217;ll start practically applying our Whisper skills ... <a title="OpenAI Whisper &#8211; Building a Podcast Transcribing App in Python" class="read-more" href="https://blog.finxter.com/openai-whisper-building-a-podcast-transcribing-app-in-python/" aria-label="Read more about OpenAI Whisper &#8211; Building a Podcast Transcribing App in Python">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-building-a-podcast-transcribing-app-in-python/">OpenAI Whisper &#8211; Building a Podcast Transcribing App in Python</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f399.png" alt="🎙" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Course</strong>: This article is based on a lesson from our <strong>Finxter Academy Course</strong> <a href="https://academy.finxter.com/university/openai-whisper/"><em>Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</em></a>. Check it out for video lessons, GitHub, and a downloadable PDF course certificate with your name on it!</p>



<p>Welcome back to part 2, where we&#8217;ll start practically applying our Whisper skills to build useful stuff. We cannot rely on the user always having an MP3 file ready to transcribe; they may just want to link a podcast, for example. Here, we&#8217;ll be building a real application that can transcribe podcasts to text or subtitle format by taking just a podcast link as input.</p>



<p>Before we get started on the main code, we&#8217;ll do some basic setup work and create the helper functions we need to run in our main code. Keeping things separated across multiple functions and files will keep our code much cleaner and more readable than one big script that does everything at the same time.</p>



<h2 class="wp-block-heading">Saving our constants to a separate file</h2>



<p>First, there are a couple of settings we&#8217;ll be using again and again over the next three parts, namely the paths to the input and output folders for the mp3 files, subtitles, and whatever else we will be processing. Instead of importing <code>pathlib</code> in every single file and then writing <code>BASE_DIR = Path(__file__).parent</code> we&#8217;ll just write this in a separate file and import it everywhere we need it. This will also make it easier to change the paths later if we need to.</p>



<p>In your project folder create a new file called <code>settings.py</code>, making sure to put it in the root folder of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py</pre>



<p>In <code>settings.py</code>, write the following code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from pathlib import Path

BASE_DIR = Path(__file__).parent
OUTPUT_TEMP_DIR = BASE_DIR / "output_temp_files"
OUTPUT_VIDEO_DIR = BASE_DIR / "output_video"
STYLES_DIR = BASE_DIR / "styles"
TEST_AUDIO_DIR = BASE_DIR / "test_audio_files"</pre>



<p>We first get the root directory of the project using <code>Path(__file__).parent</code>, and then we create a few more paths relative to the root directory. We&#8217;ll use these paths in our main code to save the output files to the correct folders. Go ahead and also create empty folders for the <code>output_temp_files</code>, <code>output_video</code>, and <code>styles</code> folders, making sure to spell them correctly:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files     (new empty folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video          (new empty folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles                (new empty folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files      (already existing folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py</pre>
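
<p>If you would rather not create the folders by hand, they can also be created from Python. The following is only a minimal sketch, not part of the course code: the helper name <code>ensure_project_dirs</code> is hypothetical, and the demo runs against a temporary directory instead of your real project root.</p>

```python
import tempfile
from pathlib import Path

# Folder names taken from settings.py above.
FOLDER_NAMES = ["output_temp_files", "output_video", "styles", "test_audio_files"]


def ensure_project_dirs(base_dir: Path) -> list[Path]:
    """Create each project folder under base_dir if it does not exist yet."""
    created = []
    for name in FOLDER_NAMES:
        folder = base_dir / name
        # exist_ok=True makes the call safe to repeat on later runs.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Demo against a throwaway directory so nothing in your project is touched.
with tempfile.TemporaryDirectory() as tmp:
    dirs = ensure_project_dirs(Path(tmp))
    all_exist = all(d.is_dir() for d in dirs)
    names = [d.name for d in dirs]

print(names)
```

<p>In a real project you would pass <code>BASE_DIR</code> from <code>settings.py</code> instead of the temporary directory.</p>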



<p>That&#8217;s our folders and paths setup done. We can just import these variables to access the folders from any file in our project. There is one more setting we need to define, but this one should not be hardcoded in our source code. We need our API key for OpenAI, as we&#8217;ll be using ChatGPT in this part of the course. You&#8217;ll also need your API key for later parts. Go to https://platform.openai.com/api-keys and copy your API key. If you don&#8217;t have one, make sure to create one. You only pay for what you use, which will come to a few cents if you just play around with it casually. Then create a new file called <code>.env</code> in the root folder of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env                  (new file)</pre>



<p>And paste your API key in there like this, making sure not to use any spaces or quotes:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">OPENAI_API_KEY=your_api_key_here</pre>



<p>Then go ahead and save and close this file.</p>



<h2 class="wp-block-heading">Creating a utils folder for our helper functions</h2>



<p>Now let&#8217;s create a new folder named <code>utils</code> to hold our helper functions, and then inside this new folder create an empty file called <code>__init__.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils                 (new folder)
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />__init__.py       (new empty file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>The <code>__init__.py</code> file tells Python to treat the <code>utils</code> folder as a regular package, which will allow us to import the functions from within our other files. You don&#8217;t need to write anything in this file, just create it and leave it empty.</p>
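
<p>To see what importing from a package folder looks like, here is a small self-contained sketch. It builds a throwaway package in a temporary directory; the package name <code>demo_utils</code> and its stub function are made up for the demo and are not part of the course project.</p>

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create demo_utils/__init__.py and demo_utils/podcast.py under root."""
    package_dir = root / "demo_utils"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")  # empty file, as in the tutorial
    (package_dir / "podcast.py").write_text(
        "def get_summary(transcription):\n"
        "    return 'summary of ' + transcription\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)  # make the temporary project root importable
    importlib.invalidate_caches()
    # Same idea as writing "from demo_utils import podcast" in a script.
    podcast = importlib.import_module("demo_utils.podcast")
    result = podcast.get_summary("episode 1")
    sys.path.remove(tmp)

print(result)
```

<p>In our project the root folder is already on the import path when you run a script from it, so a plain <code>from utils import podcast</code> is all you need.</p>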



<p>Our first utils file will deal with the podcast-related functions, so create a file called <code>podcast.py</code> in the <code>utils</code> folder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_temp_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />output_video
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />styles
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />utils
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />__init__.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />podcast.py        (new file)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />settings.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>Inside <code>podcast.py</code> get started with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re
import uuid
from pathlib import Path

import requests
from decouple import config
from openai import OpenAI</pre>



<p>The <code>re</code> library deals with regular expressions and will help us find the podcast download page link amongst the page text. The <code>uuid</code> library lets us generate unique IDs, <code>pathlib</code> is familiar to us by now, and <code>requests</code> will help us download the podcast mp3 file. <code>decouple</code> will help us read our API key from the <code>.env</code> file, and <code>openai</code> will help us use the OpenAI API. If you have not used <code>decouple</code> before, make sure you run the install command in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install python-decouple</pre>



<p>Back in <code>podcast.py</code> let&#8217;s create a few constants that we&#8217;ll be using in our functions:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">GPT_MODEL = "gpt-3.5-turbo-1106"
CLIENT = OpenAI(api_key=str(config("OPENAI_API_KEY")))</pre>



<p>First, we set the ChatGPT model we&#8217;ll be using to request a podcast summary later on. Then we create a <code>CLIENT</code> object that we&#8217;ll use to make requests to the OpenAI API. We pass in our API key as a string, and we use <code>config</code> to read the API key from the <code>.env</code> file. Note that <code>config("OPENAI_API_KEY")</code> already returns a string value here. The <code>str()</code> call surrounding it is just there to make the type explicit; calling <code>str()</code> on a value that is already a string simply returns that same string.</p>



<h2 class="wp-block-heading">Scraping the podcast download link from the podcast page</h2>



<p>So what are some of the functions we&#8217;ll need in here? For this example application I will be using <code>Google Podcasts</code> as our podcast source. This means we will get an input link like this:<br>https://podcasts.google.com/feed/aHR0cDovL2ZlZWRzLmZlZWRidXJuZXIuY29tL1RFRF9BaGFfQnVzaW5lc3M/episode/ZW4udmlkZW8udGFsay50ZWQuY29tOjExMTk3MDo4MA?sa=X&amp;ved=0CAgQuIEEahcKEwiIzMnavduDAxUAAAAAHQAAAAAQAQ</p>



<p>If you load this page in your browser, you will see an HTML page, with a play button. This is the kind of page link the user will input into our app, so first of all we will need a function to extract the <code>.mp3</code> download link from this page&#8217;s HTML.</p>



<p>Let&#8217;s get started on a function to do exactly that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def scrape_link_from_page(page_url: str) -> str:
    podcast_page = requests.get(page_url).text
    regex = r"(?P&lt;url>\;https?://[^\s]+)"
    ...</pre>



<p>We start by defining our function which takes the <code>page_url</code> as a string and will return a string value as well. Then we use <code>requests</code> to get the HTML page text by sending a <code>GET</code> request to the URL, much like your internet browser would if you type a URL in the address bar. Now we define a regular expression that will match the pattern of the download link we want to extract. We&#8217;ll use this regex to find the download link in the HTML page text. Here&#8217;s how it works:</p>



<ul class="wp-block-list">
<li><code>(?P&lt;url&gt;...)</code> This is a named group. The matched text can be retrieved by the name <code>url</code>. So basically the text that matches the URL pattern will be stored under the group name <code>url</code>.</li>



<li><code>\;</code> This matches a literal semicolon character. The semicolon has no special meaning in regular expressions, so the backslash is not strictly needed here, but escaping it is harmless. We match the semicolon because there is one in front of the https that we want to match for the URL we need. (This is just a characteristic of this particular podcast page, other pages might have different patterns.)</li>



<li><code>https?</code> This matches either http or https. The s? means &#8220;match zero or one s characters&#8221;. This allows the regex to match both http and https.</li>



<li><code>://</code> This matches the string ://, which is part of the standard format for URLs.</li>



<li><code>[^\s]+</code> This matches one or more (<code>+</code>) of any character that is not (<code>^</code>) a whitespace (<code>\s</code>) character. So basically this will match any character that is not a space, tab, or newline character. This will match the rest of the URL we need and stop adding characters as soon as a space appears which indicates the end of the URL.</li>
</ul>



<p>So, in simple terms, this regular expression matches a semicolon followed by a URL that starts with either http or https, and continues until a whitespace character is encountered. The URL is captured in a group named url.</p>
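
<p>To make this concrete, here is a small sketch that runs the pattern against a made-up piece of text. The sample string and its URL are invented for illustration, and the group name is left out to keep the sketch short; the matching behavior is otherwise the same.</p>

```python
import re

# Same pattern as above, minus the group name, to keep the sketch short.
PATTERN = r"(\;https?://[^\s]+)"

# Made-up stand-in for the page text; the real page is much longer.
sample_text = "data x;https://example.com/audio/episode.mp3; more text"

# With a single group in the pattern, findall returns the text of that group
# for every match it finds.
matches = re.findall(PATTERN, sample_text)
print(matches)
```

<p>The match starts at the semicolon and keeps going until the first whitespace character, so the trailing semicolon is still part of the result.</p>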



<p>Now let&#8217;s complete our function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def scrape_link_from_page(page_url: str) -> str:
    podcast_page = requests.get(page_url).text
    regex = r"(?P&lt;url>\;https?://[^\s]+)"
    podcast_url_dirty = re.findall(regex, podcast_page)[0]
    podcast_url = podcast_url_dirty.split(";")[1]
    return podcast_url</pre>



<p>So after we declared the regex pattern, we use <code>re.findall</code> to find all matches of the pattern in the podcast page text. This will return a list of matches, and we take the first match with <code>[0]</code>. This will return a string that looks something like this:</p>



<p><code>;https://download.ted.com/talks/etcetcetc;</code></p>



<p>This is pretty close; we just need to get rid of the <code>;</code> characters before and after the URL. We do this by splitting the string on the <code>;</code> character. Because the string starts with a <code>;</code>, the first item in the resulting list is an empty string, so we take the second item with <code>[1]</code>. This returns the clean URL we need: https://download.ted.com/talks/etcetcetc</p>
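
<p>One thing to keep in mind is that indexing with <code>[0]</code> raises an <code>IndexError</code> when the page contains no matching link. The sketch below is one possible, more defensive variant and not the course code: the name <code>extract_first_url</code> is hypothetical, and the pattern is adjusted to stop at a semicolon so no <code>split</code> is needed.</p>

```python
import re

# Variant pattern: the semicolon sits outside the group, and the group stops
# at whitespace or at the next semicolon, so the captured text is already clean.
URL_PATTERN = re.compile(r";(https?://[^\s;]+)")


def extract_first_url(page_text: str) -> str:
    """Return the first semicolon-prefixed URL, or raise a clear error."""
    match = URL_PATTERN.search(page_text)
    if match is None:
        raise ValueError("No download link found in the page text.")
    return match.group(1)


print(extract_first_url("x;https://example.com/a.mp3; y"))
```

<p>Raising a <code>ValueError</code> with a readable message makes it easier to tell the user that the link they provided did not contain a downloadable file.</p>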



<h2 class="wp-block-heading">Downloading the podcast mp3 file</h2>



<p>Ok, so now our utils file has a function to scrape the download link. It stands to reason we&#8217;ll also need a function to download the mp3 file from the URL. Let&#8217;s get started on that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def download(podcast_url: str, unique_id: uuid.UUID, output_dir: Path) -> Path:
    print("Downloading podcast...")
    podcast_audio = requests.get(podcast_url)
    save_location = output_dir / f"{unique_id}.mp3"
    ...</pre>



<p>We define a function called <code>download</code> that takes 3 input arguments. The <code>podcast_url</code> is the URL we scraped from the podcast page as a string variable. The <code>unique_id</code> is a unique ID we&#8217;ll use to name the downloaded file, so we can avoid name clashes where files have the same name. This argument should be an instance of the <code>UUID</code> class from the <code>uuid</code> built-in Python library, which we&#8217;ll have a look at in a bit. The <code>output_dir</code> is the directory where we want to save the downloaded file as a <code>Path</code> object. Finally, our function will also return a <code>Path</code> object, which will be the path to the downloaded file.</p>



<p>We print a simple message to the console to show it is busy actually doing something, and then we use <code>requests</code> to download the podcast audio file by sending a <code>GET</code> request to the URL just like we did in the previous function. Then we create a <code>save_location</code> variable which is the path to the file we want to save. We use the <code>output_dir</code> argument as the parent directory, and then we use an f-string to create a filename that is the <code>unique_id</code> followed by the <code>.mp3</code> extension.</p>



<p>Now let&#8217;s complete our function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def download(podcast_url: str, unique_id: uuid.UUID, output_dir: Path) -> Path:
    print("Downloading podcast...")
    podcast_audio = requests.get(podcast_url)
    save_location = output_dir / f"{unique_id}.mp3"

    with open(save_location, "wb") as file:
        file.write(podcast_audio.content)
    print("Podcast successfully downloaded!")

    return save_location</pre>



<p>We use the <code>open</code> function to open the <code>save_location</code> file in write binary (<code>wb</code>) mode, and we write the <code>podcast_audio.content</code> to the file. This will save the podcast audio file to the <code>save_location</code> path. Then we print a message to the console to show the download was successful, and we return the <code>save_location</code> path which points to the mp3 file we just downloaded, awesome!</p>
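
<p>The file-naming and saving part can be tried out without any network access. This is only a sketch under stated assumptions: the helper name <code>save_audio</code> and the placeholder bytes are made up, and the file is written to a temporary directory.</p>

```python
import tempfile
import uuid
from pathlib import Path


def save_audio(content: bytes, unique_id: uuid.UUID, output_dir: Path) -> Path:
    """Write raw bytes to a file named after unique_id in output_dir."""
    save_location = output_dir / f"{unique_id}.mp3"
    with open(save_location, "wb") as file:  # "wb" = write binary
        file.write(content)
    return save_location


with tempfile.TemporaryDirectory() as tmp:
    fake_audio = b"not real audio, just placeholder bytes"
    # uuid4() generates a random ID, so two downloads never share a file name.
    saved = save_audio(fake_audio, uuid.uuid4(), Path(tmp))
    saved_suffix = saved.suffix
    stem_length = len(saved.stem)
    round_trip_ok = saved.read_bytes() == fake_audio

print(saved.name)
```

<p>In the real <code>download</code> function you may also want to call <code>podcast_audio.raise_for_status()</code> before writing, so that a failed request does not end up saved as a broken mp3 file.</p>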



<h2 class="wp-block-heading">Getting a summary</h2>



<p>Now there is one more function we need in our <code>utils/podcast</code> file. Besides just the transcription, we will also provide the user with a summary of the podcast. We&#8217;ll use ChatGPT to generate this summary, so we&#8217;ll need a simple function to do that. This one will be easy, so let&#8217;s just whip it up:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def get_summary(transcription: str) -> str:
    print("Summarizing podcast...")
    prompt = f"Summarize the following podcast into the most important points:\n\n{transcription}\n\nSummary:"

    response = CLIENT.chat.completions.create(
        model=GPT_MODEL, messages=[{"role": "user", "content": prompt}]
    )

    print("Podcast summarized!")
    summary = response.choices[0].message.content
    return summary if summary else "There was a problem generating the summary."</pre>



<p>I assume you&#8217;re familiar with ChatGPT (if not, check out my other courses on the Finxter Academy!). We just have a simple function that takes the full <code>transcription</code> as a string and will return a summary as a string. We have a console print message again just to keep ourselves posted that it is doing some work and then we have a simple ChatGPT prompt.</p>



<p>Note that the prompt ends with <code>Summary:</code> to nudge the model to start the summary right away without any awkward introduction text. This is just a neat little trick you can use. We then use our <code>CLIENT</code> object to call the <code>chat.completions.create</code> endpoint, passing in the <code>GPT_MODEL</code> and a list of messages. We&#8217;ll just pass in the prompt as a user message. We then extract the <code>summary</code> from the <code>response.choices[0].message.content</code>. Just in case there was a problem and the summary is empty, we return a default message to inform the user.</p>
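
<p>If you want to inspect what is sent to the model without spending any API credits, the prompt construction can be pulled out into its own function. This is a minimal sketch; the helper name <code>build_summary_messages</code> is hypothetical and no request is made.</p>

```python
def build_summary_messages(transcription: str) -> list[dict[str, str]]:
    """Build the messages list for the summary request, without sending it."""
    prompt = (
        "Summarize the following podcast into the most important points:"
        f"\n\n{transcription}\n\nSummary:"
    )
    # A single user message, the same shape passed to chat.completions.create.
    return [{"role": "user", "content": prompt}]


messages = build_summary_messages("Hello and welcome to the show.")
print(messages[0]["content"])
```

<p>Keeping the prompt in a separate function also makes it easy to tweak the wording later without touching the API call itself.</p>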



<h2 class="wp-block-heading">Subtitles</h2>



<p>Awesome! Our <code>podcast</code> utils are done now. Let&#8217;s move on to the <code>subtitles</code> utils. This one will be a much shorter file with a function that will allow us to output the transcription in subtitle format, with timestamps and everything. So go ahead and create a new file called <code>subtitles.py</code> in the <code>utils</code> folder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_WHISPER (project root folder)
    📁output_temp_files
    📁output_video
    📁styles
    📁test_audio_files
    📁utils
        📄__init__.py
        📄podcast.py
        📄subtitles.py      (new file)
    📄1_basic_call_english_only.py
    📄1_multiple_languages.py
    📄settings.py
    📄.env</pre>



<p>And inside <code>subtitles.py</code> get started with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from typing import Callable
from pathlib import Path</pre>



<p>Both of these imports will be used solely to indicate the type of our function arguments (type hinting). We&#8217;ll use <code>Callable</code> to indicate that a function is expected as an argument, and we&#8217;ll use <code>Path</code> to indicate that a <code>Path</code> object is expected as an argument. This just makes our code clearer to read and easier to understand. Now let&#8217;s write our function, whose purpose will be to take a transcription done by Whisper and then convert it to a valid subtitle file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def write_to_file(whisper_output: dict, writer: Callable, output_path: Path) -> Path:
    """Takes the whisper output, a writer function, and an output path, and writes subtitles to disk in the specified format."""
    with open(output_path, "w", encoding="utf-8") as sub_file:
        writer.write_result(result=whisper_output, file=sub_file)
        print(f"Subtitles generated and saved to {output_path}")

    return output_path</pre>



<p>We take a <code>whisper_output</code> argument, which is a dictionary containing the output Whisper gives us after we transcribe the podcast&#8217;s mp3 file. We also take a <code>writer</code> argument, which is one of Whisper&#8217;s subtitle writer objects. These objects are callable, which is why we type-hint it with <code>Callable</code>, but here we use the object&#8217;s <code>.write_result</code> method directly. Finally, we take an <code>output_path</code> argument, which is a <code>Path</code> object pointing to the file we want to save the subtitles to. We then simply open the output path in write mode, calling the file <code>sub_file</code>, and call <code>writer.write_result</code>, passing in the <code>whisper_output</code> and the open file to write the subtitles to. Finally, we print a message to the console to show the subtitles were generated successfully, and we return the <code>output_path</code>, which is the path to the subtitle file we just created.</p>



<p>Two important things to note here:</p>



<ul class="wp-block-list">
<li>When you open the subtitle file, make sure you use the <code>encoding="utf-8"</code> argument. For plain English text you might get away without it, so it is tempting to leave it out. However, the AI likes to use ♪ symbols when music starts playing to make the subtitles more interesting. If your system&#8217;s default encoding is not UTF-8 (as is common on Windows), writing such a character raises a <code>UnicodeEncodeError</code> and the program crashes, whereas UTF-8 can map and save these special characters.</li>



<li>You might be wondering what this magical <code>writer</code> is. Whisper actually comes with some utility classes that allow us to write subtitles in the correct format, like <code>SRT</code> or <code>VTT</code>. These utilities have a <code>.write_result</code> method, which is what we&#8217;re calling in our code above. So we&#8217;ll be able to pass in an SRT writer or a VTT writer depending on what subtitle type we want to save.</li>
</ul>
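<p>The two notes above can be tried out without installing Whisper. In the sketch below, <code>FakeWriter</code> is a hypothetical stand-in that only mimics the <code>.write_result</code> method (it is not one of Whisper&#8217;s real classes), and <code>fake_output</code> is invented sample data. The last part checks whether the ♪ symbol can be encoded in <code>cp1252</code>, a common default encoding on Windows:</p>

```python
import tempfile
from pathlib import Path
from typing import TextIO


class FakeWriter:
    """Hypothetical stand-in for a Whisper writer; only mimics .write_result."""

    def write_result(self, result: dict, file: TextIO) -> None:
        for segment in result["segments"]:
            file.write(segment["text"] + "\n")


def write_to_file(whisper_output: dict, writer, output_path: Path) -> Path:
    # Same shape as the function above, minus the console message.
    with open(output_path, "w", encoding="utf-8") as sub_file:
        writer.write_result(result=whisper_output, file=sub_file)
    return output_path


# Invented sample data with a music note, like Whisper sometimes produces.
fake_output = {"segments": [{"text": "♪ intro music ♪"}, {"text": "Welcome!"}]}

with tempfile.TemporaryDirectory() as tmp_dir:
    sub_path = write_to_file(fake_output, FakeWriter(), Path(tmp_dir) / "demo.txt")
    saved_text = sub_path.read_text(encoding="utf-8")

# The note symbol has no mapping in cp1252, so encoding it fails there.
try:
    "♪".encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

print(saved_text)
print("cp1252 can encode the note symbol:", cp1252_ok)
```

<p>Because <code>write_to_file</code> only relies on the writer having a <code>.write_result</code> method, any object with that method can be swapped in, which is also what makes the function easy to test.</p>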



<p>Ok, so that is all our utility functions done. Now let&#8217;s move on to the main code.</p>



<h2 class="wp-block-heading">Installing gradio</h2>



<p>Before we get started you&#8217;ll need to install <code>gradio</code>, so in your terminal window, run:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install gradio</pre>



<p>What is <code>gradio</code>? Gradio is a Python library that allows us to quickly create user-friendly interfaces for testing, demonstrating, and debugging machine learning models. We&#8217;ll use gradio to create a UI for our app with just a few lines of code, and it supports a wide range of input and output types like video, audio, and text. Using this super simple framework we can keep the focus on whisper and not on building a user interface. It&#8217;s pretty self-explanatory, so you&#8217;ll understand the idea as we just code along.</p>



<h2 class="wp-block-heading">Creating the main file</h2>



<p>Now let&#8217;s get started on our main code, where mostly we&#8217;ll just have to call our utility functions and tie it all together, plus create a quick gradio interface to make it user-friendly. Create a new file called <code>2_whisper_pods.py</code> in the root folder of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_WHISPER (project root folder)
    📁output_temp_files
    📁output_video
    📁styles
    📁test_audio_files
    📁utils
        📄__init__.py
        📄podcast.py
        📄subtitles.py
    📄1_basic_call_english_only.py
    📄1_multiple_languages.py
    📄2_whisper_pods.py   (new file)
    📄settings.py
    📄.env</pre>



<p>And inside <code>2_whisper_pods.py</code> get started with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import uuid
from pathlib import Path

import gradio as gr
import whisper
from whisper.utils import WriteSRT, WriteVTT

from settings import BASE_DIR, OUTPUT_TEMP_DIR, STYLES_DIR
from utils import podcast, subtitles</pre>



<p><code>uuid</code> is Python&#8217;s built-in library for generating unique IDs, <code>pathlib</code> is familiar to us by now, and <code>gradio</code> is the library we just installed. We also import <code>whisper</code> and two writer utilities from <code>whisper.utils</code>, which are the writer classes we talked about in the previous section. Then we import our directory <code>Path</code> constants from the <code>settings</code> and our <code>podcast</code> and <code>subtitles</code> utils. Now continue below the imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">WHISPER_MODEL = whisper.load_model("base")
VTT_WRITER = WriteVTT(output_dir=str(OUTPUT_TEMP_DIR))
SRT_WRITER = WriteSRT(output_dir=str(OUTPUT_TEMP_DIR))</pre>



<p>We load the <code>WHISPER_MODEL</code> from the <code>base</code> model, and we create two writer objects by creating instances of the <code>WriteVTT</code> and <code>WriteSRT</code> classes we imported from Whisper&#8217;s utilities, passing in the <code>output_dir</code> as a string.</p>



<p>Now let&#8217;s create a function to tie it all together:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def transcribe_and_summarize(page_link: str) -> tuple[str, str, str, str]:
    unique_id = uuid.uuid4()

    podcast_download_url = podcast.scrape_link_from_page(page_link)
    mp3_file: Path = podcast.download(podcast_download_url, unique_id, OUTPUT_TEMP_DIR)
    ...</pre>



<p>We define a function called <code>transcribe_and_summarize</code> which takes a <code>page_link</code> as a string and will return a tuple so we can have multiple outputs to this function. These four outputs will feed back into the gradio interface we will create later and will be:</p>



<ul class="wp-block-list">
<li>The podcast summary</li>



<li>The podcast transcription</li>



<li>The VTT subtitle file (path)</li>



<li>The SRT subtitle file (path)</li>
</ul>



<p>We then create a new <code>unique_id</code> which we&#8217;ll use to name the downloaded mp3 file. Note we do this inside the function as we need a unique identifier for every single transcription run to avoid name clashes. Then we use our <code>podcast.scrape_link_from_page</code> util to scrape the download link from the podcast page, and we use our <code>podcast.download</code> function to download the podcast mp3 file, passing in the <code>podcast_download_url</code>, <code>unique_id</code>, and the <code>OUTPUT_TEMP_DIR</code> as arguments. We then catch the mp3 file path in a variable called <code>mp3_file</code>. Notice how easy everything is to read because we used logical and descriptive names for all our variables and utility functions and files.</p>
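<p>To see why a fresh <code>uuid4</code> per run avoids name clashes, here is a minimal sketch. The <code>OUTPUT_TEMP_DIR</code> value and the <code>new_mp3_path</code> helper are assumptions for this illustration; the real <code>podcast.download</code> function may build its file name differently:</p>

```python
import uuid
from pathlib import Path

OUTPUT_TEMP_DIR = Path("output_temp_files")  # assumed value, for illustration


def new_mp3_path() -> Path:
    """Return a fresh, practically collision-free mp3 path for one run."""
    return OUTPUT_TEMP_DIR / f"{uuid.uuid4()}.mp3"


# Two runs get two different file names, so they never overwrite each other.
first_run = new_mp3_path()
second_run = new_mp3_path()
print(first_run)
print(second_run)
```

<p>If the ID were created once at module level instead, every transcription run would reuse the same file name and overwrite the previous download.</p>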



<p>Let&#8217;s continue with our function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def transcribe_and_summarize(page_link: str) -> tuple[str, str, str, str]:
    ...previous code...

    whisper_output = WHISPER_MODEL.transcribe(str(mp3_file))
    with open(BASE_DIR / "pods_log.txt", "w", encoding="utf-8") as f:
        f.write(str(whisper_output))

    transcription = str(whisper_output["text"])
    summary = podcast.get_summary(transcription)</pre>



<p>We call the <code>.transcribe</code> function by passing in the <code>mp3_file</code> path as a string. This will return a dictionary with the transcription and other information, which we catch in <code>whisper_output</code>. We then open a file called <code>pods_log.txt</code> in our root directory in write mode, and we write the <code>whisper_output</code> to the file. This is just for debugging purposes, so we can see what the output looks like (it&#8217;s too long to print to the console). We then extract the <code>transcription</code> from the <code>whisper_output</code> dictionary. Note that <code>whisper_output["text"]</code> is already a string. The reason we wrapped it inside a <code>str()</code> call is just to make it explicit that this is a string for typing purposes. This will not add any meaningful overhead, as values that are already a string just pass through the <code>str()</code> function unaltered. Then we call our <code>podcast.get_summary</code> function, passing in the <code>transcription</code> as an argument.</p>



<p>Now we just need to write the subtitles to disk and return all the outputs. Continue on:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def transcribe_and_summarize(page_link: str) -> tuple[str, str, str, str]:
    ...previous code...

    get_sub_path = lambda ext: OUTPUT_TEMP_DIR / f"{unique_id}{ext}"
    vtt_subs = subtitles.write_to_file(whisper_output, VTT_WRITER, get_sub_path(".vtt"))
    srt_subs = subtitles.write_to_file(whisper_output, SRT_WRITER, get_sub_path(".srt"))

    return (summary, transcription, str(vtt_subs), str(srt_subs))</pre>



<p>We create a lambda (anonymous) function that takes a file extension as input and then returns the path to the subtitle file with that extension. For example, inputting <code>.vtt</code> will yield <code>output_temp_files/unique_id.vtt</code>, but giving it <code>.srt</code> will yield <code>output_temp_files/unique_id.srt</code>, just so we can avoid repeating the same code twice. Then we call our <code>subtitles.write_to_file</code> function twice, each time passing in the <code>whisper_output</code>, one of the <code>VTT_WRITER</code> and <code>SRT_WRITER</code> writer objects, and the path we get from calling <code>get_sub_path</code> with the matching extension. We catch the output of these two calls in <code>vtt_subs</code> and <code>srt_subs</code> respectively. Finally, we return a tuple containing the <code>summary</code>, <code>transcription</code>, <code>vtt_subs</code>, and <code>srt_subs</code> to finish off our function.</p>
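<p>As a small sketch of what the lambda does, here it is next to an equivalent named function. The <code>OUTPUT_TEMP_DIR</code> value is an assumption for this illustration, and <code>get_sub_path_def</code> is a made-up name:</p>

```python
import uuid
from pathlib import Path

OUTPUT_TEMP_DIR = Path("output_temp_files")  # assumed value, for illustration
unique_id = uuid.uuid4()

# The lambda from the tutorial: extension in, full subtitle path out.
get_sub_path = lambda ext: OUTPUT_TEMP_DIR / f"{unique_id}{ext}"


def get_sub_path_def(ext: str) -> Path:
    """Equivalent named function with type hints."""
    return OUTPUT_TEMP_DIR / f"{unique_id}{ext}"


print(get_sub_path(".vtt"))
print(get_sub_path_def(".srt"))
```

<p>Both versions behave the same. PEP 8 recommends a <code>def</code> over assigning a lambda to a name, so some linters will flag the lambda form, but for a short helper inside a function either is readable.</p>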



<p>The whole thing now looks like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def transcribe_and_summarize(page_link: str) -> tuple[str, str, str, str]:
    unique_id = uuid.uuid4()

    podcast_download_url = podcast.scrape_link_from_page(page_link)
    mp3_file: Path = podcast.download(podcast_download_url, unique_id, OUTPUT_TEMP_DIR)

    whisper_output = WHISPER_MODEL.transcribe(str(mp3_file))
    with open(BASE_DIR / "pods_log.txt", "w", encoding="utf-8") as f:
        f.write(str(whisper_output))

    transcription = str(whisper_output["text"])
    summary = podcast.get_summary(transcription)

    get_sub_path = lambda ext: OUTPUT_TEMP_DIR / f"{unique_id}{ext}"
    vtt_subs = subtitles.write_to_file(whisper_output, VTT_WRITER, get_sub_path(".vtt"))
    srt_subs = subtitles.write_to_file(whisper_output, SRT_WRITER, get_sub_path(".srt"))

    return (summary, transcription, str(vtt_subs), str(srt_subs))</pre>



<h2 class="wp-block-heading">Creating the gradio interface</h2>



<p>That&#8217;s all well and good, but a typical end user does not know how to use Python, and this function is not very user-friendly. So let&#8217;s create a quick gradio interface to make it easy for the user to use our app. Continue below the function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">if __name__ == "__main__":
    block = gr.Blocks(css=str(STYLES_DIR / "whisper_pods.css"))

    with block:
        with gr.Group():
            # Header

            # Input textbox for podcast link

            # Button to start transcription

            # Output elements

            # btn.click definition

    block.launch(debug=True)</pre>



<p>This is going to be the basic structure of our <code>gradio</code> application. First, we use <code>if __name__ == "__main__":</code> to make sure the code inside this block only runs if we run this file directly, and not if we import it from another file. Then we create a <code>block</code> object by calling <code>gr.Blocks</code> and passing in the path to our <code>whisper_pods.css</code> file in the <code>styles</code> directory as a string. This will allow us to style our app with CSS, which we&#8217;ll do in a bit (this .css file doesn&#8217;t exist yet). Then we open a <code>with block:</code> block, and inside this block we open a <code>with gr.Group():</code> block. This will allow us to group elements together in our app. Then we have a bunch of comments to indicate what we&#8217;ll be doing in each block, which we&#8217;ll fill in in a moment. Finally, we call <code>block.launch</code> to launch our app, passing in <code>debug=True</code> so we get extra feedback in the console if anything goes wrong.</p>



<ul class="wp-block-list">
<li>The header will hold a logo image for our application. We&#8217;ll use HTML to load it from the internet. We can call <code>gr.HTML</code> to create an HTML element, and we can pass in the HTML code as a string. We&#8217;ll use a <code>div</code> element with a <code>header</code> class, and inside this <code>div</code> we&#8217;ll have an <code>img</code> element with a link to our logo image, which I just quickly uploaded to &#8220;imgur&#8221;. We&#8217;ll also set the <code>referrerpolicy</code> to <code>no-referrer</code> to avoid any issues with the image not loading (imgur doesn&#8217;t work with a <code>localhost</code> referrer, which is what you&#8217;ll have when you run this app locally).</li>
</ul>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gr.HTML(
    """
    &lt;div class="header">
    &lt;img src="https://i.imgur.com/8Xu2rwG.png" referrerpolicy="no-referrer" />
    &lt;/div>
    """
)</pre>



<ul class="wp-block-list">
<li>The input textbox will be where the user can paste in the podcast link. We can just call <code>gr.Textbox</code> to create a textbox element, and we can pass in a label to indicate what the textbox is for. We&#8217;ll call it &#8220;Google Podcasts Link&#8221; and we&#8217;ll catch the input in a variable called <code>podcast_link_input</code>.</li>
</ul>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">podcast_link_input = gr.Textbox(label="Google Podcasts Link:")</pre>



<ul class="wp-block-list">
<li>The button will be the trigger that starts the main function. I want a full row button so we&#8217;ll call <code>gr.Row</code> to create a row element, and then we&#8217;ll call <code>gr.Button</code> to create a button element. We can just pass in the button text we want to display and associate the button with the variable name <code>btn</code>. We&#8217;ll use this <code>btn</code> object later to define the button&#8217;s behavior.</li>
</ul>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">with gr.Row():
    btn = gr.Button("🎙 Transcribe and summarize my podcast! 🎙")</pre>



<ul class="wp-block-list">
<li>The output elements will be the summary, transcription, and two subtitle files. The first two are each a <code>gr.Textbox</code>, which does what you&#8217;d expect and allows us to pass in a label, placeholder, and the number of lines to display by default. The <code>autoscroll</code> behavior will scroll all the way down to the bottom if a large transcription text is passed into the textbox. Since we want the user to be able to start reading from the beginning instead of the end, we set this behavior to <code>False</code>. We then have another <code>gr.Row</code> with two <code>gr.File</code> elements, which will end up side-by-side in a single row. The <code>label</code> is just a label and the <code>elem_classes</code> is a list of classes gradio will give the element, so we can target it with CSS later on using the names <code>vtt-sub-file</code> and <code>srt-sub-file</code>.</li>
</ul>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">summary_output = gr.Textbox(
    label="Podcast Summary",
    placeholder="Podcast Summary",
    lines=4,
    autoscroll=False,
)

transcription_output = gr.Textbox(
    label="Podcast Transcription",
    placeholder="Podcast Transcription",
    lines=8,
    autoscroll=False,
)

with gr.Row():
    vtt_sub_output = gr.File(
        label="VTT Subtitle file download", elem_classes=["vtt-sub-file"]
    )
    srt_sub_output = gr.File(
        label="SRT Subtitle file download", elem_classes=["srt-sub-file"]
    )</pre>



<ul class="wp-block-list">
<li>The <code>btn.click</code> is where we define which function to call when the button is clicked, so we give it our <code>transcribe_and_summarize</code> function as the first argument. The second argument is a list of inputs, in this case only our <code>podcast_link_input</code>. The third argument is a list of outputs, in this case, our <code>summary_output</code>, <code>transcription_output</code>, <code>vtt_sub_output</code>, and <code>srt_sub_output</code>. We&#8217;ll use these outputs to display the results of our function to the user. We just told gradio what function to run, and how to map all of the input and output elements we defined in the interface to the input and output arguments of our function!</li>
</ul>
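<p>The pairing between the returned tuple and the <code>outputs</code> list is purely positional. The plain-Python sketch below illustrates that idea only; it does not use gradio and is not how gradio works internally. The <code>fake_transcribe_and_summarize</code> function and its return values are invented for the example:</p>

```python
def fake_transcribe_and_summarize(page_link: str) -> tuple[str, str, str, str]:
    """Stand-in with the same return shape as the real function."""
    return ("a summary", "a transcription", "subs.vtt", "subs.srt")


# Same order as the outputs=[...] list passed to btn.click.
output_names = [
    "summary_output",
    "transcription_output",
    "vtt_sub_output",
    "srt_sub_output",
]

returned_values = fake_transcribe_and_summarize("https://example.com/podcast")

# The first returned value goes to the first output element, and so on.
mapping = dict(zip(output_names, returned_values))
for name, value in mapping.items():
    print(f"{name} <- {value}")
```

<p>This is why the order of the tuple in the <code>return</code> statement has to match the order of the <code>outputs</code> list: swapping two entries in either place would send the summary to the wrong element.</p>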



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">btn.click(
    transcribe_and_summarize,
    inputs=[podcast_link_input],
    outputs=[
        summary_output,
        transcription_output,
        vtt_sub_output,
        srt_sub_output,
    ],
)</pre>



<p><code>2_whisper_pods.py</code> now looks like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">imports

CONSTANTS


def transcribe_and_summarize(...)...
    ...


if __name__ == "__main__":
    block = gr.Blocks(css=str(STYLES_DIR / "whisper_pods.css"))

    with block:
        with gr.Group():
            gr.HTML(
                """
                &lt;div class="header">
                &lt;img src="https://i.imgur.com/8Xu2rwG.png" referrerpolicy="no-referrer" />
                &lt;/div>
                """
            )

            podcast_link_input = gr.Textbox(label="Google Podcasts Link:")

            with gr.Row():
                btn = gr.Button("🎙 Transcribe and summarize my podcast! 🎙")

            summary_output = gr.Textbox(
                label="Podcast Summary",
                placeholder="Podcast Summary",
                lines=4,
                autoscroll=False,
            )

            transcription_output = gr.Textbox(
                label="Podcast Transcription",
                placeholder="Podcast Transcription",
                lines=8,
                autoscroll=False,
            )

            with gr.Row():
                vtt_sub_output = gr.File(
                    label="VTT Subtitle file download", elem_classes=["vtt-sub-file"]
                )
                srt_sub_output = gr.File(
                    label="SRT Subtitle file download", elem_classes=["srt-sub-file"]
                )

            btn.click(
                transcribe_and_summarize,
                inputs=[podcast_link_input],
                outputs=[
                    summary_output,
                    transcription_output,
                    vtt_sub_output,
                    srt_sub_output,
                ],
            )

    block.launch(debug=True)</pre>



<h2 class="wp-block-heading">Creating the CSS file</h2>



<p>See how easy it was to write an interface using gradio! There is just one thing left to do: the <code>STYLES_DIR / "whisper_pods.css"</code> file we loaded into gradio doesn&#8217;t actually exist yet! Go ahead and create a new file in the <code>styles</code> directory called <code>whisper_pods.css</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_WHISPER (project root folder)
    📁output_temp_files
    📁output_video
    📁styles
        📄whisper_pods.css  (new file)
    📁test_audio_files
    📁utils
        📄__init__.py
        📄podcast.py
        📄subtitles.py
    📄1_basic_call_english_only.py
    📄1_multiple_languages.py
    📄2_whisper_pods.py
    📄settings.py
    📄.env</pre>



<p>Inside <code>whisper_pods.css</code> paste the following code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">.header {
  padding: 2em 8em;
}

.vtt-sub-file,
.srt-sub-file {
  height: 80px;
}</pre>



<p>We set some padding on the header image by targeting the <code>header</code> class, to stop the image from getting too big. Then we set the height of the subtitle file download boxes to 80px, giving them a fixed height that keeps them nice and visible.</p>



<p>Now go back to your <code>2_whisper_pods.py</code> file and run it. Give it some time to load up and you&#8217;ll see the following in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.</pre>



<p>CTRL + click the link to open it in your browser. You should see the following:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/01/2_gradio_interface-1024x877.png" alt="" class="wp-image-4056"/></figure>
</div>


<p>Go ahead and get a Google podcasts link to input. I&#8217;ll use a short podcast just for the initial test:<br>https://podcasts.google.com/feed/aHR0cDovL2ZlZWRzLmZlZWRidXJuZXIuY29tL1RFRF9BaGFfQnVzaW5lc3M/episode/ZW4udmlkZW8udGFsay50ZWQuY29tOjEwNzMyNDo4MA?sa=X&amp;ved=0CAgQuIEEahcKEwiImYLqr8qDAxUAAAAAHQAAAAAQAQ</p>



<p>And then click the button and wait (I&#8217;ve blurred out the transcription to respect the speaker&#8217;s copyright as this course will be published publicly):</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img decoding="async" src="https://academy.finxter.com/wp-content/uploads/2024/01/2_gradio_output-974x1024.png" alt="" class="wp-image-4055"/></figure>
</div>


<p>Check the summary, transcription, and subtitle files. Try other podcasts from https://podcasts.google.com/. Play around and have fun! My transcription was very good using just the <code>base</code> Whisper model we loaded up, and I never even needed a bigger one! If you use non-English languages you may need a bigger model though. You can also use a <code>.en</code> model like <code>base.en</code> or <code>small.en</code> to get higher accuracy if you will only input English podcasts.</p>



<p>Also take a look at the <code>pods_log.txt</code> file you wrote in the root directory of your project, which holds the full whisper output. It may help you pinpoint where the problems are and how confident the model is while transcribing.</p>



<h2 class="wp-block-heading">Conclusion</h2>



<p>There we go, that is a pretty good initial minimum viable product! Of course, it has much room for improvement, for instance by using a proper front-end framework like React and streaming the transcription live to the page so the user is not left waiting so long before seeing results.</p>



<p>You could also use asyncio to make the ChatGPT summary call asynchronous, which would speed up the code slightly by writing the subtitle files to disk while the summary call is still running. And of course, you&#8217;d want some kind of cleanup function to get rid of all the downloaded mp3 files hanging around in your <code>output_temp_files</code> folder. If you check it, you will see all the files we generated with names like <code>0e0f5d05-9379-4124-a84d-81de7eb3e314.mp3</code>, plus the subtitle files with the same name for each mp3 file.</p>
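
<p>As a rough sketch of the asyncio idea, the snippet below runs a summary call and a blocking file write at the same time. Note that <code>fetch_summary</code> and <code>write_subtitle_file</code> are hypothetical stand-ins invented for this illustration (the summary call is simulated with a short sleep), and <code>asyncio.to_thread</code> assumes Python 3.9 or newer:</p>

```python
import asyncio
import tempfile
from pathlib import Path


async def fetch_summary(transcript: str):
    """Hypothetical stand-in for an asynchronous ChatGPT summary call."""
    await asyncio.sleep(0.1)  # simulates waiting on the network
    return f"Summary of {len(transcript.split())} words"


def write_subtitle_file(path: Path, content: str):
    """Blocking disk write, standing in for the subtitle writer."""
    path.write_text(content, encoding="utf-8")
    return path


async def summarize_and_write(transcript: str, subtitle_path: Path):
    # Start both jobs together: the summary waits on the network
    # while the blocking file write runs in a worker thread.
    summary, written = await asyncio.gather(
        fetch_summary(transcript),
        asyncio.to_thread(write_subtitle_file, subtitle_path, transcript),
    )
    return summary, written


def run_demo():
    # Use a temporary folder so the demo leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "example.srt"
        summary, written = asyncio.run(
            summarize_and_write("hello whisper world", target)
        )
        return summary, written.name, written.exists()


print(run_demo())
```

<p>In a real app the time saved is limited to whichever of the two jobs is shorter, so treat this as a small optimization and not a big speedup.</p>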



<p>I&#8217;ll leave the rest up to your imagination! That&#8217;s it for part 2. I&#8217;ll see you soon in part 3, where we&#8217;ll be using Whisper to create a fully automatic video subtitling tool that takes only a video file as input, then transcribes the audio, creates subtitles, and embeds them into the video at the correct times! It will be fun, see you there!</p>



<h2 class="wp-block-heading">Full Course: OpenAI Whisper &#8211; Building Cutting-Edge Python Apps with OpenAI Whisper</h2>



<p>Check out our full OpenAI Whisper course with video lessons, easy explanations, GitHub, and a downloadable PDF certificate to prove your speech processing skills to your employer and freelancing clients:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://academy.finxter.com/university/openai-whisper/"><img loading="lazy" decoding="async" width="908" height="257" src="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png" alt="" class="wp-image-1654506" srcset="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png 908w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-300x85.png 300w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-768x217.png 768w" sizes="auto, (max-width: 908px) 100vw, 908px" /></a></figure>
</div>


<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> [<strong>Academy</strong>] <a href="https://academy.finxter.com/university/openai-whisper/" data-type="link" data-id="https://academy.finxter.com/university/openai-whisper/">Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</a></p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-building-a-podcast-transcribing-app-in-python/">OpenAI Whisper &#8211; Building a Podcast Transcribing App in Python</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI Whisper &#8211; Python Installation, Setup, &#038; First Steps to Speech-to-Text Synthesis</title>
		<link>https://blog.finxter.com/openai-whisper-python-installation-setup-first-steps-to-speech-to-text-synthesis/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Thu, 25 Jan 2024 19:55:30 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Speech Recognition and Generation]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1654502</guid>

					<description><![CDATA[<p>🎙️ Course: This article is based on a lesson from our Finxter Academy Course Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper. Check it out for video lessons, GitHub, and a downloadable PDF course certificate with your name on it! Welcome to this first part of the Whisper course. My name is Dirk ... <a title="OpenAI Whisper &#8211; Python Installation, Setup, &#038; First Steps to Speech-to-Text Synthesis" class="read-more" href="https://blog.finxter.com/openai-whisper-python-installation-setup-first-steps-to-speech-to-text-synthesis/" aria-label="Read more about OpenAI Whisper &#8211; Python Installation, Setup, &#038; First Steps to Speech-to-Text Synthesis">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/openai-whisper-python-installation-setup-first-steps-to-speech-to-text-synthesis/">OpenAI Whisper &#8211; Python Installation, Setup, &#038; First Steps to Speech-to-Text Synthesis</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f399.png" alt="🎙" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Course</strong>: This article is based on a lesson from our <strong>Finxter Academy Course</strong> <a href="https://academy.finxter.com/university/openai-whisper/"><em>Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</em></a>. Check it out for video lessons, GitHub, and a downloadable PDF course certificate with your name on it!</p>



<p>Welcome to this first part of the Whisper course. My name is Dirk van Meerveld and it is my pleasure to be your host and guide for this tutorial series where we will be looking at OpenAI&#8217;s amazing speech-to-text model called Whisper.</p>



<p>We&#8217;ll first take a look at what it is and how its basic usage works, and then we&#8217;ll explore ways in which we can practically use it in our projects. Along the way, we&#8217;ll learn about the balance between model size and accuracy, and in the final part, we&#8217;ll look at alternative options to speed it up or outsource the processing to OpenAI&#8217;s servers.</p>



<p>The local installation process should not be too much of a problem but is a bit different for all operating systems and system setups. Unfortunately, I cannot cover every single possible system setup configuration, so you may have to do some googling and trial and error along the way.</p>



<p>This is an inevitable part of software development. Don&#8217;t give up and you will get it working eventually. We all get stuck trying to get something to work with our particular system sometimes; it&#8217;s just part of the job.</p>



<p>If you do not like a particular configuration, like running the model locally, rest assured that we will cover both the different ways to run Whisper and various implementation projects over the series. Just watch through the whole thing, then take whatever projects you like and combine them with whichever way of running Whisper you prefer.</p>



<h2 class="wp-block-heading">Installing Whisper</h2>



<p>First, we need to install Whisper. We&#8217;ll be using the pip package manager for this, which you should already have if you&#8217;re a Python user. In a terminal window run the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install -U openai-whisper</pre>



<p>The <code>-U</code> flag in the <code>pip install -U openai-whisper</code> command stands for <code>--upgrade</code>. It means that Whisper will either be installed or upgraded to the latest version if it is already installed.</p>



<p>The second thing we need to have installed is <code>ffmpeg</code>. What is <code>ffmpeg</code>? FFmpeg is a versatile multimedia framework that allows us to work with audio and video files. It supports a wide range of formats, and is highly portable, running on pretty much any operating system.</p>



<p>The simplest way to install <code>ffmpeg</code> is to use a package manager. If you&#8217;re on Windows, you can use <a href="https://chocolatey.org/install">Chocolatey</a> to install <code>ffmpeg</code> by running the following command in a terminal window:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># on Windows / Chocolatey
choco install ffmpeg</pre>



<p>If you&#8217;re on macOS using Homebrew, you can install <code>ffmpeg</code> by running the following command in a terminal window:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># on macOS / Homebrew
brew install ffmpeg</pre>



<p>If you&#8217;re on Linux, you probably know what to do already! On Debian- or Ubuntu-based distributions, for example, you can run <code>sudo apt update &amp;&amp; sudo apt install ffmpeg</code>.</p>
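
<p>Whichever route you took, a quick way to confirm that <code>ffmpeg</code> is reachable from Python is to look it up on your PATH. This is just a small sanity check using the standard library; the helper names <code>find_executable</code> and <code>ffmpeg_status</code> are made up for this example:</p>

```python
import shutil


def find_executable(name: str):
    """Return the full path of a command-line tool, or None if it is not on PATH."""
    return shutil.which(name)


def ffmpeg_status():
    """Describe whether ffmpeg can be found from this Python environment."""
    path = find_executable("ffmpeg")
    if path is None:
        return "ffmpeg was not found on your PATH"
    return f"ffmpeg found at {path}"


print(ffmpeg_status())
```

<p>If it reports that <code>ffmpeg</code> was not found right after installing, opening a fresh terminal window is often enough for the updated PATH to be picked up.</p>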



<p>This may be the most challenging part of the tutorial series, to be honest. You may not run into any issues if your system is already set up well, or you may need to do quite some googling and setup work to get everything up and running. It took me some messing around to get everything working properly on my system and it&#8217;s unfortunately impossible to know exactly what you will need to do to resolve any issues you may run into. Google is your friend! Remember we&#8217;ll also cover the API in part 4 if you don&#8217;t want to run the model locally, but don&#8217;t just skip ahead as you&#8217;ll miss out on a lot of useful information.</p>



<h2 class="wp-block-heading">What is Whisper?</h2>



<p>Whisper is a speech-to-text model developed by OpenAI. What is really cool is that they released this model to the public as open source. It is a neural network that takes audio as input and outputs text. It is trained on a large dataset of audio and text pairs and has learned the text that corresponds to the audio. What is exciting about the model is that it&#8217;s not just effective at transcribing high-quality &#8216;gold-standard&#8217; audio that has been recorded on studio microphones, but is also very good at transcribing audio of considerably lower quality, or even imperfect pronunciation with a foreign accent. If you compare it with auto-generated subtitles from YouTube, for example, you will see that it really is a level apart.</p>



<p>Instead of diving deep into the model&#8217;s architecture and technical details that make it work behind the scenes, this course will focus on the practical application of what we can do with it and how to use it to make cool stuff.</p>



<h2 class="wp-block-heading">Model sizes</h2>



<p>There are different sizes available for the Whisper model. The smaller the size of the model, the less processing power and VRAM it needs, and the faster it will run. This comes at the cost of lower accuracy. Conversely, the larger the model size, the more processing power and VRAM it needs, and the longer it will take to run, but the more accurate it will be and the better it will deal with foreign languages, noise, and poor audio quality.</p>



<figure class="wp-block-table"><table><thead><tr><th>Size</th><th>Parameters</th><th>English-only model</th><th>Multilingual model</th><th>Required VRAM</th><th>Relative Speed</th></tr></thead><tbody><tr><td>tiny</td><td>39M</td><td>tiny.en</td><td>tiny</td><td>~1GB</td><td>~32x</td></tr><tr><td>base</td><td>74M</td><td>base.en</td><td>base</td><td>~1GB</td><td>~16x</td></tr><tr><td>small</td><td>244M</td><td>small.en</td><td>small</td><td>~2GB</td><td>~6x</td></tr><tr><td>medium</td><td>769M</td><td>medium.en</td><td>medium</td><td>~5GB</td><td>~2x</td></tr><tr><td>large</td><td>1550M</td><td>N/A</td><td>large</td><td>~10GB</td><td>1x</td></tr></tbody></table></figure>



<p>As we can see in this table from the <a href="https://github.com/openai/whisper">Whisper GitHub</a>, we have 5 different model sizes in total. There are 4 sizes for the English-only model, namely <code>tiny.en</code>, <code>base.en</code>, <code>small.en</code>, and <code>medium.en</code>. These models are specialized in English only, so when you know you&#8217;re going to be transcribing English it is highly recommended to use one of them: they tend to give greater accuracy at a smaller model size and shorter run-time. According to the Whisper README, this advantage is largest for <code>tiny.en</code> and <code>base.en</code> and becomes less significant for <code>small.en</code> and <code>medium.en</code>, which helps explain why there is no <code>large.en</code> model.</p>



<p>For the multilingual models, we have the <code>tiny</code>, <code>base</code>, <code>small</code>, <code>medium</code>, and <code>large</code> sizes. Whisper is trained on a whopping 680,000 hours of audio data covering a total of 97 different languages, though the performance does vary per language as more obscure languages may not work quite as well. The larger the model size, the better it will deal with such languages, specific accents, and poor audio quality.</p>



<p>Now if you don&#8217;t have 10GB of VRAM, don&#8217;t worry, you can often get away with using the smaller-size models as you will see. Later on, in the last part of the series, we&#8217;ll look at smaller &#8216;distilled&#8217; versions of the model that can help us optimize speed further, or just outsourcing the processing to the lightning-fast OpenAI servers. Just keep watching! That being said, I actually recommend you always use the smallest version that you can get away with for your specific task. There is simply no point in adding more cost and complexity to your apps. If you don&#8217;t need it, the extra model size will only slow down your application and raise its cost.</p>
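
<p>To make the table a little more tangible, here is a small sketch that lists which model names fit a given VRAM budget, smallest first. The VRAM figures are the approximate values from the table above, and <code>models_that_fit</code> is a helper invented for this illustration, not part of the Whisper library:</p>

```python
# Approximate figures copied from the model size table.
MODEL_TABLE = [
    # (size, required VRAM in GB, has an English-only variant)
    ("tiny", 1, True),
    ("base", 1, True),
    ("small", 2, True),
    ("medium", 5, True),
    ("large", 10, False),
]


def models_that_fit(vram_gb: float, english_only: bool = False):
    """List model names, smallest first, whose VRAM need fits the budget."""
    names = []
    for size, needed_gb, has_en in MODEL_TABLE:
        if not vram_gb >= needed_gb:
            continue  # this model needs more VRAM than we have
        if english_only and not has_en:
            continue  # there is no .en variant of this size
        names.append(f"{size}.en" if english_only else size)
    return names


print(models_that_fit(4, english_only=True))
print(models_that_fit(10))
```

<p>The list only tells you what your hardware can run. Which entry is accurate enough for your audio is something you find out by starting at the small end and working your way up.</p>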



<h2 class="wp-block-heading">Basic usage</h2>



<p>Now that we have Whisper, fire up your favorite code editor, and let&#8217;s get started! I&#8217;ll be using VSCode, but you can use whatever IDE you like. Create a root folder for your project, I&#8217;ll call mine <code>FINX_WHISPER</code>, and then inside make a new file called <code>1_basic_call_english_only.py</code>. (I&#8217;m using numbers for the file names so you can easily reference them later when you are busy coding some cool new project, but this is obviously not a good general naming convention):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py</pre>



<p>Then open up the new Python file and start with the imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import whisper
from pathlib import Path</pre>



<p>The <code>whisper</code> import is obvious, and <code>pathlib</code> will help us get the path to the audio files we want to transcribe. This way our Python file will be able to locate our audio files even if the terminal window is not currently in the same directory as the Python file. Now let&#8217;s declare some constants:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">MODEL = whisper.load_model("base.en")
AUDIO_DIR = Path(__file__).parent / "test_audio_files"</pre>



<p>First, we declare <code>MODEL</code> and load the <code>base.en</code> model. We start with the second-smallest English-only model and will scale up if and when we need to. Then we declare <code>AUDIO_DIR</code> and use <code>pathlib</code> to get the path. This works by first getting the path to the current file (<code>1_basic_call_english_only.py</code>), using <code>__file__</code>, and then getting the parent directory of that file, using <code>.parent</code>. Then we add the <code>test_audio_files</code> folder to the path using the <code>/</code> operator. This way we can easily access the audio files in the <code>test_audio_files</code> folder from our Python file.</p>
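
<p>If the <code>pathlib</code> steps feel abstract, this standalone sketch walks through them with a made-up script location. The <code>/home/me/...</code> path is purely hypothetical, and <code>PurePosixPath</code> is used only so the example prints the same thing on every operating system:</p>

```python
from pathlib import PurePosixPath

# Hypothetical location of the script, used only for illustration.
script = PurePosixPath("/home/me/FINX_WHISPER/1_basic_call_english_only.py")

project_root = script.parent                   # the folder containing the script
audio_dir = project_root / "test_audio_files"  # '/' joins path segments
audio_file = audio_dir / "high_quality.mp3"    # example file name

print(audio_dir)
print(str(audio_file))  # str() turns the path object into a plain string
```

<p>In the real file, <code>Path(__file__)</code> takes the place of the hard-coded script path, so the result follows the script wherever you move the project.</p>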



<p>Now let&#8217;s create the <code>test_audio_files</code> folder, as it doesn&#8217;t exist yet. Make sure you spell it correctly:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py</pre>



<p>Then go ahead and add the audio files provided into the folder. They should be provided together with this video tutorial, but if for any reason you cannot find them, go to the Finxter GitHub repository for this course or you can find a copy at:</p>



<figure class="wp-block-embed"><div class="wp-block-embed__wrapper">
https://github.com/DirkMeer/finx_whisper
</div></figure>



<p>Download all the test files and put them in the folder (you can also add your own audio files if you want to, these are just provided for your convenience):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f50a.png" alt="🔊" class="wp-smiley" style="height: 1em; max-height: 1em;" />dutch_long_repeat_file.mp3
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f50a.png" alt="🔊" class="wp-smiley" style="height: 1em; max-height: 1em;" />dutch_the_netherlands.mp3
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f50a.png" alt="🔊" class="wp-smiley" style="height: 1em; max-height: 1em;" />high_quality.mp3
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f50a.png" alt="🔊" class="wp-smiley" style="height: 1em; max-height: 1em;" />low_quality.mp3
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f50a.png" alt="🔊" class="wp-smiley" style="height: 1em; max-height: 1em;" />terrible_quality.mp3
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py</pre>



<p>Ok, back to our <code>1_basic_call_english_only.py</code> file. Below the <code>MODEL</code> and <code>AUDIO_DIR</code> variables, let&#8217;s create a function that will transcribe the audio files for us:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def get_transcription(audio_file: str):
    result = MODEL.transcribe(audio_file)
    print(result)
    return result</pre>



<p>This function takes an audio file&#8217;s path in string format as input. We then call the <code>.transcribe()</code> method Whisper provides for us, and pass in the audio file&#8217;s path in string format. Then we simply print and return the result for a basic test. Looks really simple right?</p>



<p>First, let&#8217;s try and transcribe a high-quality English audio file, as a sort of best-case scenario:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">get_transcription(str(AUDIO_DIR / "high_quality.mp3"))</pre>



<p>Notice that the function we wrote above takes a path as a string variable. This is because Whisper requires the path to the audio file as a string. <code>AUDIO_DIR / "high_quality.mp3"</code> returns a <code>Path</code> object, so we use <code>str()</code> to convert it to a string, or else Whisper will crash.</p>



<h2 class="wp-block-heading">Getting a transcription</h2>



<p>So go ahead and save and run the file, and you will see a large object containing all the output. Let&#8217;s take a quick look at the information available to us here (read the comments for an explanation):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{
    # First we get the full transcription
    "text": " Hi guys, this is just a quick test audio file for you. Let's see how well it does and if my speech is recognized and converted to text properly. I'm really excited to see how well this works and I hope that it will be a good test for you guys to see how well the whisper model works.",
    # Now we have the list of segments
    "segments": [
        {
            "id": 0,
            "seek": 0,
            # Start and end times in seconds
            "start": 0.0,
            "end": 3.52,
            "text": " Hi guys, this is just a quick test audio file for you.",
            # List of token IDs for this segment, where each token (a word or a piece of a word) is represented by a number
            "tokens": [ 50363, 15902, 3730, 11, 428, 318, 655, 257, 2068, 1332, 6597, 2393, 329, 345, 13, 50539 ],
            "temperature": 0.0,
            # In the context of machine learning, temperature is a parameter that controls the randomness of predictions. A temperature of 0.0 means no randomness: the model always selects the tokens with the highest probability (this is similar to the ChatGPT API temperature setting). You can pass a temperature value to the transcribe function when calling it if you want to introduce more randomness into your generations.
            # For instance: model.transcribe(audio_file, temperature=0.2)
            "avg_logprob": -0.1399546700554925,
            # The average log probability of the tokens in the segment. The closer to 0 the better: as the number gets more negative, like -0.2 for instance, the model is less confident in its transcription (and there are probably more errors).
            "compression_ratio": 1.5898876404494382,
            "no_speech_prob": 0.0045762090012431145,
            # Represents the probability that the segment contains no speech. We can see that it is very low.
        },
        {
            '... more segments with the same structure as above, cut for brevity ...'
        },
    ],
    "language": "en",
}</pre>



<p>As we can see, we really get a lot of information back from the model! What is most interesting is of course the transcription itself. Notice that it is a perfect word-for-word transcription even though we used <code>base.en</code>, the second-smallest model available. Very impressive for such a small version of the model! Now let&#8217;s try a lower-quality audio file:</p>



<p>Replace the last call:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">get_transcription(str(AUDIO_DIR / "high_quality.mp3"))</pre>



<p>with:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">get_transcription(str(AUDIO_DIR / "low_quality.mp3"))</pre>



<p>And when we run this with the considerably lower quality audio file, still on the <code>base.en</code> model, I still get a perfect transcription. If we look closely at the output object, though, we can see that the <code>avg_logprob</code> (explained above) has moved further away from 0, from <code>-0.1399546700554925</code> to <code>-0.2179246875974867</code>, indicating the model is now less confident in its transcription (though still correct).</p>
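
<p>Log probabilities are not very intuitive to read, so it can help to convert them back into ordinary probabilities. The sketch below does that for the two <code>avg_logprob</code> values quoted above. The helper name <code>logprob_to_probability</code> is made up for this example, and the result should be read as a rough per-token confidence (strictly speaking, the geometric mean of the token probabilities), not as an official Whisper metric:</p>

```python
import math

# avg_logprob values quoted above for the first segment of each test file
AVG_LOGPROBS = {
    "high_quality.mp3": -0.1399546700554925,
    "low_quality.mp3": -0.2179246875974867,
}


def logprob_to_probability(avg_logprob: float):
    """exp() of a mean log probability gives the geometric mean of the token probabilities."""
    return math.exp(avg_logprob)


for name, value in AVG_LOGPROBS.items():
    print(f"{name}: {logprob_to_probability(value):.2f}")
```

<p>This works out to roughly 0.87 for the high-quality file and roughly 0.80 for the low-quality one, which matches the intuition that the model is somewhat less sure of itself on the noisier audio.</p>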



<p>Now let&#8217;s try a really poor-quality audio file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">get_transcription(str(AUDIO_DIR / "terrible_quality.mp3"))</pre>



<p>And if we run this we can see that it is still mostly correct, even though a human would have trouble understanding it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Hi guys. This is just a quick test audio file for you. Let's see how well it does and if my speech is recognized, thank you for the context properly. I'm really excited to see how well this works and I hope that it will be a quick test for you guys to see how well the whisper model works.</pre>



<p>We have clearly reached the limits of the base model here as part of this is incorrect, and it&#8217;s time to step up to a bigger model size. (Remember, you generally want to use the smallest model you can get away with for your use case!)</p>
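
<p>Spotting the differences by eye gets tedious with longer transcripts. As a minimal sketch, the standard library&#8217;s <code>difflib</code> can compare two transcripts word by word. The two strings are the transcriptions shown earlier in this article, while the helper names <code>words</code> and <code>word_differences</code> are invented for this example:</p>

```python
import difflib
import re

# The correct transcription of the test recording.
REFERENCE = (
    "Hi guys, this is just a quick test audio file for you. Let's see how well "
    "it does and if my speech is recognized and converted to text properly. "
    "I'm really excited to see how well this works and I hope that it will be "
    "a good test for you guys to see how well the whisper model works."
)

# What base.en produced for terrible_quality.mp3.
BASE_EN_OUTPUT = (
    "Hi guys. This is just a quick test audio file for you. Let's see how well "
    "it does and if my speech is recognized, thank you for the context properly. "
    "I'm really excited to see how well this works and I hope that it will be "
    "a quick test for you guys to see how well the whisper model works."
)


def words(text: str):
    """Lowercase the text and keep only the words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def word_differences(reference: str, hypothesis: str):
    """Return the overall similarity and the word runs that differ."""
    ref, hyp = words(reference), words(hypothesis)
    matcher = difflib.SequenceMatcher(None, ref, hyp, autojunk=False)
    changes = [
        (" ".join(ref[i1:i2]), " ".join(hyp[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return matcher.ratio(), changes


ratio, changes = word_differences(REFERENCE, BASE_EN_OUTPUT)
print(f"similarity: {ratio:.2f}")
for expected, got in changes:
    print(f"expected '{expected}' but got '{got}'")
```

<p>Because punctuation and capitalization are ignored, this only flags the words that actually changed, which is usually what you care about when deciding whether a bigger model is needed.</p>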



<p>I&#8217;m going to change the model to <code>small.en</code> by editing the <code>MODEL</code> variable at the top of our file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">MODEL = whisper.load_model("small.en")</pre>



<p>Now if we run it again:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Hi guys, this is just a quick test audio file for you. Let's see how well it does, and if my speech is recognized and converted to text properly, I'm really excited to see how well this works, and I hope that it will be a good test for you guys to see how well the Whisper model works.</pre>



<p>There is an awkward super-long sentence with a few too many commas, but apart from that it&#8217;s perfect, even though the audio quality of this file is pretty terrible. Switching to <code>medium.en</code> fixes the last small imperfection with the multiple commas, by the way. This is the power of Whisper!</p>



<h2 class="wp-block-heading">Taking a deeper look</h2>



<p>Now let&#8217;s take a slightly deeper look at what is happening inside Whisper while looking at using other languages and even translation. Make a new file in your root folder called <code>1_multiple_languages.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_WHISPER (project root folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />test_audio_files
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_basic_call_english_only.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_multiple_languages.py</pre>



<p>Then open up the new <code>1_multiple_languages.py</code> file and start with the imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import whisper
from pathlib import Path

AUDIO_DIR = Path(__file__).parent / "test_audio_files"
model = whisper.load_model("base")</pre>



<p>Make sure to use the <code>base</code> model this time, and not the <code>base.en</code> model, as we want to use all available languages.</p>



<p>First, we&#8217;ll take a slightly lower-level look to get a rough idea of what is going on, as this will help us understand some important nuances. After that, we&#8217;ll greatly simplify the whole thing using the higher-level code again. Let&#8217;s write a function that will detect the language and transcribe a file for us, and we&#8217;ll explain it line by line.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def detect_language_and_transcribe(audio_file: str):
    audio = whisper.load_audio(audio_file)</pre>



<p>We define a function, which takes the path to an <code>audio_file</code> as a string argument. We then call Whisper&#8217;s <code>load_audio()</code> function and pass in the audio file&#8217;s path. This returns a NumPy array containing the audio waveform in float32 datatype, or in other words, an array containing the audio data as a giant list of numbers.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    audio = whisper.pad_or_trim(audio)</pre>



<p>Next, we get a 30-second sample, either padding with silence if the file is shorter than 30 seconds or trimming it if it is longer. This is because the Whisper model is built and trained to take 30 seconds of audio as its input data each time. This doesn&#8217;t mean you cannot transcribe longer files but does have some implications we&#8217;ll get back to later.</p>
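


<p>To make the pad-or-trim idea concrete, here is a minimal standalone sketch using a plain Python list instead of a NumPy array. It is not Whisper&#8217;s actual implementation and does not belong in our file. The name <code>pad_or_trim_sketch</code> is made up for illustration, and the sketch assumes the 16 kHz sample rate Whisper works with, so 30 seconds comes to 480,000 samples.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">SAMPLE_RATE = 16000                      # samples per second (assumed, matches Whisper's 16 kHz audio)
CHUNK_SECONDS = 30                       # the model's fixed input window
N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS  # 480000 samples


def pad_or_trim_sketch(samples, length=N_SAMPLES):
    """Illustrative stand-in for whisper.pad_or_trim, working on a plain list."""
    trimmed = samples[:length]                           # cut off anything past 30 seconds
    return trimmed + [0.0] * (length - len(trimmed))     # pad the rest with silence (zeros)


short_clip = [0.1] * (SAMPLE_RATE * 12)  # 12 seconds of fake audio
long_clip = [0.1] * (SAMPLE_RATE * 45)   # 45 seconds of fake audio
print(len(pad_or_trim_sketch(short_clip)))  # 480000
print(len(pad_or_trim_sketch(long_clip)))   # 480000</pre>



<p>Whatever the length of the input, the result is always exactly 30 seconds long, which is the only input size the model accepts.</p>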



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    mel = whisper.log_mel_spectrogram(audio).to(model.device)</pre>



<p>Next, we make a log-Mel spectrogram and move it to the same device as the model (e.g. your GPU). A log-Mel spectrogram is a representation of a sound or audio signal that has been transformed to highlight certain perceptual characteristics.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f468-200d-1f3eb.png" alt="👨‍🏫" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Spectrogram: A spectrogram is a visual representation of the spectrum of frequencies in a sound or other signal as they vary with time. It's essentially a heat map where the x-axis is time, the y-axis is frequency, and the color represents the loudness.

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f468-200d-1f3eb.png" alt="👨‍🏫" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Mel Scale: The Mel scale is a perceptual scale of pitches that emulates the human ear's response to different frequencies. We humans are much better at distinguishing small changes in pitch at low frequencies than at high frequencies. The Mel scale makes the representation match more closely with human perception as opposed to the exact mathematical frequencies.

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f468-200d-1f3eb.png" alt="👨‍🏫" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Logarithmic Scale: Taking the logarithm of the spectrogram values is another step to make the representation more closely match human perception. We perceive loudness on a logarithmic scale (which is why we use decibels, a logarithmic measurement, to express the loudness of sound).

<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f468-200d-1f3eb.png" alt="👨‍🏫" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Combining these, a log-Mel spectrogram is a representation of sound that is designed to highlight the aspects that are most important for human perception. It's commonly used in audio processing tasks, including speech and music recognition.</pre>
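


<p>If you want to see the Mel scale in numbers, the sketch below uses one widely used conversion formula, <code>mel = 2595 * log10(1 + f / 700)</code>. This is an assumption for illustration only: Whisper builds its own Mel filterbank, which may use a different variant of the scale. The function name <code>hz_to_mel</code> is made up for this example.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import math


def hz_to_mel(frequency_hz):
    """Convert a frequency in Hz to mels using one common formula (assumed variant)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


# Compare the same 100 Hz step at a low and at a high frequency
for low, high in [(100, 200), (5000, 5100)]:
    step = hz_to_mel(high) - hz_to_mel(low)
    print(f"{low} Hz to {high} Hz spans {step:.1f} mel")</pre>



<p>The 100 Hz step near the bottom of the range covers several times more mels than the same step at 5,000 Hz, which mirrors how much easier the low-frequency difference is for us to hear.</p>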



<p>Now that we have this log-Mel spectrogram, we can use it to detect the language of our audio file. We do this by passing it to the <code>.detect_language()</code> method of our model:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    language_token, language_probs = model.detect_language(mel)</pre>



<p>This returns the <code>language_token</code>, which identifies the detected language but which we will not be using, and <code>language_probs</code>, a dictionary that maps each language code to the probability that it matches the audio. As we won&#8217;t be using the <code>language_token</code> variable, we can replace it with a <code>_</code>, the conventional name for a throwaway variable we don&#8217;t care about.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    _, language_probs = model.detect_language(mel)</pre>



<p>Let&#8217;s take what we have so far, add a print statement to check out the <code>language_probs</code>, and run it using the <code>dutch_the_netherlands.mp3</code> file I prepared for you:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def detect_language_and_transcribe(audio_file: str):
    audio = whisper.load_audio(audio_file)
    audio = whisper.pad_or_trim(audio)
    mel = whisper.log_mel_spectrogram(audio).to(model.device)
    _, language_probs = model.detect_language(mel)
    print(language_probs)

detect_language_and_transcribe(str(AUDIO_DIR / "dutch_the_netherlands.mp3"))</pre>



<p>Now when we run this we can see the massive <code>language_probs</code> dictionary printed to our console:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{
    '.. cut for brevity ..'
    "yi": 2.012418735830579e-05,
    "ka": 2.161949907986127e-07,
    "nl": 0.9650669693946838,
    "en": 0.010499916970729828,
    "ko": 9.358442184748128e-05,
    "mn": 5.96029394728248e-06,
    "de": 0.010318436659872532,
    '.. cut for brevity ..'
}</pre>



<p>As you can see, this is a long dictionary of probabilities. The higher the number, the more likely the language; many values are on the order of <code>1e-4</code>, <code>1e-5</code>, <code>1e-6</code>, or even lower. We can clearly see that <code>nl</code> (Dutch) has by far the highest probability, close to a perfect score of 1 with <code>0.965</code>. The second and third highest are <code>en</code> (English) and <code>de</code> (German) at roughly <code>0.010</code> each, which is not even close, so we can be very confident that this is Dutch. That is impressive for a model as small as <code>base</code> that handles so many languages, especially as Dutch is not a particularly large language.</p>



<p>Of course, we don&#8217;t want this whole dictionary; we just want to know the most probable language, so we can use the <code>max</code> function to get the language with the highest probability.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def detect_language_and_transcribe(audio_file: str):
    ...
    language: str = max(language_probs, key=language_probs.get)
    print(f"Detected language: {language}")</pre>



<p>Here <code>max</code> returns the key with the largest value in the dictionary. We pass in the dictionary as the first argument; iterating over a dictionary yields its keys. The <code>key</code> argument is a function that is called on each of those keys, and the key for which the function returns the largest value is the result of <code>max</code>. We can use the dictionary&#8217;s built-in <code>.get()</code> method as that function, since it returns the value stored for each key.</p>
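


<p>Here is the same trick on a tiny hand-copied subset of the probabilities we printed earlier, so you can try it without loading a model. The <code>top_three</code> line is an extra illustration that is not part of our function; it uses <code>sorted</code> with the same <code>key</code> argument to rank the languages.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># A small subset of the language_probs dictionary shown above
language_probs = {
    "nl": 0.9650669693946838,
    "en": 0.010499916970729828,
    "de": 0.010318436659872532,
    "ko": 9.358442184748128e-05,
}

# max walks over the keys and compares them by their looked-up values
best = max(language_probs, key=language_probs.get)

# sorted accepts the same key function, here used to rank from most to least likely
top_three = sorted(language_probs, key=language_probs.get, reverse=True)[:3]

print(best)       # nl
print(top_three)  # ['nl', 'en', 'de']</pre>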



<p>The language name codes are in ISO 639-1 format and can be found <a href="https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes">here</a>. We add a print statement to print the detected language. I removed the previous print statement <code>print(language_probs)</code> we added before.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def detect_language_and_transcribe(audio_file: str):
    ...
    language: str = max(language_probs, key=language_probs.get)
    print(f"Detected language: {language}")
    options = whisper.DecodingOptions(language=language, task="transcribe")
    result = whisper.decode(model, mel, options)
    print(result)
    return result.text</pre>



<p>Now we&#8217;ll decode this 30-second audio segment into text. First, we create a <code>DecodingOptions</code> object and save it in a variable named <code>options</code>. The <code>DecodingOptions</code> object lets you set more advanced decoding options, but we&#8217;ll stick to the basics for now, passing in the <code>language</code> we detected and the task of &#8220;transcribe&#8221;. We then call the <code>whisper.decode</code> function, which decodes 30-second audio segment(s) provided as log-Mel spectrogram(s). We pass in the model, the Mel spectrogram, and the options. This returns a <code>DecodingResult</code> object, which we save in a variable named <code>result</code>. We then print the <code>result</code> and return <code>result.text</code>.</p>



<p>The whole function now looks like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def detect_language_and_transcribe(audio_file: str):
    audio = whisper.load_audio(audio_file)
    audio = whisper.pad_or_trim(audio)
    mel = whisper.log_mel_spectrogram(audio).to(model.device)
    _, language_probs = model.detect_language(mel)
    language: str = max(language_probs, key=language_probs.get)
    print(f"Detected language: {language}")
    options = whisper.DecodingOptions(language=language, task="transcribe")
    result = whisper.decode(model, mel, options)
    print(result)
    return result.text</pre>



<p>Now let&#8217;s run it with the <code>dutch_the_netherlands.mp3</code> file again:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">dutch_test = detect_language_and_transcribe(
    str(AUDIO_DIR / "dutch_the_netherlands.mp3")
)</pre>



<p>When you run this the object printed to the console will have the following transcription:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">'Hoi, allemaal. Dit is weer een testbestandje. Deze keer om te testen of de Nederlandse taal goed herkend gaat worden. Hierna kunnen we ook proberen deze text te laten vertalen naar het Engels om te zien hoe goed dat gaat. Ik ben benieuwd.'</pre>



<p>There we go, a perfect transcription! Now you probably don&#8217;t speak Dutch, but the above is a perfect word-for-word transcription of the spoken text.</p>



<h2 class="wp-block-heading">Back to .transcribe</h2>



<p>Now I&#8217;ll be honest, that was a little overcomplicated if we don&#8217;t need much customization and just want to call the model. Also, we don&#8217;t want to limit ourselves to just 30 seconds of audio. Let&#8217;s go back to Whisper&#8217;s higher-level <code>.transcribe</code> method, which basically does all of the above for us.</p>



<p>Make sure you comment out the <code>dutch_test</code> code so it doesn&#8217;t keep running:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># dutch_test = detect_language_and_transcribe(
#     str(AUDIO_DIR / "dutch_the_netherlands.mp3")
# )</pre>



<p>Now all we need to do to use <code>.transcribe</code> is load a model (<code>model = whisper.load_model("base")</code>) which we already did in this file, and then call the <code>.transcribe</code> method on the model and pass in the path to the audio file as a string:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">result = model.transcribe(str(AUDIO_DIR / "dutch_the_netherlands.mp3"), verbose=True)
print(result["text"])</pre>



<p>It also has some options. In this case, we&#8217;ve set <code>verbose</code> to <code>True</code> so it will give us extra information in the console. If you go ahead and run this, you will get the exact same transcription in the output as we did above:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">'Hoi, allemaal. Dit is weer een testbestandje. Deze keer om te testen of de Nederlandse taal goed herkend gaat worden. Hierna kunnen we ook proberen deze text te laten vertalen naar het Engels om te zien hoe goed dat gaat. Ik ben benieuwd.'</pre>



<p>Again, you probably don&#8217;t speak Dutch, but that&#8217;s not the point. Under the hood, the <code>.transcribe</code> method reads the entire audio file and basically processes it in 30-second windows. You can also see that it did the language detection part for us automatically before starting:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Detected language: Dutch</pre>



<h2 class="wp-block-heading">Working with longer files</h2>



<p>So that&#8217;s pretty good, right? Well, let&#8217;s try a longer audio file and see what happens. I&#8217;ve provided <code>dutch_long_repeat_file.mp3</code> which is just the same audio file but it repeats 3 times, totaling just over 40 seconds. Let&#8217;s see what happens when we try to transcribe this file (make sure you comment out the run above):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># result = model.transcribe(str(AUDIO_DIR / "dutch_the_netherlands.mp3"), verbose=True)
# print(result["text"])


result = model.transcribe(
    str(AUDIO_DIR / "dutch_long_repeat_file.mp3"),
    verbose=True,
    language="nl",
    task="transcribe",
)
print(result["text"])</pre>



<p>Note that we can pass in the language if we already know it, so we can skip the detection step and save some time there. For applications where you always know the language ahead of time, just pass it in to optimize your application. We pass in <code>nl</code> as it is the ISO 639-1 code for Dutch.</p>



<p>Now let&#8217;s run this and check the output (yours will look different from mine):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Hoi j allemaal! Dit is weer een testbestandje! Deze keer om te testen of de Nederlandse taal goed herkent gaat worden. Je en bırak�� collecte geval. Je gievous raakt deze tekst te laten vertalen naar het Engels om te zien hoe goed dat gaat. Ik ben benieuwd! Hoi jlynn allemaal! Dit is weer een testbestandje. Deze keer om te testen of de Nederlandse taal goed herkent gaat worden. Je en driesbredmontie kunt wiring die text er metυτ�� mesma halen te laten vertalen naar het Engels om te zien hoe goed dat gaat! Ik ben benieuwd. Hoi allemaal! Dit is weer een testbestandje. Deze keer om te testen of de Nederlandse taal goed herkend gaat worden. Hierna kunnen we ook proberen deze tekst te laten vertalen naar het Engels om te zien hoe goed dat gaat. Ik ben benieuwd.</pre>



<p>Now I&#8217;m not going to make you read this, but as a Dutch person, I can tell you this output is terrible, and there are several characters and many words here that do not even exist in the Dutch language! So what happened? It&#8217;s the same model and the audio is exactly the same as before; it&#8217;s just a bit longer and repeats itself. We should have gotten the same output, right?</p>



<p>Well, it is because Whisper&#8217;s machine-learning model is limited to audio segments of only 30 seconds as its input. Because of this, it is more challenging for it to transcribe longer audio files. The <code>.transcribe</code> function took care of cutting the audio into 30-second segments for us and feeding them through and sort of stitching them back together, making our life a lot easier, so we didn&#8217;t really notice this extra challenge.</p>



<p>Whisper does use some clever tricks to improve the quality when transcribing longer audio files that need to be cut into 30-second pieces and put back together again. Even so, this is inherently a bit trickier, so we saw a significant drop in transcription quality even though the audio we were transcribing was exactly the same as before (just repeated 3 times in a row to make it longer).</p>
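


<p>To get a feel for what cutting a file into 30-second pieces means, here is a deliberately simplified sketch that splits a list of samples into fixed windows. This is an assumption-laden illustration, not how <code>.transcribe</code> is implemented: the real method is smarter about where each window starts and how the pieces are stitched together. The function name <code>split_into_windows</code> is made up, and the 16 kHz sample rate is assumed.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">SAMPLE_RATE = 16000                            # samples per second (assumed)
WINDOW_SECONDS = 30                            # the model's fixed input length
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS


def split_into_windows(samples, window=WINDOW_SAMPLES):
    """Naively cut the audio into back-to-back windows of at most 30 seconds."""
    return [samples[start:start + window] for start in range(0, len(samples), window)]


fake_audio = [0.0] * (SAMPLE_RATE * 42)        # roughly the length of our long file
windows = split_into_windows(fake_audio)
print([len(w) / SAMPLE_RATE for w in windows])  # [30.0, 12.0]</pre>



<p>Notice that the cut at the 30-second mark can easily land in the middle of a sentence or even a word, which is one reason longer files are harder to transcribe well.</p>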



<p>Does this mean Whisper is only good for small files? Not at all! All we need to solve this bigger challenge of a minor language (Dutch) combined with files longer than 30 seconds is to just step up to a bigger model!</p>



<p>When changing the model to <code>small</code> instead of <code>base</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">model = whisper.load_model("small")</pre>



<p>I got an almost perfect output with only a single very minor spelling mistake. When I changed to <code>medium</code> afterward it was absolutely perfect. It&#8217;s just a matter of using a bigger model until it works. Pick the model size that corresponds to the size of your challenge.</p>



<h2 class="wp-block-heading">Translating</h2>



<p>Besides just transcribing, as if that wasn&#8217;t awesome enough, Whisper can also translate pretty much all major languages to English. (If you get very hacky it can even translate English to other languages, but that is not an intended or supported feature).</p>



<p>So now let&#8217;s give it an audio file in a non-English language and then ask it for an English translation. We&#8217;ll feed it the <code>dutch_the_netherlands.mp3</code> file again, but this time ask it for a translation (to English) so you can finally find out what I said in the audio!</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">result = model.transcribe(
    str(AUDIO_DIR / "dutch_the_netherlands.mp3"),
    verbose=True,
    language="nl",
    task="translate",
)
print(result["text"])</pre>



<p>Make sure you comment out any calls above so you don&#8217;t run them by accident. I&#8217;ve already tested this out, and you&#8217;ll need a model of around the <code>medium</code> size to get a good translation, so make sure you load that BEFORE the call above (if your computer can handle it, otherwise just try a smaller one).</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">model = whisper.load_model("medium")</pre>



<p>The output is:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Hey everyone, this is a test file again. This time to test whether the Dutch language will be recognized well. After this, we can also try to translate this text into English to see how well that goes. I'm curious.</pre>



<p>It&#8217;s really quite a decent translation, straight from spoken text. That is very impressive. For sloppy pronunciation it still works quite well &#8211; I tested this using my Korean pronunciation which is not great and the results were still pretty good.</p>



<p>So other languages, longer files, or perhaps slightly less native pronunciation will all benefit a lot from going to larger versions of the model (as long as you have the VRAM for it). I&#8217;ll be sticking with the models at the lower end of the spectrum for this series as much as possible, as not everyone will have the GPU to run the larger models, but feel free to use a larger model if you have the VRAM for it.</p>
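


<p>If you are unsure which size to try first, a small helper like the sketch below can pick one based on how much VRAM you have. The figures in the table are rough assumptions based on commonly quoted requirements for each model size, so check them against your own setup. The helper name <code>pick_model_size</code> is made up for this example.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Approximate VRAM needs in GB, smallest to largest (assumed figures, verify for your setup)
APPROX_VRAM_GB = [("base", 1), ("small", 2), ("medium", 5), ("large", 10)]


def pick_model_size(available_vram_gb):
    """Return the name of the largest model that should fit in the given VRAM."""
    for name, needed_gb in reversed(APPROX_VRAM_GB):
        if available_vram_gb >= needed_gb:
            return name
    return "base"  # fall back to the smallest model in the table


print(pick_model_size(8))  # medium
print(pick_model_size(3))  # small</pre>



<p>You could then pass the returned name to <code>whisper.load_model()</code> instead of hardcoding the size.</p>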



<p>On the flip side, if you can only run the small or even the base models, do not despair! The next two tutorials will actually do very well for accuracy running on these smaller models, and again, in the last part, we&#8217;ll look at speeding up, optimizing, or outsourcing the processing altogether.</p>



<p>Now that we&#8217;ve got the more boring basics out of the way, it&#8217;s time to build some cool and fun stuff and look at practical applications and integration in the next couple of parts! See you there!</p>



<h2 class="wp-block-heading">Full Course: OpenAI Whisper &#8211; Building Cutting-Edge Python Apps with OpenAI Whisper</h2>



<p>Check out our full OpenAI Whisper course with video lessons, easy explanations, GitHub, and a downloadable PDF certificate to prove your speech processing skills to your employer and freelancing clients:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://academy.finxter.com/university/openai-whisper/"><img loading="lazy" decoding="async" width="908" height="257" src="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png" alt="" class="wp-image-1654506" srcset="https://blog.finxter.com/wp-content/uploads/2024/01/image-154.png 908w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-300x85.png 300w, https://blog.finxter.com/wp-content/uploads/2024/01/image-154-768x217.png 768w" sizes="auto, (max-width: 908px) 100vw, 908px" /></a></figure>
</div>


<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> [<strong>Academy</strong>] <a href="https://academy.finxter.com/university/openai-whisper/" data-type="link" data-id="https://academy.finxter.com/university/openai-whisper/">Voice-First Development: Building Cutting-Edge Python Apps Powered By OpenAI Whisper</a></p>



<p>The post <a href="https://blog.finxter.com/openai-whisper-python-installation-setup-first-steps-to-speech-to-text-synthesis/">OpenAI Whisper &#8211; Python Installation, Setup, &#038; First Steps to Speech-to-Text Synthesis</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI JSON Mode &#038; DALL·E 3 Seeds (Examples)</title>
		<link>https://blog.finxter.com/openai-json-mode-dall%c2%b7e-3-seeds-examples/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Tue, 28 Nov 2023 11:50:35 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[GPT-4]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1653301</guid>

					<description><![CDATA[<p>🔗 This article originally appeared on the Finxter Academy for premium members (including course lesson video). Check out the video course here. All right, welcome back to part 2, where we&#8217;re going to be looking at JSON mode and seeds. ✅ This will allow us to use only a part of function calls. Namely, when ... <a title="OpenAI JSON Mode &#038; DALL·E 3 Seeds (Examples)" class="read-more" href="https://blog.finxter.com/openai-json-mode-dall%c2%b7e-3-seeds-examples/" aria-label="Read more about OpenAI JSON Mode &#038; DALL·E 3 Seeds (Examples)">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/openai-json-mode-dall%c2%b7e-3-seeds-examples/">OpenAI JSON Mode &#038; DALL·E 3 Seeds (Examples)</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f517.png" alt="🔗" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <em>This <a href="https://academy.finxter.com/1-6-openai-api-mastery-innovating-with-gpt-4-turbo-and-dall%c2%b7e-3-parallel-function-calling/" data-type="link" data-id="https://academy.finxter.com/1-6-openai-api-mastery-innovating-with-gpt-4-turbo-and-dall%c2%b7e-3-parallel-function-calling/">article</a> originally appeared on the Finxter Academy for premium members (including course lesson video). <a href="https://academy.finxter.com/university/openai-update/" data-type="link" data-id="https://academy.finxter.com/university/openai-update/">Check out the video course here</a>.</em></p>



<p>All right, welcome back to <a href="https://academy.finxter.com/course/openai-json-mode-and-seeds/" data-type="link" data-id="https://academy.finxter.com/course/openai-json-mode-and-seeds/">part 2</a>, where we&#8217;re going to be looking at JSON mode and seeds. </p>



<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2705.png" alt="✅" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>JSON mode lets us use just one part of function calling on its own. Namely, when the model generates the arguments it wants to use to call a function, it returns them as valid JSON (<em>JavaScript Object Notation</em>), and JSON mode asks for that same valid JSON in a regular response. </strong></p>



<p>You saw in the <a href="https://academy.finxter.com/course/parallel-function-calling/" data-type="link" data-id="https://academy.finxter.com/course/parallel-function-calling/">previous part</a> that we parsed these arguments and then passed them into our functions.</p>



<p><strong>So what if we always want a JSON response from ChatGPT?</strong> We can now use the new JSON mode to do this. </p>



<p><strong>Why would this be useful?</strong> Well, JSON is really easy to parse into an object that we can manipulate with code or feed into some kind of software or API, just like we did in the <a href="https://blog.finxter.com/openai-parallel-function-calling-weathergpt-example/" data-type="post" data-id="1653283">previous part</a>. This can be really helpful for extracting data from text.</p>



<p>If we ask GPT to generate something in textual form, it&#8217;s pretty hard to use the output in our Python code, for example.</p>



<p>However, if we ask it to output the data as JSON in exactly the way we specify, it&#8217;s very easy to parse this into a dictionary and then save the data in a database or manipulate it in some other way.</p>
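


<p>As a quick illustration of why JSON output is so convenient, here is a minimal sketch using Python&#8217;s built-in <code>json</code> module. The <code>raw_response</code> string is hand-written to stand in for a model reply, and its <code>chapters</code>, <code>title</code>, and <code>author</code> keys are a hypothetical structure chosen just for this example.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import json

# Hand-written stand-in for a JSON reply (hypothetical structure, not real model output)
raw_response = """
{
    "chapters": [
        {"title": "The Henpecked Man", "author": "John Mackay Wilson"},
        {"title": "Duncan Campbell", "author": "James Hogg"}
    ]
}
"""

data = json.loads(raw_response)  # parse the JSON text into a Python dictionary

for chapter in data["chapters"]:
    print(chapter["title"], "by", chapter["author"])</pre>



<p>Once the text is a dictionary, saving it to a database or passing it to other code is ordinary Python.</p>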



<p>So, let&#8217;s get started with a simple example to see how this works. You&#8217;ll then be able to adapt this to your specific use case.</p>



<h2 class="wp-block-heading">Preparing Some Data</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088.webp" alt="" class="wp-image-1653292" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Let&#8217;s get something simple to extract data from. Remember that the data could also be generated or acquired in some other way; the point here is the output. </p>



<p>Make a file called <code>chapters.py</code> in a new folder named <code>2_JSON_mode_and_seeds</code> like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_OPENAI_UPDATES (root project folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_Parallel_function_calling
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />2_JSON_mode_and_seeds
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />chapters.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>Now go and visit <a href="https://gutenberg.org/cache/epub/72064/pg72064.txt">https://gutenberg.org/cache/epub/72064/pg72064.txt</a> in your browser. </p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="910" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/image-131-910x1024.png" alt="" class="wp-image-1653302" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/image-131-910x1024.png 910w, https://blog.finxter.com/wp-content/uploads/2023/11/image-131-267x300.png 267w, https://blog.finxter.com/wp-content/uploads/2023/11/image-131-768x864.png 768w, https://blog.finxter.com/wp-content/uploads/2023/11/image-131.png 1061w" sizes="auto, (max-width: 910px) 100vw, 910px" /></figure>
</div>


<p>This will take you to the text version of the book <strong><em>&#8220;The book of Scottish story: historical, humorous, legendary, and imaginative&#8221;</em></strong>, which is in the public domain (copyright expired), so we can use it for our example.</p>



<p>Copy the entire list of contents (it&#8217;s pretty long) all the way from <code>'The Henpecked Man'</code> to <code>'Catching a Tartar'</code> and paste it into the <code>chapters.py</code> file. </p>



<p>It should look like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">table_of_contents = """
CONTENTS.

The Henpecked Man,                              _John Mackay Wilson_

Duncan Campbell,                                _James Hogg_

...loads more entries in between...

The Fight for the Standard,                     _James Paterson_

Catching a Tartar,                              _D. M. Moir_
"""</pre>



<p>Notice it&#8217;s a simple variable named <code>table_of_contents</code> which is a very long multiline string so we can easily import this later. </p>



<p>The formatting of the table of contents is wonky with underscores, and some have &#8220;quotes&#8221; around them while others don&#8217;t, so this will make an excellent simple example.</p>



<h2 class="wp-block-heading">JSON Mode</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097.webp" alt="" class="wp-image-1653293" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Go ahead and save this <code>chapters.py</code> file. </p>



<p>Now, create a new file in the <code>2_JSON_mode_and_seeds</code> folder called <code>json_mode.py</code>.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_OPENAI_UPDATES (root project folder)
    📁1_Parallel_function_calling
    📁2_JSON_mode_and_seeds
        📄chapters.py
        📄json_mode.py
    📄.env</pre>



<p>Inside, let&#8217;s get started with our imports:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from decouple import config
from openai import OpenAI
from chapters import table_of_contents
import json
import pprint

client = OpenAI(api_key=config("OPENAI_API_KEY"))</pre>



<p>We have all our basic imports here, </p>



<ul class="wp-block-list">
<li><code>config</code>, </li>



<li>OpenAI, </li>



<li>the <code>table_of_contents</code> variable we just defined, </li>



<li><code>json</code>, and </li>



<li><code>pprint</code>. </li>
</ul>



<p>We&#8217;ll use <code><a href="https://blog.finxter.com/a-simple-guide-to-the-pprint-module-in-python/" data-type="post" data-id="508127">pprint</a></code> (pretty print) to display the output in a readable way. It prints objects like <a href="https://blog.finxter.com/how-to-print-a-dictionary-without-brackets-in-python/" data-type="post" data-id="34658">dictionaries</a> to the console in a much more readable manner, as you&#8217;ll see later. </p>



<p>We then initialize our <code>client</code> as before.</p>



<p>Now, let&#8217;s start our <code>json_gpt</code> function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def json_gpt(query, model="gpt-3.5-turbo-1106", system_message=None):
    if not system_message:
        system_message = "You are a JSON generator which outputs JSON objects according to user request"</pre>



<p>We&#8217;re going to use the new version of GPT-3.5 Turbo for this one. Don&#8217;t worry, we&#8217;ll get to GPT-4 Turbo very soon! </p>



<p>For now, GPT-4 simply isn&#8217;t needed to get good results, and since 3.5 Turbo is much cheaper, it&#8217;s the better choice whenever it is good enough. </p>



<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <em>More on pricing details later. </em></p>



<p>Again, make sure you use the 1106 version and not an older one, because only the newest GPT-3.5 Turbo and GPT-4 Turbo versions support JSON mode.</p>



<p>We define our function with defaults for the model and system message but allow the caller to override either. Still inside the function, define the <code>messages</code> list:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    messages = [
        {"role": "system", "content": system_message},
        {
            "role": "user",
            "content": f"Please return Json for the following as instructed above:\n{query}",
        },
    ]</pre>



<p>Note that the user query is preceded by an explicit request for JSON output, even in the user message. </p>



<p>Even though we will enable JSON mode, we still have to mention the word JSON somewhere in our messages. </p>



<p>If we don&#8217;t, the model may produce strange generations, which is why the API returns an error as a failsafe when the word is missing from the context.</p>
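<p>To catch a missing keyword before the request is even sent, a small local check can help. The sketch below is only an illustration: the helper name <code>mentions_json</code> is hypothetical, and the case-insensitive match is an assumption about how lenient the check should be, not a description of what the API does internally.</p>

```python
def mentions_json(messages):
    """Return True if any message's content contains 'json' (case-insensitive)."""
    return any("json" in str(m.get("content", "")).lower() for m in messages)


# Example messages mirroring the ones built inside json_gpt
messages = [
    {
        "role": "system",
        "content": "You are a JSON generator which outputs JSON objects according to user request",
    },
    {
        "role": "user",
        "content": "Please return Json for the following as instructed above:\nsome query",
    },
]

# Fail early, locally, instead of waiting for the API to reject the request
if not mentions_json(messages):
    raise ValueError("JSON mode needs the word 'JSON' somewhere in the messages.")
```

<p>Since our default system message and user prefix both mention JSON, this check passes here. It becomes more useful once callers start passing in their own <code>system_message</code>.</p>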



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    response = client.chat.completions.create(
        model=model,
        messages=messages,
        response_format={"type": "json_object"},
    )</pre>



<p>Now we make a pretty normal request to ChatGPT using the new client syntax.</p>



<p>Note that we cannot simply set the <code>response_format</code> parameter to the string <code>json_object</code>; we have to pass in a dictionary with the key-value pair <code>"type": "json_object"</code>.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    content: str = response.choices[0].message.content
    content: dict = json.loads(content)
    print(f"\033[94m {type(content)} \033[0m")
    pprint.pprint(content)
    return content</pre>



<p>The content is initially in string format even though it represents JSON. </p>



<p>We then <a href="https://blog.finxter.com/how-to-convert-a-string-to-a-dictionary-in-python/" data-type="link" data-id="https://blog.finxter.com/how-to-convert-a-string-to-a-dictionary-in-python/">convert it to a dictionary</a> so we can work with the data like any other dictionary. </p>



<p>Note that any format is possible, with whatever key names and values you want, as we&#8217;ll demonstrate later. </p>



<p>We then print the type of content to show that ChatGPT&#8217;s output is, in fact, a valid dictionary object (after conversion from JSON) and pretty print it to the console. </p>



<p>Finally, we return the content so we can use it in our code.</p>
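<p>The conversion step can be tried without calling the API at all. The sketch below uses made-up sample strings in place of <code>response.choices[0].message.content</code>, and the helper <code>parse_json_content</code> is a hypothetical addition (not part of the tutorial&#8217;s function) showing one way to handle a string that fails to parse, for example if a reply were ever cut short.</p>

```python
import json


def parse_json_content(raw):
    """Parse a JSON string into a Python object, or return None if it is invalid."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None


# Sample strings standing in for response.choices[0].message.content
complete = '{"name": "Bob", "height_cm": 170}'
truncated = '{"name": "Bob", "height_'

parsed = parse_json_content(complete)
print(type(parsed), parsed)
print(parse_json_content(truncated))
```

<p>Returning <code>None</code> is just one possible design choice. Letting the exception propagate, as the tutorial&#8217;s <code>json_gpt</code> does, is equally reasonable for a small script.</p>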



<p>The whole function is as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def json_gpt(query, model="gpt-3.5-turbo-1106", system_message=None):
    if not system_message:
        system_message = "You are a JSON generator which outputs JSON objects according to user request"

    messages = [
        {"role": "system", "content": system_message},
        {
            "role": "user",
            "content": f"Please return Json for the following as instructed above:\n{query}",
        },
    ]

    response = client.chat.completions.create(
        model=model,
        messages=messages,
        response_format={"type": "json_object"},
    )

    content: str = response.choices[0].message.content
    content: dict = json.loads(content)
    print(f"\033[94m {type(content)} \033[0m")
    pprint.pprint(content)
    return content</pre>



<h2 class="wp-block-heading">A Simple Test</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99.webp" alt="" class="wp-image-1653294" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Let&#8217;s start with a very simple test by adding the following function call:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">json_gpt(
    "Give me a Json object with the height in cm and age in years of all people in the following text: John is 6 feet tall and 500 months old. Mary is 5 feet tall and 30 years old. Bob is 170cm in length and was born 25 years ago."
)</pre>



<p>And we can see it does absolutely fine and converts all the ages and heights to the same units just like we requested, even using 3.5-Turbo.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{'people': [{'age_years': 41.67, 'height_cm': 182.88, 'name': 'John'},
            {'age_years': 30, 'height_cm': 152.4, 'name': 'Mary'},
            {'age_years': 25, 'height_cm': 170, 'name': 'Bob'}]}</pre>



<p>This is a valid dictionary that we can manipulate in our code or store in a database straight away, without any additional parsing, though we could round the values if we wanted to. </p>



<p>So this can be used for data extraction, even if the values are given in different units or formats and interwoven in a piece of text. Also notice that the <code>pprint</code> function made the output nice and easy to read by lining up the values in the dictionary.</p>
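<p>As for rounding the values, here is one possible sketch. The helper <code>round_numbers</code> is hypothetical, and the <code>people</code> dictionary is simply the output shown above copied in by hand, so no API call is needed to try it.</p>

```python
def round_numbers(value, ndigits=1):
    """Recursively round every float found in nested dicts and lists."""
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {key: round_numbers(item, ndigits) for key, item in value.items()}
    if isinstance(value, list):
        return [round_numbers(item, ndigits) for item in value]
    return value  # ints, strings, etc. are left untouched


# The dictionary returned in the test above
people = {
    "people": [
        {"age_years": 41.67, "height_cm": 182.88, "name": "John"},
        {"age_years": 30, "height_cm": 152.4, "name": "Mary"},
        {"age_years": 25, "height_cm": 170, "name": "Bob"},
    ]
}

rounded = round_numbers(people)
print(rounded)
```

<p>Because the function returns a new structure, the original dictionary stays unchanged if you still need the unrounded values.</p>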



<h2 class="wp-block-heading">A More Complex Test</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f.webp" alt="" class="wp-image-1653295" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Make sure you comment out the function call above, and then let&#8217;s use our table of contents and give the model a very specific output format, so we can basically use GPT as a data parser without having to write a real output parser.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">json_gpt(
    query=table_of_contents,
    system_message="""
    You are a JSON generator which outputs JSON objects according to user request.
    Please extract the author and title for all lines going all the way from start to end in the following text and return it as a JSON object following the example provided below.

    Example input:
    The Lily of Liddisdale,                         _Professor Wilson_

    The Unlucky Present,                            _Robert Chambers_

    The Sutor of Selkirk                            “_The Odd Volume_,”

    Example output:
    {'contents': [
        {'author': 'Professor Wilson', 'title': 'The Lily of Liddisdale'},
        {'author': 'Robert Chambers', 'title': 'The Unlucky Present'},
        {'author': 'The Odd Volume', 'title': 'The Sutor of Selkirk'},
    ]}
    """,
)</pre>



<p>Note that the only guarantee we get with JSON mode is JSON output, not the specific format! </p>



<p>We still have the responsibility to be very specific to get the output we desire. Providing specific examples like the above is your best friend, as GPT tends to perform much better this way.</p>



<p>Now go ahead and run the file and you should get the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gpt3_5 = {
    "contents": [
        {"author": "John Mackay Wilson", "title": "The Henpecked Man"},
        {"author": "James Hogg", "title": "Duncan Campbell"},
        ... many many more entries in between ...
        {"author": "James Paterson", "title": "The Fight for the Standard"},
        {"author": "D. M. Moir", "title": "Catching a Tartar"},
    ]
}</pre>



<p>Notice that it followed our example perfectly. It also got rid of the pesky extra quotes and underscores that appeared on the entries. And this is just 3.5 Turbo; we haven&#8217;t even tried <a href="https://blog.finxter.com/gpt-4-turbo/" data-type="post" data-id="1652747">GPT-4 Turbo</a> yet! </p>



<p>If you do have something harder to parse, try GPT-4 Turbo, and it will do a better job. But in this case, <a href="https://blog.finxter.com/fine-tuning-gpt-3-5-turbo-how-to-craft-your-own-proprietary-llm/" data-type="post" data-id="1650906">3.5 Turbo</a> was more than enough to get the job done.</p>



<p>So yeah, that&#8217;s JSON mode, pretty darn cool and useful.</p>



<p>Have ChatGPT extract structured data for you from any text, and return it in an object format that doesn&#8217;t require any complex parsing, or even use ChatGPT as a parser without having to write a real parser to account for all the edge cases. </p>



<p>It&#8217;s pretty clever at handling even unforeseen edge cases as long as you provide a solid example of the end output you want.</p>
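<p>Since JSON mode only guarantees valid JSON and not a particular shape, a light structural check on the parsed result can be worthwhile before using it. The sketch below is an assumption about what such a check might look like. The helper <code>validate_contents</code> is hypothetical and only covers the <code>contents</code>/<code>author</code>/<code>title</code> layout from our example.</p>

```python
def validate_contents(data):
    """Return True if data looks like {'contents': [{'author': str, 'title': str}, ...]}."""
    entries = data.get("contents") if isinstance(data, dict) else None
    if not isinstance(entries, list):
        return False
    return all(
        isinstance(entry, dict)
        and isinstance(entry.get("author"), str)
        and isinstance(entry.get("title"), str)
        for entry in entries
    )


# A well-formed result, using two entries from the output above
good = {
    "contents": [
        {"author": "John Mackay Wilson", "title": "The Henpecked Man"},
        {"author": "James Hogg", "title": "Duncan Campbell"},
    ]
}

# A made-up malformed result: valid JSON, but not the shape we asked for
bad = {"chapters": ["The Henpecked Man", "Duncan Campbell"]}

print(validate_contents(good))
print(validate_contents(bad))
```

<p>For larger projects a schema library would do this more thoroughly, but a few <code>isinstance</code> checks are often enough for a quick script.</p>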



<h2 class="wp-block-heading">The Seed Parameter</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/6d48f25f-e992-4394-834f-e07071298a88.webp" alt="" class="wp-image-1653281" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/6d48f25f-e992-4394-834f-e07071298a88.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/6d48f25f-e992-4394-834f-e07071298a88-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/6d48f25f-e992-4394-834f-e07071298a88-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/6d48f25f-e992-4394-834f-e07071298a88-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Go ahead and save and close this file and now let&#8217;s look at the <a href="https://blog.finxter.com/dall%c2%b7e-3-trick-using-seeds-to-recreate-the-same-image/" data-type="post" data-id="1652562">Seed parameter</a>. Create a new file called <code>seed_param.py</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_OPENAI_UPDATES (root project folder)
    📁1_Parallel_function_calling
    📁2_JSON_mode_and_seeds
        📄chapters.py
        📄json_mode.py
        📄seed_param.py
    📄.env</pre>



<p>Now the idea behind seed parameters is of course that they can make some type of random generator predictable, provided you pass in the same seed, like generating the same Minecraft world by copying the seed from a friend. </p>
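<p>The same idea can be seen in miniature with Python&#8217;s own <code>random</code> module. This sketch has nothing to do with the OpenAI API; the <code>roll_dice</code> helper is hypothetical and only illustrates what a seed does for an ordinary pseudo-random generator.</p>

```python
import random


def roll_dice(seed, count=5):
    """Return `count` dice rolls from a generator initialised with `seed`."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]


first = roll_dice(2424)
second = roll_dice(2424)

# Same seed, same sequence of "random" rolls
print(first)
print(second)
print(first == second)
```

<p>With a classic generator like this, the same seed reproduces the sequence exactly, which is a stronger promise than the one we get from a language model.</p>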



<p>While ChatGPT can now use a <code>seed</code> parameter, the very nondeterministic nature of ChatGPT means that it&#8217;s not quite a 100% guarantee, but the answers are definitely more similar and predictable than without a seed, so let&#8217;s check it out.</p>



<p>Inside the <code>seed_param.py</code> file go ahead and start with our imports and basic setup:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from decouple import config
from openai import OpenAI

client = OpenAI(api_key=config("OPENAI_API_KEY"))</pre>



<p>This should be fairly familiar by now. </p>



<p>Now let&#8217;s code up a very simple printing utility to help us clean our code by cutting out the repetitive stuff:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def consistency_printer(response):
    response_content = response.choices[0].message.content
    system_fingerprint = response.system_fingerprint
    print(f"\033[94m {response_content} \033[0m")
    print(f"\033[92m {system_fingerprint} \033[0m")</pre>



<p>This function receives the response we get from ChatGPT, extracts the message&#8217;s content and the system fingerprint, and prints them to the console in blue and green, respectively. </p>



<p><strong><em>So what is the system fingerprint?</em></strong></p>



<p>The system fingerprint, as the name implies, identifies the exact backend configuration that the model works with. This system fingerprint will change if you change the request parameters or if OpenAI updates the models in some way behind the scenes, which is likely to happen a couple of times per year. </p>



<p>If these fingerprints are the same, therefore, it means that both your configuration and the remote configuration are the same between both requests.</p>



<p>When we make several requests in a row in a moment, you&#8217;ll notice this fingerprint is basically always the same. But if your application runs for months, the backend configuration on OpenAI&#8217;s end is likely to change at some point, which will affect determinism and therefore the output.</p>



<p>Simply said, as long as the fingerprint and the seed remain the same between calls, the output should be similar or even the same.</p>
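<p>Comparing fingerprints across calls is easy to automate. The sketch below uses stand-in objects instead of real API responses, so it runs offline. The helper name <code>same_backend</code> is hypothetical, and the only thing it assumes about a response is the <code>system_fingerprint</code> attribute that <code>consistency_printer</code> already reads.</p>

```python
from types import SimpleNamespace


def same_backend(responses):
    """Return True if all responses report the same system_fingerprint."""
    fingerprints = {response.system_fingerprint for response in responses}
    return len(fingerprints) == 1


# Stand-in objects; real ones would come from client.chat.completions.create
unchanged = [
    SimpleNamespace(system_fingerprint="fp_eeff13170a"),
    SimpleNamespace(system_fingerprint="fp_eeff13170a"),
]
changed = [
    SimpleNamespace(system_fingerprint="fp_eeff13170a"),
    SimpleNamespace(system_fingerprint="fp_a24b4d720c"),
]

print(same_backend(unchanged))
print(same_backend(changed))
```

<p>If this check fails between two runs with the same seed, a difference in the outputs is less surprising.</p>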



<h2 class="wp-block-heading">Bedtime Stories</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975.webp" alt="" class="wp-image-1653296" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>So let&#8217;s code up a very simple function that outputs something very nondeterministic, like bedtime stories!</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def bedtime_stories(query, seed=None, model="gpt-3.5-turbo-1106"):
    messages = [
        {
            "role": "system",
            "content": "You make up fun children's stories according to the user request. The stories are only 100 characters long.",
        },
        {"role": "user", "content": query},
    ]

    response = client.chat.completions.create(
        model=model,
        messages=messages,
        seed=seed,
        temperature=0.7,
        stop=["\n"],
    )
    consistency_printer(response)</pre>



<p>We set up a very simple system message and then pass in the user query in the second message entry. </p>



<p>We call the GPT-3.5 Turbo model, again making sure to use the new 1106 version as older models don&#8217;t support the <code>seed</code> parameter, and we pass in the messages and the seed. </p>



<p>We also set the <code>temperature</code> to 0.7 and the <code>stop</code> parameter to a newline character so we don&#8217;t get a huge wall of text. </p>



<p>The <code>stop</code> parameter simply means that the model will stop generating text when it encounters a newline character, limiting the length of the output we need to compare.</p>
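<p>To get a feel for the effect, here is a purely local illustration. The real cut-off happens on OpenAI&#8217;s side during generation, so the hypothetical <code>cut_at_stop</code> helper below only mimics the visible result on a made-up piece of text.</p>

```python
def cut_at_stop(text, stop_sequences):
    """Return text up to (not including) the earliest stop sequence, if any occurs."""
    positions = [text.find(stop) for stop in stop_sequences if stop in text]
    return text[: min(positions)] if positions else text


# Made-up example text with a paragraph break in the middle
full_text = "Once upon a time, a unicorn flew to the moon.\nThe next day, she..."

print(cut_at_stop(full_text, ["\n"]))
```

<p>Everything after the first newline is dropped, which is why our function only ever prints a single paragraph.</p>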



<h2 class="wp-block-heading">Testing the Seed Parameter with Bedtime Stories</h2>



<p>Now let&#8217;s add a loop that makes 3 calls without a <code>seed</code> parameter:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for i in range(3):
    bedtime_stories(
        "Tell me a story about a unicorn in space.",
    )</pre>



<p>Go ahead and run it. </p>



<p>Note how the unicorn has a different name in every single story, and the stories are quite different:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Once upon a time, a unicorn named Luna soared through the galaxy, spreading stardust and kindness wherever she went.
fp_eeff13170a

Once upon a time, a brave unicorn named Stardust soared through the galaxy, spreading magic and joy to all the stars.
fp_eeff13170a

Once upon a time, a unicorn named Nova flew through space, sprinkling stardust and bringing light to dark corners.
fp_eeff13170a</pre>



<p>Now change the loop like this, and run it again:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for i in range(3):
    bedtime_stories(
        "Tell me a story about a unicorn in space.",
        seed=2424,
    )</pre>



<p>Note that the seed can be any integer; we chose 2424 at random. If we run this we get:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Once upon a time, a magical unicorn flew through space, sprinkling stardust on planets and making new friends.
fp_eeff13170a

Once upon a time, a magical unicorn flew through space, sprinkling stardust on planets and making new friends.
fp_eeff13170a

Once upon a time, a magical unicorn soared through space, sprinkling stardust on planets and granting wishes to lonely stars.
fp_eeff13170a</pre>



<p>We can see they are not quite the same. The first and second ones are identical, while the third is similar but not the same. If you run this several times you&#8217;ll sometimes get 3 of the same outputs, and sometimes they&#8217;ll all be different. </p>



<p>This is because the <code>seed</code> parameter is not a 100% guarantee, but it does make the output more consistent and similar.</p>



<p>You might think that the <code>temperature</code> setting of 0.7 is the culprit, but this is not the problem. Setting it to 0 does not make much difference in this case.</p>
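<p>If you want a number instead of eyeballing the stories, Python&#8217;s standard <code>difflib</code> module can score how alike two strings are. The sketch below applies it to the three seeded outputs shown above, pasted in by hand. The helper name <code>pairwise_similarity</code> is hypothetical.</p>

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_similarity(outputs):
    """Return the similarity ratio (0.0 to 1.0) for every pair of outputs."""
    return [
        SequenceMatcher(None, first, second).ratio()
        for first, second in combinations(outputs, 2)
    ]


# The three outputs from the seeded run above
seeded_outputs = [
    "Once upon a time, a magical unicorn flew through space, sprinkling stardust on planets and making new friends.",
    "Once upon a time, a magical unicorn flew through space, sprinkling stardust on planets and making new friends.",
    "Once upon a time, a magical unicorn soared through space, sprinkling stardust on planets and granting wishes to lonely stars.",
]

# Pairs are compared in the order (1, 2), (1, 3), (2, 3)
scores = pairwise_similarity(seeded_outputs)
print(scores)
```

<p>A ratio of 1.0 means two outputs are identical; lower values indicate how far a generation has drifted from the others.</p>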



<p>If we swap out our function&#8217;s default 3.5 Turbo model for GPT-4 Turbo (more on GPT-4 Turbo in the next part):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for i in range(3):
    bedtime_stories(
        "Tell me a story about a unicorn in space.",
        seed=2424,
        model="gpt-4-1106-preview",
    )</pre>



<p>We see a similar story:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Star Unicorn zooms, finds a comet friend. Together, they race across the Milky Way! 🦄✨🚀🌠
fp_a24b4d720c

Star Unicorn zooms, finds a comet friend. Cosmic races begin!
fp_a24b4d720c

Star Unicorn zooms, finds a comet friend. Together, they race across the Milky Way! 🦄✨🚀🌠
fp_a24b4d720c</pre>



<p>Very similar, and the unicorn has the same name, but the last part is different in the middle generation. Just know that the seed parameter provides no guarantees.</p>
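<p>When running more than a handful of calls, it can be handy to tally how many distinct generations came back. This is a minimal sketch using <code>collections.Counter</code> on the three GPT-4 Turbo outputs above, pasted in by hand. The helper name <code>summarize_runs</code> is hypothetical.</p>

```python
from collections import Counter


def summarize_runs(outputs):
    """Count how often each distinct output occurred, most common first."""
    return Counter(outputs).most_common()


# The three GPT-4 Turbo outputs from the seeded run above
gpt4_outputs = [
    "Star Unicorn zooms, finds a comet friend. Together, they race across the Milky Way! 🦄✨🚀🌠",
    "Star Unicorn zooms, finds a comet friend. Cosmic races begin!",
    "Star Unicorn zooms, finds a comet friend. Together, they race across the Milky Way! 🦄✨🚀🌠",
]

summary = summarize_runs(gpt4_outputs)
for text, count in summary:
    print(count, text)
```

<p>A single entry in the summary would mean every run produced the same text; here we get two entries, matching what we saw by eye.</p>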


<div class="wp-block-image">
<figure class="aligncenter size-large"><a href="https://blog.finxter.com/dall%c2%b7e-3-trick-using-seeds-to-recreate-the-same-image/" target="_blank" rel="noreferrer noopener"><img loading="lazy" decoding="async" width="1024" height="571" src="https://blog.finxter.com/wp-content/uploads/2023/11/image-210-1-1024x571.png" alt="" class="wp-image-1653303" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/image-210-1-1024x571.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/image-210-1-300x167.png 300w, https://blog.finxter.com/wp-content/uploads/2023/11/image-210-1-768x428.png 768w, https://blog.finxter.com/wp-content/uploads/2023/11/image-210-1.png 1290w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></a></figure>
</div>


<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f517.png" alt="🔗" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended</strong>: <a href="https://blog.finxter.com/dall%c2%b7e-3-trick-using-seeds-to-recreate-the-same-image/" data-type="link" data-id="https://blog.finxter.com/dall%c2%b7e-3-trick-using-seeds-to-recreate-the-same-image/">DALL·E 3 Trick: Using Seeds to Recreate the Same Image</a></p>



<h2 class="wp-block-heading">Fruitclopedia, More Deterministic Questions</h2>



<p>So let&#8217;s try with something a little more stable, like fruits. </p>



<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1fad0.png" alt="🫐" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Fruits</strong>: Whereas children&#8217;s stories can be about literally anything, so there is no single definition of what ChatGPT should output, fruits are quite predictable. Asking about a pineapple is a very concrete question and not open to artistic interpretation as to what the answer should be.</p>



<p>We have a very basic function, just copy this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def fruit_gpt(query, seed=None, temperature=0.2):
    messages = [
        {
            "role": "system",
            "content": "You are the fruitclopedia. Users name a fruit and you give information.",
        },
        {"role": "user", "content": query},
    ]

    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=messages,
        seed=seed,
        temperature=temperature,
        stop=["\n"],
    )
    consistency_printer(response)</pre>



<p>It is basically the same as <code>bedtime_stories</code>, but the temperature is now a parameter that defaults to 0.2. </p>



<p>We still use the <code>stop</code> parameter to limit the output length to one paragraph, so when the model inserts a newline to go to the next paragraph, it will stop generating text as it hits our stop condition.</p>



<h2 class="wp-block-heading">Testing the Seed Parameter with Fruitclopedia</h2>



<p>Running this without a seed:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for i in range(3):
    fruit_gpt(
        "Grapefruit.",
        temperature=0,
    )</pre>



<p>Interestingly, we can see that the answers start out the same but then diverge:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Grapefruit is a subtropical citrus fruit known for its sour to semi-sweet taste. It is a hybrid of the sweet orange and the pomelo. Grapefruits are rich in vitamins C and A, and they also contain fiber and antioxidants. They are often enjoyed fresh, juiced, or added to salads and desserts. There are different varieties of grapefruit, including white, pink, and red, each with its own unique flavor profile.
fp_eeff13170a

Grapefruit is a subtropical citrus fruit known for its sour to semi-sweet taste. It is a hybrid of the sweet orange and the pomelo, and it is typically larger than an orange with a thicker rind. Grapefruits are rich in vitamins C and A, as well as antioxidants. They are often enjoyed fresh, juiced, or added to salads and desserts. There are different varieties of grapefruit, including white, pink, and red, each with its own unique flavor profile.
fp_eeff13170a

Grapefruit is a subtropical citrus fruit known for its sour to semi-sweet taste. It is a hybrid of the pomelo and the sweet orange. Grapefruits are rich in vitamins C and A, as well as dietary fiber. They are often enjoyed fresh, juiced, or added to salads and desserts. There are different varieties of grapefruit, including white, pink, and red, each with its own unique flavor profile.
fp_eeff13170a</pre>



<p>The answers are this similar not so much because of the low temperature, but because our question is much more specific.</p>



<p>A prompt like &#8220;Tell me a children&#8217;s story about a unicorn&#8221; could have a million answers, all of which are correct. The number of correct answers for basic info about a fruit such as a pineapple or a grapefruit is limited.</p>



<p>So let&#8217;s try this with a seed, which is where the <code>seed</code> parameter really shines:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for i in range(3):
    fruit_gpt(
        "Grapefruit.",
        seed=123,
        temperature=0,
    )</pre>



<p>As you can see below, the answers are now 100% identical!</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Grapefruit is a subtropical citrus fruit known for its slightly bitter and sour taste. It is a hybrid of the pomelo and the sweet orange. Grapefruits are rich in vitamins C and A, as well as dietary fiber. They are often enjoyed fresh, juiced, or added to fruit salads. There are different varieties of grapefruit, including white, pink, and red, each with its own unique flavor profile.
fp_eeff13170a

Grapefruit is a subtropical citrus fruit known for its slightly bitter and sour taste. It is a hybrid of the pomelo and the sweet orange. Grapefruits are rich in vitamins C and A, as well as dietary fiber. They are often enjoyed fresh, juiced, or added to fruit salads. There are different varieties of grapefruit, including white, pink, and red, each with its own unique flavor profile.
fp_eeff13170a

Grapefruit is a subtropical citrus fruit known for its slightly bitter and sour taste. It is a hybrid of the pomelo and the sweet orange. Grapefruits are rich in vitamins C and A, as well as dietary fiber. They are often enjoyed fresh, juiced, or added to fruit salads. There are different varieties of grapefruit, including white, pink, and red, each with its own unique flavor profile.
fp_eeff13170a</pre>



<p>However, remember that this is not 100% guaranteed! You may still see some variation if you run this multiple times. If you use this to write tests for your application, make sure to check the fingerprint as well (the <code>fp_eeff13170a</code> value printed under each answer), because if OpenAI updates the system configuration on their end, the output may change. It also helps to make multiple calls and pass the test if one of them matches.</p>
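<p>As a rough sketch of that testing advice, the helper below retries a call a few times and passes as soon as one response matches both the expected text and the expected fingerprint. The names <code>matches_expected</code> and <code>fake_call</code> are hypothetical and not part of the course code; <code>fake_call</code> merely stands in for a real seeded API call so the example runs offline.</p>

```python
from typing import Callable, Tuple


def matches_expected(
    call: Callable[[], Tuple[str, str]],
    expected_text: str,
    expected_fingerprint: str,
    attempts: int = 3,
) -> bool:
    """Return True if any attempt reproduces the expected text and fingerprint."""
    for _ in range(attempts):
        text, fingerprint = call()
        if fingerprint != expected_fingerprint:
            # The backend configuration changed, so outputs are not comparable.
            continue
        if text == expected_text:
            return True
    return False


# Hypothetical stand-in for a seeded API call: the first reply differs and
# the later replies are stable, mimicking "mostly deterministic" behavior.
_replies = iter(
    [
        ("Grapefruit is a citrus fruit!", "fp_eeff13170a"),
        ("Grapefruit is a citrus fruit.", "fp_eeff13170a"),
        ("Grapefruit is a citrus fruit.", "fp_eeff13170a"),
    ]
)


def fake_call() -> Tuple[str, str]:
    return next(_replies)


result = matches_expected(fake_call, "Grapefruit is a citrus fruit.", "fp_eeff13170a")
print(result)
```

<p>In a real test you would replace <code>fake_call</code> with a function that calls the API with a fixed seed and returns the message text together with the fingerprint.</p>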



<p>So yeah, that&#8217;s the <code>seed</code> parameter. </p>



<p>It is pretty reliable, though not guaranteed, as long as you ask somewhat focused questions. If you ask something very open-ended, the answers will still be more similar to each other than without a seed, but the seed will be less effective.</p>



<p>That&#8217;s it for <a href="https://academy.finxter.com/course/openai-json-mode-and-seeds/" data-type="link" data-id="https://academy.finxter.com/course/openai-json-mode-and-seeds/">part 2</a>. In the <a href="https://academy.finxter.com/course/openai-gpt-4-turbo/" data-type="link" data-id="https://academy.finxter.com/course/openai-gpt-4-turbo/">next part</a>, we&#8217;ll look at GPT-4 Turbo and its really exciting new abilities like vision! See you there!</p>



<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <a href="https://academy.finxter.com/university/openai-update/" data-type="link" data-id="https://academy.finxter.com/university/openai-update/" target="_blank" rel="noreferrer noopener">Take Me Back to the Full Course</a></p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://academy.finxter.com/university/openai-update/"><img loading="lazy" decoding="async" width="805" height="530" src="https://blog.finxter.com/wp-content/uploads/2023/11/image-130.png" alt="" class="wp-image-1653284" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/image-130.png 805w, https://blog.finxter.com/wp-content/uploads/2023/11/image-130-300x198.png 300w, https://blog.finxter.com/wp-content/uploads/2023/11/image-130-768x506.png 768w" sizes="auto, (max-width: 805px) 100vw, 805px" /></a></figure>
</div>


<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Full Course</strong>: <a href="https://academy.finxter.com/university/openai-update/" data-type="link" data-id="https://academy.finxter.com/university/openai-update/">OpenAI API Mastery: Innovating with GPT-4 Turbo, Text-to-Speech (TTS), and DALL·E 3</a></p>
<p>The post <a href="https://blog.finxter.com/openai-json-mode-dall%c2%b7e-3-seeds-examples/">OpenAI JSON Mode &#038; DALL·E 3 Seeds (Examples)</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI Parallel Function Calling (WeatherGPT Example)</title>
		<link>https://blog.finxter.com/openai-parallel-function-calling-weathergpt-example/</link>
		
		<dc:creator><![CDATA[Dirk van Meerveld]]></dc:creator>
		<pubDate>Mon, 27 Nov 2023 21:48:19 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Large Language Model (LLM)]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=1653283</guid>

					<description><![CDATA[<p>Welcome to part 1 of the course! My name is Dirk van Meerveld and I will be your host and guide for this series in which we will be exploring all the new features of the OpenAI APIs and what we can do with them. To get started, we&#8217;re going to be looking at the ... <a title="OpenAI Parallel Function Calling (WeatherGPT Example)" class="read-more" href="https://blog.finxter.com/openai-parallel-function-calling-weathergpt-example/" aria-label="Read more about OpenAI Parallel Function Calling (WeatherGPT Example)">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/openai-parallel-function-calling-weathergpt-example/">OpenAI Parallel Function Calling (WeatherGPT Example)</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Welcome to part 1 of the <a href="https://academy.finxter.com/university/openai-update/" data-type="link" data-id="https://academy.finxter.com/university/openai-update/">course</a>! My name is Dirk van Meerveld and I will be your host and guide for this series in which we will be exploring all the new features of the OpenAI APIs and what we can do with them.</p>



<p class="has-global-color-8-background-color has-background">To get started, we&#8217;re going to be looking at the <strong>OpenAI function calling updates</strong>, especially the new ability to call multiple functions in parallel. We&#8217;ll also discuss some of the important syntax changes to go along with this and other new functionality in the API.</p>



<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f517.png" alt="🔗" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <em>This <a href="https://academy.finxter.com/1-6-openai-api-mastery-innovating-with-gpt-4-turbo-and-dall%c2%b7e-3-parallel-function-calling/" data-type="link" data-id="https://academy.finxter.com/1-6-openai-api-mastery-innovating-with-gpt-4-turbo-and-dall%c2%b7e-3-parallel-function-calling/">article</a> originally appeared on the Finxter Academy for premium members (including course lesson video). <a href="https://academy.finxter.com/university/openai-update/" data-type="link" data-id="https://academy.finxter.com/university/openai-update/">Check out the video course here</a>.</em></p>



<p>Let&#8217;s create a new folder and file in our base directory to get started.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_OPENAI_UPDATES (root project folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_Parallel_function_calling
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />function_descriptions.py</pre>



<h2 class="wp-block-heading">Function Descriptions</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088.webp" alt="" class="wp-image-1653292" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/d274d91a-957f-47f3-b10b-b89357da4088-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Open up <code>function_descriptions.py</code>. </p>



<p>Here we&#8217;ll describe the <a href="https://blog.finxter.com/openai-api-functions-embeddings-course-1-7-simple-function-request/" data-type="post" data-id="1651043">functions</a>. These objects are for ChatGPT to know what functions are available and what names they have. </p>



<p>Each object describes what its function does and what parameters it needs as input. Notice that this is all text.</p>



<p class="has-global-color-8-background-color has-background"><strong>The only purpose of these objects is for ChatGPT to know what functions are available, when it should use a particular function, and what arguments it needs to provide to call a specific function.</strong> As such, they are not the functions themselves, which we have separately, but merely a description of the functions. </p>



<p>Let&#8217;s get started:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">describe_get_current_weather = {
    "type": "function",
    "function": {</pre>



<p>Note the syntax has slightly changed from what function calling used to be. We now wrap the entire object inside a <code>"function"</code> key and also have a <code>"type": "function"</code> key-value pair on the outermost level.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">        "name": "get_current_weather",
        "description": "This function provides the current weather in a specific location.",</pre>



<p>The name we provide here is the name that ChatGPT will use when it wants to call this particular function.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">        "parameters": {</pre>



<p>The <code>"description"</code> we provided above tells ChatGPT what the purpose of this function is and when you want it to be called. Next comes the <code>"parameters"</code> key, which we just opened.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">            "type": "object",</pre>



<p>Inside <code>"parameters"</code>, you describe what parameters this function needs to be able to run.</p>



<p>The parameters as a whole are an object, and its properties contain a single <code>location</code> parameter of type string. We also provide a description of what this parameter should contain, namely the name of a city.</p>



<p>Note the <code>"required"</code> key, which is an array listing the names of the required parameters (you can specify multiple parameters here).</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location as a city name, e.g. Amsterdam.",
                },
            },
            "required": ["location"],
        },
    },
}</pre>



<p>So the whole description is:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">describe_get_current_weather = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "This function provides the current weather in a specific location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location as a city name, e.g. Amsterdam.",
                },
            },
            "required": ["location"],
        },
    },
}</pre>



<p>Now we have the second one, which is much the same:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">describe_get_weather_forecast = {
    "type": "function",
    "function": {
        "name": "get_weather_forecast",
        "description": "This function provides the weather forecast in a specific location for a specified number of days.",</pre>



<p>Here, we have multiple parameters. Note that only one of them is required.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location as a city name, e.g. Amsterdam.",
                },
                "days": {
                    "type": "integer",
                    "description": "The number of days to forecast, between 1 and 14.",
                },
            },
            "required": ["location"],
        },
    },
}</pre>



<p>The entire <code>function_descriptions.py</code> file now looks like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">describe_get_current_weather = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "This function provides the current weather in a specific location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location as a city name, e.g. Amsterdam.",
                },
            },
            "required": ["location"],
        },
    },
}


describe_get_weather_forecast = {
    "type": "function",
    "function": {
        "name": "get_weather_forecast",
        "description": "This function provides the weather forecast in a specific location for a specified number of days.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location as a city name, e.g. Amsterdam.",
                },
                "days": {
                    "type": "integer",
                    "description": "The number of days to forecast, between 1 and 14.",
                },
            },
            "required": ["location"],
        },
    },
}</pre>
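<p>To make the role of the <code>"required"</code> key a bit more concrete, here is a minimal sketch that checks a set of arguments against a description. It assumes the arguments arrive as a JSON string; the helper name <code>missing_required</code> is hypothetical and not part of the course code, and the description is a compact copy of the one above so the snippet runs on its own.</p>

```python
import json

# Compact copy of the forecast description from above (descriptions omitted).
describe_get_weather_forecast = {
    "type": "function",
    "function": {
        "name": "get_weather_forecast",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "days": {"type": "integer"},
            },
            "required": ["location"],
        },
    },
}


def missing_required(description: dict, arguments_json: str) -> list:
    """List the required parameter names that are absent from the JSON arguments."""
    required = description["function"]["parameters"].get("required", [])
    arguments = json.loads(arguments_json)
    return [name for name in required if name not in arguments]


# Only "location" is required, so leaving out "days" is fine.
print(missing_required(describe_get_weather_forecast, '{"location": "Amsterdam"}'))
# Leaving out "location" is reported as a problem.
print(missing_required(describe_get_weather_forecast, '{"days": 3}'))
```

<p>A check like this is optional, but it can give a clearer error than letting a Python function fail on a missing argument.</p>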



<h2 class="wp-block-heading">Prompt Setup</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097.webp" alt="" class="wp-image-1653293" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/4f57fb9c-c42f-4554-a086-2f2112f8c097-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Ok, go ahead and close that and create another file in the <code>'1_Parallel_function_calling'</code> folder called <code>'prompt_setup.py'</code>. This is where we&#8217;ll set up the prompt for ChatGPT to use.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_OPENAI_UPDATES (root project folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_Parallel_function_calling
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />function_descriptions.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />prompt_setup.py</pre>



<p>Inside, put the following variable:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">current_and_forecast_setup = "You are a regular ChatGPT chatbot, just like normal, however, you also have access to some functions that can be called if you need them. One will provide the current weather and one will provide the weather forecast. IF THE USER DOES NOT ASK A WEATHER RELATED QUESTION JUST ANSWER THEM AS NORMAL WITHOUT CALLING ANY FUNCTIONS."</pre>



<p>This is just a basic prompt setup telling the model it has functions available but also emphasizing that we don&#8217;t want to use them if they are not needed to answer the question. You can always play around with the specific wording and details of this prompt to see what works best for you.</p>



<p>We put it in a separate file to keep large string variables out of the main code and keep that code readable. In a larger project, the setup would likely be longer and have several versions.</p>
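<p>As a small sketch of where this string ends up, the setup text typically goes into the system message, followed by the user&#8217;s query. The helper name <code>build_messages</code> is hypothetical, and the setup string is shortened here so the snippet is self-contained.</p>

```python
# Shortened stand-in for the setup string defined in prompt_setup.py.
current_and_forecast_setup = (
    "You are a regular ChatGPT chatbot with access to two weather functions."
)


def build_messages(setup: str, query: str) -> list:
    """Put the setup text in the system message and the query in the user message."""
    return [
        {"role": "system", "content": setup},
        {"role": "user", "content": query},
    ]


messages = build_messages(current_and_forecast_setup, "What is the weather in Seoul?")
for message in messages:
    print(message["role"], "->", message["content"])
```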



<h2 class="wp-block-heading">Weather API</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99.webp" alt="" class="wp-image-1653294" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/462f4d49-2ca0-4d10-a58f-efdc0e6bfe99-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Now save and close that file as well. It&#8217;s time to create the actual functions that we&#8217;re going to be giving to ChatGPT to call. Create a new file in the same folder called <code>'weather.py'</code>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_OPENAI_UPDATES (root project folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_Parallel_function_calling
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />function_descriptions.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />prompt_setup.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />weather.py</pre>



<p>First, sign up for a free account on <em>weatherapi.com</em>. </p>



<p>They will give you the Pro plan free for 14 days, after which your account automatically switches back to the free plan. You don&#8217;t have to provide any payment or credit card information, so you can use this API for free without any hassle.</p>



<p>Now create a <code>'.env'</code> file in the base directory of your project:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />FINX_OPENAI_UPDATES (root project folder)
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c1.png" alt="📁" class="wp-smiley" style="height: 1em; max-height: 1em;" />1_Parallel_function_calling
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />function_descriptions.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />prompt_setup.py
        <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />weather.py
    <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4c4.png" alt="📄" class="wp-smiley" style="height: 1em; max-height: 1em;" />.env</pre>



<p>And inside this file put both your <code>weatherapi</code> API key and <a href="https://blog.finxter.com/can-i-use-openai-api-commercially-yes-but-know-this/" data-type="post" data-id="1652737">OpenAI API key</a> using the following syntax, making sure not to use quotes or spaces:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">CHATGPT_API_KEY=supersecretchatgptapikeygoeshere
WEATHER_API_KEY=yoursupersecretweatherapikeygoeshere</pre>



<p>Close and save that file so we can load our secret API keys from this file later. Now open a terminal and run the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install python-decouple</pre>



<h2 class="wp-block-heading">Writing the Functions</h2>



<p>The <code>python-decouple</code> library we just installed will allow us to load our API keys from the <code>.env</code> file we created. Now open up <code>weather.py</code> (which is still empty) and put the following code inside:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from decouple import config
from json import dumps
import requests</pre>



<p>The <code>config</code> function will allow us to easily read the content of our <code>.env</code> file, so we can load our API keys without hardcoding their values. The <code>json</code> module is part of Python&#8217;s standard library and provides functions for working with JSON data.</p>



<p>The <code>dumps</code> function is used to convert a Python object into a JSON string, which basically holds the same information but in a string format. This is useful as ChatGPT cannot take Python objects as input, but it can take strings.</p>



<p>Finally, <code>requests</code> is a third-party Python library for making <a href="https://blog.finxter.com/python-requests-library/" data-type="post" data-id="37796">HTTP requests</a> through a simple API (if it is not installed yet, run <code>pip install requests</code>). We&#8217;ll use it to send requests to the <em>weatherapi.com</em> API.</p>



<p>Now we define a simple function below:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def get_current_weather(location) -> str:
    if not location:
        return (
            "Please provide a location and call the get_current_weather_function again."
        )
    API_params = {
        "key": config("WEATHER_API_KEY"),
        "q": location,
        "aqi": "no",
        "alerts": "no",
    }</pre>



<p>We use the <code>config</code> function to load the API key from the <code>.env</code> file. Make sure the name matches exactly and that the <code>.env</code> file does not contain any spaces.</p>



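<p>The <code>q</code> parameter holds the location. We set <code>aqi</code> (air quality index) and <code>alerts</code> to <code>"no"</code> because we don&#8217;t need that data.</p>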



<p>Continue inside the function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    response: requests.models.Response = requests.get(
        "http://api.weatherapi.com/v1/current.json", params=API_params
    )
    str_response: str = dumps(response.json())
    return str_response</pre>



<p>We make a GET request, passing in our URL and parameters, and get back a response object, which contains the server&#8217;s response.</p>



<p>We then convert the response to a dictionary by calling its <code>.json()</code> method and convert this dictionary to a string using the <code>dumps</code> function we imported above.</p>
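<p>If the dictionary-to-string step feels abstract, this short sketch shows it in isolation. The <code>weather</code> dictionary is a hand-written sample shaped like a small part of the API response, not real data.</p>

```python
from json import dumps, loads

# Hand-written sample shaped like a small part of the API response.
weather = {"location": {"name": "Seoul"}, "current": {"temp_c": 1.0}}

as_text = dumps(weather)     # Python dict -> JSON string
round_trip = loads(as_text)  # JSON string -> Python dict again

print(type(as_text).__name__)
print(as_text)
print(round_trip == weather)
```

<p>The string holds the same information as the dictionary, which is why we can hand it to ChatGPT without losing anything.</p>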



<p>This is the whole function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def get_current_weather(location) -> str:
    if not location:
        return (
            "Please provide a location and call the get_current_weather_function again."
        )
    API_params = {
        "key": config("WEATHER_API_KEY"),
        "q": location,
        "aqi": "no",
        "alerts": "no",
    }
    response: requests.models.Response = requests.get(
        "http://api.weatherapi.com/v1/current.json", params=API_params
    )
    str_response: str = dumps(response.json())
    return str_response</pre>



<h2 class="wp-block-heading">Testing</h2>



<p>Give it a quick test run to make sure it&#8217;s working. Add the print statement below and run your file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(get_current_weather("Seoul"))</pre>



<p>You should see something like this in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{"location": {"name": "Seoul", "region": "", "country": "South Korea", "lat": 37.57, "lon": 127.0, "tz_id": "Asia/Seoul", "localtime_epoch": 1699705164, "localtime": "2023-11-11 21:19"}, "current": {"last_updated_epoch": 1699704900, "last_updated": "2023-11-11 21:15", "temp_c": 1.0, "temp_f": 33.8, "is_day": 0, "condition": {"text": "Clear", "icon": "//cdn.weatherapi.com/weather/64x64/night/113.png", "code": 1000}, "wind_mph": 6.9, "wind_kph": 11.2, "wind_degree": 330, "wind_dir": "NNW", "pressure_mb": 1029.0, "pressure_in": 30.39, "precip_mm": 0.0, "precip_in": 0.0, "humidity": 55, "cloud": 0, "feelslike_c": -3.1, "feelslike_f": 26.3, "vis_km": 10.0, "vis_miles": 6.0, "uv": 1.0, "gust_mph": 12.1, "gust_kph": 19.4}}</pre>



<p>Make sure you comment out the <code>print</code> statement so it won&#8217;t run every time we import this file in the future.</p>
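<p>Instead of commenting the line out, a common Python alternative is to put test calls under an <code>if __name__ == "__main__":</code> guard. The sketch below uses a hypothetical offline stub, <code>get_current_weather_stub</code>, in place of the real function so that it runs without an API key.</p>

```python
def get_current_weather_stub(location: str) -> str:
    """Offline placeholder standing in for the real get_current_weather."""
    return f'{{"location": {{"name": "{location}"}}}}'


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    print(get_current_weather_stub("Seoul"))
```

<p>With this pattern, the test call can stay in the file and will not fire when another file imports it.</p>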



<h2 class="wp-block-heading">Writing the Second Function</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f.webp" alt="" class="wp-image-1653295" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/0a708efb-1864-4a49-b6c6-7caa15456e2f-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Now we&#8217;ll create a second function to get the <a href="https://blog.finxter.com/how-i-built-a-weather-app-with-python-streamlit/" data-type="post" data-id="1413206">weather forecast</a>. </p>



<p>This one is a bit more complicated as we need to provide a number of days to forecast. We&#8217;ll also need to do some error handling to make sure the user provides a valid number of days.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def get_weather_forecast(location, days=7) -> str:
    try:
        days = 1 if days &lt; 1 else 14 if days > 14 else days
    except TypeError:
        days = 7</pre>



<p>We take a location and set a default of 7 days. </p>



<p>If the days variable is less than 1 we set it to 1, but if it&#8217;s more than 14 we set it to 14. If neither condition is true the user provided a valid value and we just use the input argument value. </p>



<p>Finally, if some weird type gets passed in we just default to 7 days.</p>
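<p>The same clamping can be written with <code>min</code> and <code>max</code>, which some readers find easier to scan than a chained conditional. This is only a sketch of an alternative, not the code used in the rest of this tutorial; the helper name <code>clamp_days</code> and its parameters are made up for illustration.</p>

```python
def clamp_days(days, default=7, low=1, high=14):
    """Limit days to the range low..high; fall back to default for odd types."""
    try:
        # min() lowers values above the range, max() lifts values below it.
        return max(low, min(high, days))
    except TypeError:
        # Raised when days cannot be compared with an int, e.g. None or "3".
        return default


if __name__ == "__main__":
    for value in (0, 3, 30, None, "3"):
        print(repr(value), "->", clamp_days(value))
```

<p>Like the original, this falls back to 7 days whenever the comparison itself raises a <code>TypeError</code>.</p>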



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    params = {
        "key": config("WEATHER_API_KEY"),
        "q": location,
        "days": days,
        "aqi": "no",
        "alerts": "no",
    }

    response: requests.models.Response = requests.get(
        "http://api.weatherapi.com/v1/forecast.json", params=params
    )</pre>



<p>The parameters are largely the same, except that we now pass a number of days. The only problem is that the API sends back a lot of data, including hourly data (24 entries per day), which is far more than we need, so we have to do some filtering:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    response: dict = response.json()
    filtered_response = {}
    filtered_response["location"] = response["location"]
    filtered_response["current"] = response["current"]
    filtered_response["forecast"] = [
        [day["date"], day["day"]] for day in response["forecast"]["forecastday"]
    ]
    return dumps(filtered_response)</pre>



<p>First convert the response to a dictionary. Keep the location and the current weather by copying them from the response to the empty dictionary named <code>filtered_response</code> we just created.</p>



<p>For the forecast, we only want the daily data, as the hourly data would bloat the response. The list comprehension extracts just the data we want and is based on the structure of the response from the API.</p>



<p>I don&#8217;t want to go too deep into it here, as this course is about OpenAI and not about list comprehensions, but basically we take the <code>date</code> and <code>day</code> entries from each day in the forecast and put them in a list.</p>



<p>Finally, we convert the <code>filtered_response</code> dictionary to a string and return it, without all the hourly data that the API sent to us.</p>
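<p>To see the filtering in isolation, here is a small sketch that runs the same list comprehension on a hand-written dictionary. The sample data is a heavily trimmed, made-up stand-in for a real response (a real one has many more fields), and <code>filter_forecast</code> is a hypothetical helper name.</p>

```python
import json

# Made-up, heavily trimmed stand-in for a forecast response.
sample_response = {
    "location": {"name": "Seoul"},
    "current": {"temp_c": 1.0},
    "forecast": {
        "forecastday": [
            {
                "date": "2023-11-11",
                "day": {"maxtemp_c": 9.0, "mintemp_c": 0.5},
                "hour": [{"time": "2023-11-11 00:00", "temp_c": 2.0}],
            },
            {
                "date": "2023-11-12",
                "day": {"maxtemp_c": 7.0, "mintemp_c": -1.0},
                "hour": [{"time": "2023-11-12 00:00", "temp_c": 0.0}],
            },
        ]
    },
}


def filter_forecast(response):
    """Keep location, current weather and per-day summaries; drop hourly data."""
    return {
        "location": response["location"],
        "current": response["current"],
        "forecast": [
            [day["date"], day["day"]]
            for day in response["forecast"]["forecastday"]
        ],
    }


filtered = filter_forecast(sample_response)
print(json.dumps(filtered, indent=2))
```

<p>Each forecast entry ends up as a two-item list holding the date and the daily summary, and the <code>hour</code> key is gone.</p>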



<p>The second function now looks like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def get_weather_forecast(location, days=7) -> str:
    try:
        days = 1 if days &lt; 1 else 14 if days > 14 else days
    except TypeError:
        days = 7

    params = {
        "key": config("WEATHER_API_KEY"),
        "q": location,
        "days": days,
        "aqi": "no",
        "alerts": "no",
    }

    response: requests.models.Response = requests.get(
        "http://api.weatherapi.com/v1/forecast.json", params=params
    )

    response: dict = response.json()
    filtered_response = {}
    filtered_response["location"] = response["location"]
    filtered_response["current"] = response["current"]
    filtered_response["forecast"] = [
        [day["date"], day["day"]] for day in response["forecast"]["forecastday"]
    ]
    return dumps(filtered_response)</pre>



<p>Give it a test run:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(get_weather_forecast("Seoul", days=3))</pre>



<p>And you should get a fairly large output in your terminal. Again, make sure you comment out the print statement so it won&#8217;t run every time we import this file in the future.</p>



<h2 class="wp-block-heading">Parallel Function Calling</h2>



<p>OK, go ahead and close your <code>weather.py</code> file; we&#8217;re done with it for now. Next, create a new file called <code>parallel_function_calling.py</code> in the same folder:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">📁FINX_OPENAI_UPDATES (root project folder)
    📁1_Parallel_function_calling
        📄function_descriptions.py
        📄prompt_setup.py
        📄weather.py
        📄parallel_function_calling.py
    📄.env</pre>



<p>Important! Before we get started make sure you run this in a terminal window:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install openai --upgrade</pre>



<p>This installs the latest version of the <code>openai</code> library so that your syntax matches mine. There are quite a few differences between the old and new versions, which we&#8217;ll go over in the coming parts.</p>



<figure class="wp-block-image size-large"><a href="https://blog.finxter.com/how-to-install-openai-in-python/"><img loading="lazy" decoding="async" width="1024" height="575" src="https://blog.finxter.com/wp-content/uploads/2023/11/image-1-3-1024x575.png" alt="" class="wp-image-1653287" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/image-1-3-1024x575.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/image-1-3-300x168.png 300w, https://blog.finxter.com/wp-content/uploads/2023/11/image-1-3-768x431.png 768w, https://blog.finxter.com/wp-content/uploads/2023/11/image-1-3.png 1364w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></a></figure>



<p class="has-base-2-background-color has-background">🔗 <strong>Recommended</strong>: <a href="https://blog.finxter.com/how-to-install-openai-in-python/">How to Install OpenAI in Python?</a></p>



<p>Open the <code>parallel_function_calling.py</code> file and let&#8217;s have some fun!</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import json
from decouple import config
from openai import OpenAI
from typing import Callable</pre>



<p>We import the built-in <code>json</code> module to work with JSON data, <code>config</code> to load our OpenAI API key from the <code>.env</code> file, and <code>OpenAI</code> to access the API. </p>



<p>Note that the syntax has changed: in the past we would just import <code>openai</code> itself, but in this new version of the library we import the <code>OpenAI</code> class instead. <code>Callable</code> from <code>typing</code> is only used for a type hint later on.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from weather import get_current_weather, get_weather_forecast
from prompt_setup import current_and_forecast_setup
from function_descriptions import (
    describe_get_current_weather,
    describe_get_weather_forecast,
)</pre>



<p>Here we just import our own stuff we prepared ahead of time.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">MODEL = "gpt-3.5-turbo-1106"
client = OpenAI(api_key=config("OPENAI_API_KEY"))</pre>



<p>Define the model up top, and then create a <code>client</code> by calling the <code>OpenAI</code> class we imported, passing in the <code>api_key</code> loaded from the <code>.env</code> file using <code>config</code>. </p>



<p>This is part of the new standard syntax: from here on, we interact with this client object to make calls to OpenAI&#8217;s various API endpoints.</p>



<h2 class="wp-block-heading">Utility Printer Function</h2>



<p>Now create a quick utility to print the output in a more readable manner:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def quick_dirty_printer(messages):
    """
    Prints messages in alternating colors (irrespective of role) and the final message in green. (92 is green, 93 is yellow, 94 is blue)
    """
    for index, message in enumerate(messages):
        if index == len(messages) - 1:
            print(f"\033[92m {message} \033[0m")
        elif index % 2 == 0:
            print(f"\033[93m {message} \033[0m")
        else:
            print(f"\033[94m {message} \033[0m")</pre>



<p>This function takes a list of messages and then loops over each index and message in the messages. If the index is the last one, it prints the message in green, otherwise, it prints it in alternating yellow and blue colors using the remainder operator to distinguish odd and even indexes. </p>



<p>This is just a quick and dirty way to make the output more readable.</p>



<p>The <code>\033[92m</code> part is an ANSI color code, which is a special character sequence that tells the terminal to change the color of the text. The <code>\033[0m</code> part resets the color back to the default.</p>
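<p>If you find yourself repeating these escape sequences, a tiny wrapper keeps them in one place. This is just a sketch; <code>colorize</code> and <code>RESET</code> are names I made up here and are not used elsewhere in this tutorial.</p>

```python
RESET = "\033[0m"  # switches the terminal back to its default style


def colorize(text, color_code):
    """Wrap text in an ANSI color sequence and reset the color afterwards."""
    return f"\033[{color_code}m{text}{RESET}"


if __name__ == "__main__":
    print(colorize("green", 92))
    print(colorize("yellow", 93))
    print(colorize("blue", 94))
```

<p>Keep in mind that not every terminal interprets ANSI codes; where they are unsupported, the raw escape characters are printed instead.</p>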



<h2 class="wp-block-heading">GPT Function</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975.webp" alt="" class="wp-image-1653296" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/2c4f809c-0a77-4a46-ad56-59784ec8f975-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Now let&#8217;s start on our GPT function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def ask_weather_gpt(query, message_history=None, simulate_failure=False):
    need_to_fail_once = simulate_failure
    messages = []</pre>



<p>We take a query as input and, optionally, a message history. If we want to call this function a second time with an already established history of the messages sent between ChatGPT and the user, we can pass that existing history in.</p>



<p>I&#8217;m also adding the <code>simulate_failure</code> parameter because we&#8217;re going to build in a fallback in case ChatGPT&#8217;s function call fails somehow. We want to be able to test this, and since a real failure is not that likely, a very simple <code>simulate_failure</code> switch lets us check that our fail-safes are working.</p>



<p>Then we have the variable <code>need_to_fail_once</code>, a boolean that simply copies the value of <code>simulate_failure</code>.</p>



<p>Then we create <code>messages</code>, a list that will hold all the messages. It might start with the system message, which tells ChatGPT it&#8217;s a helpful assistant that&#8217;s supposed to do this or that, followed by the user message with a query. Then comes the assistant message from ChatGPT giving us an answer. All of these messages get appended to this list.</p>



<p><strong>So, every single time ChatGPT sends us a response, we will append it to this message history list.</strong></p>



<p>As we&#8217;re going to be doing this several times over, and there&#8217;s also currently a small bug that we want to avoid, we&#8217;re going to create a small inner function that we can call every single time we want to append something to this list:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    def handle_and_append_response(response):
        """
        Appends message to history and extracts the message,
        prevents a current bug by explicitly setting .content and .function_call
        """
        response_message = response.choices[0].message
        if response_message.content is None:
            response_message.content = ""
        if response_message.function_call is None:
            del response_message.function_call
        messages.append(response_message)
        return response_message</pre>



<p>This may look a little confusing, so let&#8217;s go over it. When we make a call to ChatGPT and get a response back, we pass that response into this function.</p>



<p>This inner <code>handle_and_append_response</code> function appends the message to the history, that is, the <code>messages</code> list. </p>



<p>First, we extract the message from the response and save it as <code>response_message</code>. Then we prevent a current bug by explicitly setting <code>.content</code> and removing an empty <code>.function_call</code>.</p>



<p>So if the response message&#8217;s <code>.content</code> is <code>None</code>, which is the case when ChatGPT tries to call a function, we set it to an empty string.</p>



<p><strong>Now, why do we do this? </strong></p>



<p>There&#8217;s currently a bug that if you append a message to the messages history, and then you send this back to ChatGPT, it&#8217;s going to complain that there&#8217;s no <code>message.content</code>. So this is kind of a bug that we&#8217;re circumventing by making sure this key exists, even though it&#8217;s an empty string.</p>



<p>The same goes for the second check: if <code>response_message.function_call</code> is <code>None</code>, we get rid of that attribute entirely, just to make sure it doesn&#8217;t bug out on us. If you&#8217;re reading this in the future, this may have been fixed, so you can try removing these lines later on.</p>



<p>Then we&#8217;re just going to take the messages and append our response message with these small edits and also return the <code>response_message</code> from the function.</p>



<p>Now we&#8217;re outside of the inner function again:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def ask_weather_gpt(...)
    ...
    def handle_and_append_response(...)
        ...
        ...

    # continue down here outside the inner function
    if message_history:
        messages = message_history
    else:
        messages = [
            {"role": "system", "content": current_and_forecast_setup},
            {"role": "user", "content": query},
        ]</pre>



<p>If a <code>message_history</code> was passed in as an argument when calling the function, we use it as our <code>messages</code> list. </p>



<p>Otherwise, we&#8217;ll define a basic message history with a system message containing our prompt from <code>prompt_setup.py</code> and the user query in the second message.</p>
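<p>Pulled out on its own, the branching looks like the sketch below. The helper <code>build_messages</code> is hypothetical and only mirrors the logic above, with the system prompt passed in as a plain string.</p>

```python
def build_messages(query, system_prompt, message_history=None):
    """Return the existing history if given, else a fresh two-message list."""
    if message_history:
        # Note: the query is NOT added here; the history is used as-is.
        return message_history
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": query},
    ]


fresh = build_messages("What's the weather in Seoul?", "You are a chatbot.")
resumed = build_messages("ignored", "ignored", message_history=fresh)
print(len(fresh), resumed is fresh)
```

<p>One consequence worth knowing: when a history is passed in, the <code>query</code> argument is ignored, so a follow-up question has to be appended to the history before the call.</p>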



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    tools = [
        describe_get_current_weather,
        describe_get_weather_forecast,
    ]</pre>



<p>Now we create a list of <code>tools</code>. </p>



<p>Notice OpenAI has adopted the LangChain tool naming convention. This is a list of the descriptions of the functions from <code>function_descriptions.py</code>, and not the actual functions themselves!</p>
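<p>As a reminder of what such a description looks like, here is the general shape each entry in the <code>tools</code> list takes: a <code>type</code> of <code>"function"</code> plus a JSON Schema for the parameters. The wording of the descriptions below is illustrative only and may differ from what is in your <code>function_descriptions.py</code>.</p>

```python
# Illustrative tool description; the description texts are placeholders.
example_tool_description = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g. Seoul",
                },
            },
            "required": ["location"],
        },
    },
}

tools = [example_tool_description]
print(tools[0]["function"]["name"])
```

<p>The <code>name</code> here is what ChatGPT sends back when it wants the function called, so it has to match a key we can look up on our side.</p>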



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )</pre>



<p>Now we make a call to ChatGPT, passing in our model, messages, and tools. We set the tool choice to <code>'auto'</code> to let ChatGPT decide whether to call a function and which one(s). You can force a particular call by passing an object that names a specific function instead of <code>'auto'</code>.</p>
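<p>Forcing a specific function takes a small dictionary rather than a bare name. The sketch below builds that value; <code>force_tool</code> is a hypothetical convenience helper, not part of the <code>openai</code> library.</p>

```python
def force_tool(name):
    """Build a tool_choice value that requires the named function to be called."""
    return {"type": "function", "function": {"name": name}}


tool_choice = force_tool("get_current_weather")
print(tool_choice)
```

<p>You would pass the result as <code>tool_choice=tool_choice</code> in the <code>client.chat.completions.create</code> call in place of <code>'auto'</code>.</p>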



<p>Now we&#8217;re going to handle the response using our inner function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    response_message = handle_and_append_response(response)</pre>



<p>Which means it&#8217;s now also in our messages list.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">    while response_message.tool_calls:
        tool_calls = response_message.tool_calls
        available_functions = {
            "get_current_weather": get_current_weather,
            "get_weather_forecast": get_weather_forecast,
        }</pre>



<p>We open a while loop. As long as ChatGPT wants to call a function, the <code>response_message</code> will have a non-empty <code>.tool_calls</code> attribute, which is a list of the functions it wants to call (otherwise it is <code>None</code>). </p>



<p>So, while ChatGPT wants to call functions we will run this loop. We save this list as <code>tool_calls</code>. Then we define a <a href="https://blog.finxter.com/python-dictionary/" data-type="post" data-id="5232">simple dictionary</a> of available functions, mapping the function names we gave to ChatGPT to the actual functions we defined in <code>weather.py</code>.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">        try:
            if need_to_fail_once:
                need_to_fail_once = False
                raise Exception("Simulating failure")
            for call in tool_calls:
                func_name: str = call.function.name
                func_to_call: Callable = available_functions[func_name]
                func_args: dict = json.loads(call.function.arguments)
                func_response = func_to_call(**func_args)

                messages.append(
                    {
                        "tool_call_id": call.id,
                        "role": "tool",
                        "name": func_name,
                        "content": func_response,
                    }
                )</pre>



<p>From here on we work inside a try/except block. Remember that ChatGPT generates the function names and input arguments. If it makes any name or syntax mistakes our function call might blow up, which is why we catch any errors and handle them.</p>



<p>First, if the <code>need_to_fail_once</code> variable is set to true, we simulate failure by raising an exception. We also make sure to set the variable to false so we only raise an exception once. By raising an exception we force the except block to run so we can test out our fail-safe code.</p>



<p>Then we loop over each call in the <code>tool_calls</code> list. We extract the function name from the call, and then we get the actual function from our <code>available_functions</code> dictionary. </p>



<p>We also extract the function arguments from the call and convert them from a string to a dictionary using <code>json.loads</code>.</p>



<p>We then call the function, unpacking the arguments dictionary into keyword arguments with the <a href="https://blog.finxter.com/python-double-asterisk/" data-type="post" data-id="14514">double-asterisk <code>**</code> operator</a>. Finally, we append a message to our messages list, containing the <code>tool_call_id</code>, the role, the name of the function, and the response from the function. </p>



<p>We will feed this message history back to ChatGPT again later and the call id helps ChatGPT discern which answer is related to which function call, as multiple functions are being called in parallel here.</p>
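<p>The dispatch step can be tried without touching the API at all. The sketch below uses <code>SimpleNamespace</code> objects as stand-ins for the tool calls ChatGPT would send, and two fake weather functions in place of the real ones; all of these stand-ins and the <code>run_tool_calls</code> helper are assumptions made purely for illustration.</p>

```python
import json
from types import SimpleNamespace


def fake_current_weather(location):
    """Stand-in for get_current_weather; returns a JSON string."""
    return json.dumps({"location": location, "temp_c": 11.1})


def fake_forecast(location, days=7):
    """Stand-in for get_weather_forecast; returns a JSON string."""
    return json.dumps({"location": location, "days": days})


available_functions = {
    "get_current_weather": fake_current_weather,
    "get_weather_forecast": fake_forecast,
}


def make_call(call_id, name, arguments):
    """Mimic the attributes we read from a tool call: id, function.name, function.arguments."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=arguments),
    )


def run_tool_calls(tool_calls, functions):
    """Look up each requested function, call it, and build one tool message per call."""
    results = []
    for call in tool_calls:
        func = functions[call.function.name]
        kwargs = json.loads(call.function.arguments)  # JSON string to dict
        results.append(
            {
                "tool_call_id": call.id,
                "role": "tool",
                "name": call.function.name,
                "content": func(**kwargs),
            }
        )
    return results


tool_calls = [
    make_call("call_1", "get_current_weather", '{"location": "Seoul"}'),
    make_call("call_2", "get_weather_forecast", '{"location": "Seoul", "days": 3}'),
]
tool_messages = run_tool_calls(tool_calls, available_functions)
for message in tool_messages:
    print(message)
```

<p>Each result carries the id of the call it answers, which is what lets two parallel calls be matched to their outputs.</p>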



<p>Now we go to the except block:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">        except Exception:
            messages.pop()
            messages.append(
                {
                    "role": "system",
                    "content": "Based on the above information, please generate the appropriate tool calls with valid arguments per the schema provided.",
                }
            )
            return ask_weather_gpt(query, message_history=messages)</pre>



<p>If we get an exception, we pop the last message from the messages list, and then we append a system message telling ChatGPT to generate the appropriate tool calls with valid arguments per the schema provided. </p>



<p>Then we call the <code>ask_weather_gpt</code> function again, passing in the query and the message history, which now contains the system message we just appended.</p>



<p>Basically, what this comes down to is that ChatGPT generated faulty arguments, so we popped that generation off the message list and put in a system message reminding ChatGPT to generate correct arguments. Then we return out of this function by calling the function itself again, passing in our message history with the reminder.</p>



<p>This is by no means perfect error handling, but I just want to give you a starting point, an idea from which you can start to build your own error handling, without making this example too complex.</p>
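<p>One thing you may want to add when building on this is a cap on the number of retries, since the recursive call above has no upper bound if the failure keeps happening. The following is a generic sketch of that idea, not the approach used in this tutorial; <code>call_with_retries</code> and <code>flaky</code> are hypothetical names.</p>

```python
def call_with_retries(action, max_retries=2):
    """Run action(); on failure retry up to max_retries more times, then re-raise."""
    attempts = 0
    while True:
        try:
            return action()
        except Exception:
            attempts += 1
            if attempts > max_retries:
                raise  # give up and surface the last error


# Demo: an action that fails on its first call only.
state = {"calls": 0}


def flaky():
    state["calls"] += 1
    if state["calls"] == 1:
        raise ValueError("Simulating failure")
    return "ok"


result = call_with_retries(flaky)
print(result, state["calls"])
```

<p>In the weather example, the action would be the part that runs the tool calls, and the reminder message would be appended before each new attempt.</p>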



<p>Now outside the try/except block:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">        response = client.chat.completions.create(
            model=MODEL,
            messages=messages,
        )

        response_message = handle_and_append_response(response)

        quick_dirty_printer(messages)
        return response_message

    quick_dirty_printer(messages)
    return response_message</pre>



<p>We make a second request to ChatGPT passing in the message history which now contains all the responses from the functions we called. </p>



<p>We then handle the response using our inner function, which appends it to the message history, print the messages using our quick and dirty printer utility function, and return the response message. Because of this <code>return</code>, the body of the while loop runs at most once per call.</p>



<p>After that, we call the quick and dirty printer and return the response once more, but notice that these two lines are indented one level less, so they sit outside the while loop. </p>



<p>If there was no function call to begin with, because the user asked a question that didn&#8217;t require one, we bypass the whole while loop and directly print the messages and return the response message.</p>



<p>As this can be a bit confusing in snippets, here is the whole function once more:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def ask_weather_gpt(query, message_history=None, simulate_failure=False):
    need_to_fail_once = simulate_failure
    messages = []

    def handle_and_append_response(response):
        """
        Appends message to history and extracts the message,
        prevents a current bug by explicitly setting .content and .function_call
        """
        response_message = response.choices[0].message
        if response_message.content is None:
            response_message.content = ""
        if response_message.function_call is None:
            del response_message.function_call
        messages.append(response_message)
        return response_message

    if message_history:
        messages = message_history
    else:
        messages = [
            {"role": "system", "content": current_and_forecast_setup},
            {"role": "user", "content": query},
        ]

    tools = [
        describe_get_current_weather,
        describe_get_weather_forecast,
    ]

    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )

    response_message = handle_and_append_response(response)

    while response_message.tool_calls:
        tool_calls = response_message.tool_calls
        available_functions = {
            "get_current_weather": get_current_weather,
            "get_weather_forecast": get_weather_forecast,
        }

        try:
            if need_to_fail_once:
                need_to_fail_once = False
                raise Exception("Simulating failure")
            for call in tool_calls:
                func_name: str = call.function.name
                func_to_call: Callable = available_functions[func_name]
                func_args: dict = json.loads(call.function.arguments)
                func_response = func_to_call(**func_args)

                messages.append(
                    {
                        "tool_call_id": call.id,
                        "role": "tool",
                        "name": func_name,
                        "content": func_response,
                    }
                )

        except Exception:
            messages.pop()
            messages.append(
                {
                    "role": "system",
                    "content": "Based on the above information, please generate the appropriate tool calls with valid arguments per the schema provided.",
                }
            )
            return ask_weather_gpt(query, message_history=messages)

        response = client.chat.completions.create(
            model=MODEL,
            messages=messages,
        )

        response_message = handle_and_append_response(response)

        quick_dirty_printer(messages)
        return response_message

    quick_dirty_printer(messages)
    return response_message</pre>



<h2 class="wp-block-heading">Running a Single Function Call</h2>



<p>So let&#8217;s try it out! Add the following function call at the bottom and run your file:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ask_weather_gpt("What's the weather in San Francisco?", simulate_failure=False)</pre>



<p>You should see something like this in your terminal:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{'role': 'system', 'content': 'You are a regular ChatGPT chatbot, just like normal, however you also have access to some functions that can be called if you need them. One will provide the current weather and one will provide the weather forecast. IF THE USER DOES NOT ASK A WEATHER RELATED QUESTION JUST ANSWER THEM AS NORMAL WITHOUT CALLING ANY FUNCTIONS.'}

{'role': 'user', 'content': "What's the weather in San Francisco?"}

ChatCompletionMessage(content='', role='assistant', tool_calls=[ChatCompletionMessageToolCall(id='call_8oWdO9OoMqwXUp7QEE7kaaCX', function=Function(arguments='{"location":"San Francisco"}', name='get_current_weather'), type='function')])

{'tool_call_id': 'call_8oWdO9OoMqwXUp7QEE7kaaCX', 'role': 'tool', 'name': 'get_current_weather', 'content': '{"location": {"name": "San Francisco", "region": "California", "country": "United States of America", "lat": 37.78, "lon": -122.42, "tz_id": "America/Los_Angeles", "localtime_epoch": 1699771110, "localtime": "2023-11-11 22:38"}, "current": {"last_updated_epoch": 1699770600, "last_updated": "2023-11-11 22:30", "temp_c": 11.1, "temp_f": 52.0, "is_day": 0, "condition": {"text": "Clear", "icon": "//cdn.weatherapi.com/weather/64x64/night/113.png", "code": 1000}, "wind_mph": 2.2, "wind_kph": 3.6, "wind_degree": 10, "wind_dir": "N", "pressure_mb": 1019.0, "pressure_in": 30.1, "precip_mm": 0.0, "precip_in": 0.0, "humidity": 83, "cloud": 0, "feelslike_c": 11.5, "feelslike_f": 52.6, "vis_km": 16.0, "vis_miles": 9.0, "uv": 1.0, "gust_mph": 1.3, "gust_kph": 2.1}}'}

ChatCompletionMessage(content='The current weather in San Francisco is clear with a temperature of 52.0°F. The wind is blowing at 3.6 km/h from the north, and the humidity is at 83%.', role='assistant', tool_calls=None)</pre>



<p>First comes the system message we set up, followed by the user query. ChatGPT then responds with a request to call a function, passing us the function name and arguments along with a unique ID for the call. Next is the tool result, which carries the same ID so ChatGPT can match it to the request. Finally, ChatGPT turns that result into a readable answer!</p>
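
<p>To make the ID linking concrete, here is a minimal sketch that checks whether every requested tool call has a matching tool message. The helper name <code>unanswered_call_ids</code> is hypothetical and not part of the course code, and the messages are plain dicts rather than real OpenAI client objects, so treat it as an illustration only:</p>

```python
# Hypothetical helper, not from the course code: it only illustrates how
# the 'tool_call_id' of a tool message points back at the requested call.
def unanswered_call_ids(tool_calls, messages):
    """Return ids of requested tool calls that have no matching 'tool' message."""
    answered = {m["tool_call_id"] for m in messages if m.get("role") == "tool"}
    return [call["id"] for call in tool_calls if call["id"] not in answered]


# The tool call from the output above, written as a plain dict (assumption:
# the real object exposes the same fields as attributes).
requested = [
    {
        "id": "call_8oWdO9OoMqwXUp7QEE7kaaCX",
        "function": {
            "name": "get_current_weather",
            "arguments": '{"location":"San Francisco"}',
        },
    }
]

history = [
    {"role": "user", "content": "What's the weather in San Francisco?"},
]
missing_before = unanswered_call_ids(requested, history)

# Once the tool result is appended with the same id, nothing is missing.
history.append(
    {
        "tool_call_id": "call_8oWdO9OoMqwXUp7QEE7kaaCX",
        "role": "tool",
        "name": "get_current_weather",
        "content": "{...}",
    }
)
missing_after = unanswered_call_ids(requested, history)

print(missing_before)
print(missing_after)
```

<p>A check like this can be handy while debugging, since a tool result whose ID does not match any requested call cannot be linked back to it.</p>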



<h2 class="wp-block-heading">Testing the Failsafe</h2>


<div class="wp-block-image">
<figure data-wp-context="{&quot;imageId&quot;:&quot;69e5c37ae7915&quot;}" data-wp-interactive="core/image" data-wp-key="69e5c37ae7915" class="aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="1024" height="1024" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.finxter.com/wp-content/uploads/2023/11/f6f7ecae-b4a8-4379-8a1a-56bb5241e709.webp" alt="" class="wp-image-1653297" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/f6f7ecae-b4a8-4379-8a1a-56bb5241e709.webp 1024w, https://blog.finxter.com/wp-content/uploads/2023/11/f6f7ecae-b4a8-4379-8a1a-56bb5241e709-300x300.webp 300w, https://blog.finxter.com/wp-content/uploads/2023/11/f6f7ecae-b4a8-4379-8a1a-56bb5241e709-150x150.webp 150w, https://blog.finxter.com/wp-content/uploads/2023/11/f6f7ecae-b4a8-4379-8a1a-56bb5241e709-768x768.webp 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Before we get into multiple function calls, let&#8217;s quickly test our failsafe. Change the function call to this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ask_weather_gpt("What's the weather in San Francisco?", simulate_failure=True)</pre>



<p>Your output should look the same as above, but with one extra entry between the user query and the ChatGPT message:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{'role': 'system', 'content': 'setup...'}

{'role': 'user', 'content': "What's the weather in San Francisco?"}

{'role': 'system', 'content': 'Based on the above information, please generate the appropriate tool calls with valid arguments per the schema provided.'}

ChatCompletionMessage(content='', role='assistant', tool_calls=[....])

{'tool_call_id': 'call_QrLTbUe3RqPYyfbXUkyIVqZe', 'role': 'tool', 'name': 'get_current_weather', 'content': '....'}</pre>



<p>Exactly as expected: the simulated error triggered, and the original messages were sent back to ChatGPT with a third one appended, reminding the model to generate appropriate tool calls with valid arguments. The new call then went through as if nothing had happened.</p>
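
<p>The retry pattern described above can be sketched roughly as follows. This is not the course&#8217;s actual implementation: <code>call_with_failsafe</code>, <code>fake_request</code>, the use of <code>ValueError</code>, and the retry limit are all assumptions made for illustration, and the API request is replaced by a stub that fails once so the example runs offline:</p>

```python
# The reminder text is copied from the output above.
REMINDER = {
    "role": "system",
    "content": (
        "Based on the above information, please generate the appropriate "
        "tool calls with valid arguments per the schema provided."
    ),
}


def call_with_failsafe(messages, request_fn, max_retries=2):
    """Call request_fn(messages); on failure, append the reminder and retry."""
    for attempt in range(max_retries + 1):
        try:
            return request_fn(messages)
        except ValueError:  # assumption: bad tool calls surface as ValueError
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            messages.append(REMINDER)


# Hypothetical stand-in for the API request: fails once, then succeeds.
attempts = []


def fake_request(messages):
    attempts.append(len(messages))  # record how many messages were sent
    if len(attempts) == 1:
        raise ValueError("simulated invalid tool call")
    return {"role": "assistant", "content": "", "tool_calls": ["..."]}


history = [
    {"role": "system", "content": "setup..."},
    {"role": "user", "content": "What's the weather in San Francisco?"},
]
response = call_with_failsafe(history, fake_request)

print(attempts)
print(history[2]["content"])
```

<p>Capping the number of retries keeps a persistently failing request from looping forever; the right limit depends on your use case.</p>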



<h2 class="wp-block-heading">Running Parallel Function Calls</h2>



<p>Comment out the above call and let&#8217;s try parallel function calls now:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ask_weather_gpt(
    "Please give me the current weather in Seoul and the weather forecast in Amsterdam for the coming three days."
)</pre>



<p>You can see ChatGPT request two function calls in a single response. We then call both functions and return the results to ChatGPT, which gives us the final answer:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{'role': 'system', 'content': 'setup....'}

{'role': 'user', 'content': 'Please give me the current weather in Seoul and the weather forecast in Amsterdam for the coming three days.'}

ChatCompletionMessage(content='', role='assistant', tool_calls=[ChatCompletionMessageToolCall(id='call_QKpzrTXdoh2Carn0bvyhott5', function=Function(arguments='{"location": "Seoul"}', name='get_current_weather'), type='function'), ChatCompletionMessageToolCall(id='call_n7BmrrjgnKEAROSZHaOoWxLf', function=Function(arguments='{"location": "Amsterdam", "days": 3}', name='get_weather_forecast'), type='function')])

{'tool_call_id': 'call_QKpzrTXdoh2Carn0bvyhott5', 'role': 'tool', 'name': 'get_current_weather', 'content': '{......}'}

{'tool_call_id': 'call_n7BmrrjgnKEAROSZHaOoWxLf', 'role': 'tool', 'name': 'get_weather_forecast', 'content': '{......}'}

ChatCompletionMessage(content='The current weather in Seoul is sunny with a temperature of 5°C (41°F). The wind is blowing from the north at 11.2 km/h.\n\nIn Amsterdam, the current weather is foggy with a temperature of 5°C (41°F). Over the next three days, expect patchy rain with a high of 8.8°C (47.8°F) and a low of 6.0°C (42.8°F) tomorrow, followed by moderate rain with a high of 14.6°C (58.3°F) and a low of 5.9°C (42.6°F) the day after, and more moderate rain with a high of 11.9°C (53.4°F) and a low of 10.1°C (50.2°F) on the third day.', role='assistant', tool_calls=None)</pre>



<p>Perfect! ChatGPT can now request multiple tool calls in one response, so we no longer need a separate round trip to ChatGPT for each function, which speeds up the process.</p>
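
<p>As a rough sketch of how several tool calls from one response can be handled, the loop below looks up each requested function by name, runs it, and builds one tool message per call. The two weather functions are hypothetical stand-ins that return canned data, the <code>AVAILABLE_FUNCTIONS</code> lookup table is an assumption, and the tool calls are plain dicts instead of real client objects:</p>

```python
import json


# Hypothetical stand-ins for the real weather functions (no network access).
def get_current_weather(location):
    return {"location": location, "kind": "current"}


def get_weather_forecast(location, days):
    return {"location": location, "kind": "forecast", "days": days}


# Assumed lookup table mapping the names ChatGPT sends back to callables.
AVAILABLE_FUNCTIONS = {
    "get_current_weather": get_current_weather,
    "get_weather_forecast": get_weather_forecast,
}

# The two tool calls from the output above, written as plain dicts.
tool_calls = [
    {
        "id": "call_QKpzrTXdoh2Carn0bvyhott5",
        "function": {
            "name": "get_current_weather",
            "arguments": '{"location": "Seoul"}',
        },
    },
    {
        "id": "call_n7BmrrjgnKEAROSZHaOoWxLf",
        "function": {
            "name": "get_weather_forecast",
            "arguments": '{"location": "Amsterdam", "days": 3}',
        },
    },
]


def run_tool_calls(calls):
    """Execute every requested call and return one 'tool' message per call."""
    messages = []
    for call in calls:
        name = call["function"]["name"]
        arguments = json.loads(call["function"]["arguments"])
        result = AVAILABLE_FUNCTIONS[name](**arguments)
        messages.append(
            {
                "tool_call_id": call["id"],  # echo the id of the request
                "role": "tool",
                "name": name,
                "content": json.dumps(result),
            }
        )
    return messages


tool_messages = run_tool_calls(tool_calls)
for message in tool_messages:
    print(message)
```

<p>All resulting tool messages are then appended to the conversation and sent back in a single follow-up request.</p>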



<h2 class="wp-block-heading">Asking a Simple Question</h2>



<p>Finally, ask a normal question to make sure ChatGPT still answers without calling functions when they are not needed:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ask_weather_gpt("What is a zombie watermelon?")

{'role': 'system', 'content': 'You are a regular ChatGPT chatbot, just like normal, however you also have access to some functions that can be called if you need them. One will provide the current weather and one will provide the weather forecast. IF THE USER DOES NOT ASK A WEATHER RELATED QUESTION JUST ANSWER THEM AS NORMAL WITHOUT CALLING ANY FUNCTIONS.'}

{'role': 'user', 'content': 'What is a zombie watermelon?'}

ChatCompletionMessage(content='A "zombie watermelon" is a term that\'s used for a watermelon that has been left in the field for an extended period, causing it to turn mushy and ooze behind its rind after being picked. This causes the inside to rot while the exterior remains vibrant and green, hence the name "zombie watermelon." It\'s not an official term but more of a colloquial description.', role='assistant', tool_calls=None)</pre>



<p>Yep, ChatGPT goes straight into the answer. Now that you are up to date with parallel function calls and the new syntax, let&#8217;s look at the new JSON mode and seeds in the next part. See you there soon!</p>
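
<p>The difference between the two kinds of responses comes down to whether <code>tool_calls</code> is set. Here is a minimal sketch of that check, using <code>SimpleNamespace</code> objects as stand-ins for the real message objects; the <code>handle_response</code> helper is hypothetical and not taken from the course code:</p>

```python
from types import SimpleNamespace


def handle_response(message):
    """Return the text answer, or None if tool calls must be run first."""
    if message.tool_calls:
        return None  # the caller should execute the requested tools
    return message.content


# Stand-ins mimicking the attributes seen in the printed messages above.
direct = SimpleNamespace(
    role="assistant", content="A colloquial term...", tool_calls=None
)
with_tools = SimpleNamespace(
    role="assistant", content="", tool_calls=["placeholder call"]
)

print(handle_response(direct))
print(handle_response(with_tools))
```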


<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://academy.finxter.com/university/openai-update/"><img loading="lazy" decoding="async" width="805" height="530" src="https://blog.finxter.com/wp-content/uploads/2023/11/image-130.png" alt="" class="wp-image-1653284" srcset="https://blog.finxter.com/wp-content/uploads/2023/11/image-130.png 805w, https://blog.finxter.com/wp-content/uploads/2023/11/image-130-300x198.png 300w, https://blog.finxter.com/wp-content/uploads/2023/11/image-130-768x506.png 768w" sizes="auto, (max-width: 805px) 100vw, 805px" /></a></figure>
</div>


<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Full Course</strong>: <a href="https://academy.finxter.com/university/openai-update/" data-type="link" data-id="https://academy.finxter.com/university/openai-update/">OpenAI API Mastery: Innovating with GPT-4 Turbo, Text-to-Speech (TTS), and DALL·E 3</a></p>
<p>The post <a href="https://blog.finxter.com/openai-parallel-function-calling-weathergpt-example/">OpenAI Parallel Function Calling (WeatherGPT Example)</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
