How I Created an Audiobook App with Streamlit

Welcome to another project tutorial using the Streamlit library. It’s been a while since I created a project with this framework. The previous tutorial was on creating a weather app, the third in a series on developing a single application with three Python frameworks.

In this tutorial, we will build something really interesting: an audiobook app.

Many of us enjoy reading books, especially novels and motivational books. However, with a busy schedule, you may find it difficult to set aside time, relax on a sofa, and read that interesting book you own or borrowed from a friend.

All hope is not lost. You can still read on the go with an audiobook. Many people enjoy listening to audiobooks while going about their normal activities for the day.

For Python developers, this is a nice project to create and add to a portfolio. That is exactly what we are going to do in this project tutorial.

We will create a Streamlit application where users can upload a book as a PDF (or as a TXT, DOCX, or EPUB file) and have it converted to an audiobook. The user can then listen to the audio while attending to other tasks or download it to listen to later.

👉 Try it yourself in this interactive Streamlit app I created for you

Creating the Application

I assume you have already created a virtual environment for this project. Let’s get started. The code spans just a few lines, so it will be easy to understand, and I will explain everything step by step.

Step 1: Import the required modules

import streamlit as st
from gtts import gTTS
import pdfplumber
import docx
import ebooklib
from ebooklib import epub

Streamlit is used to build the web application, gTTS (Google’s Text-to-Speech) is used to convert text to speech, pdfplumber is used to extract text from PDF files, docx (installed as the python-docx package) is used to extract text from DOCX files, and ebooklib with its epub module is used to extract text from EPUB files.

Step 2: Create a file uploader widget

st.title("Streamlit AudioBook App")
st.subheader('Convert your E-book to audiobook')
book = st.file_uploader('Please upload your file', type=['pdf', 'txt', 'docx', 'epub'])

After setting the title and subtitle of the Streamlit application, we create a file uploader widget that allows the user to upload a file. The type parameter specifies the allowed file types, which in this case are PDF, TXT, DOCX, and EPUB, so you are not limited to PDF.

Step 3: Create a function to handle file extraction based on the extension

def extract_text_from_docx(file):
    doc = docx.Document(file)
    full_text = []
    # Collect the text of every paragraph in the document
    for para in doc.paragraphs:
        full_text.append(para.text)
    return '\n'.join(full_text)

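def extract_text_from_epub(file):
    book = epub.read_epub(file)
    chapters = []
    for item in book.get_items():
        # Keep only document items (the main content of the book)
        if item.get_type() == ebooklib.ITEM_DOCUMENT:
            # get_content() returns bytes, so decode before joining
            chapters.append(item.get_content().decode('utf-8'))
    return '\n'.join(chapters)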

We define two functions, extract_text_from_docx and extract_text_from_epub, which are used to extract text from DOCX and EPUB files, respectively.

The extract_text_from_docx function uses the docx library to read the contents of a DOCX file and extract the text from its paragraphs.

The extract_text_from_epub function uses ebooklib and its epub module to read the contents of an EPUB file and extract the text from its chapters.

An EPUB file is made up of multiple items, each of which can have a different type. Some common item types include ITEM_DOCUMENT, which represents the main content of the book, ITEM_IMAGE, which represents images, and ITEM_STYLE, which represents style sheets.

The condition is used to check if the current item being processed is a document item. If it is, the content of the item is extracted and appended to the chapters list. This ensures that only the main content of the book is included in the final text, and other items such as images and style sheets are ignored.
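One thing to keep in mind is that the content of a document item is XHTML markup, not plain text, so tags would end up in the text that is read aloud. Below is a minimal sketch of stripping the markup using only the standard library. The html_to_text helper and the _TextOnly class are hypothetical names that are not part of the original app.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and ignores tags, scripts, and styles."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script>/<style>
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(markup):
    """Hypothetical helper: return the readable text of an XHTML chapter."""
    if isinstance(markup, bytes):
        markup = markup.decode("utf-8", errors="ignore")
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    # Join the text pieces with single spaces
    return " ".join(parser.parts)
```

Inside extract_text_from_epub, each chapter could be passed through html_to_text before it is appended to the chapters list.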

Step 4: Check if the file has been uploaded and extract it

if book:
    if book.type == 'application/pdf':
        all_text = ""
        # Open the uploaded PDF and collect the text of every page
        with pdfplumber.open(book) as pdf:
            for page in pdf.pages:
                single_page_text = page.extract_text()
                all_text = all_text + '\n' + str(single_page_text)

Once the book is uploaded, the app checks its MIME type. If the uploaded file is a PDF, the app uses the pdfplumber library to open it and extract its text.

The text from each page of the PDF is concatenated into a single string, separated by newline characters.
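Pages that contain only images (a scanned page, for example) may yield no text at all. Depending on the pdfplumber version, that can show up as None or as an empty string, and wrapping None in str() would put the literal word "None" into the audiobook. The following is a small sketch of joining page texts while skipping such pages. The join_page_texts helper is a hypothetical name, not part of the original code.

```python
def join_page_texts(page_texts):
    """Hypothetical helper: join per-page text, skipping empty pages.

    page_texts is any iterable of strings or None values, for example
    the results of calling extract_text() on each page.
    """
    kept = [text.strip() for text in page_texts if text and text.strip()]
    return "\n".join(kept)


example = join_page_texts(["Page one", None, "", "  Page two  "])
```

In the app, this could be called as join_page_texts(page.extract_text() for page in pdf.pages) inside the with block.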

    elif book.type == 'text/plain':
        all_text = book.read().decode("utf-8")
    elif book.type == 'application/vnd.openxmlformats-officedocument.wordprocessingml.document':
        all_text = extract_text_from_docx(book)
    elif book.type == 'application/epub+zip':
        all_text = extract_text_from_epub(book)

If the uploaded file is a TXT file, the code reads its contents directly and decodes them as UTF-8 text. If the uploaded file is a DOCX file, the code calls the extract_text_from_docx function to extract its text. If the uploaded file is an EPUB, it calls the extract_text_from_epub function instead.

MIME types are used to identify the format of a file or data, and are commonly used in email attachments, HTTP headers, and other contexts where it’s important to specify the type of data being transmitted. application/vnd.openxmlformats-officedocument.wordprocessingml.document is a MIME type that represents a Microsoft Word document in the Office Open XML (OOXML) format. 
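As an alternative to a chain of if and elif checks, the MIME type can be used as a dictionary key that selects the right extraction function. The sketch below illustrates the idea with the standard library only. The names build_extractors, extract_text, and FakeUpload are hypothetical, and FakeUpload merely stands in for Streamlit's uploaded file object (a file-like object with a type attribute).

```python
import io

DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def read_plain_text(file):
    """Read a TXT upload and decode it as UTF-8."""
    return file.read().decode("utf-8")


def build_extractors(pdf_reader, docx_reader, epub_reader):
    """Map each supported MIME type to the function that extracts its text."""
    return {
        "application/pdf": pdf_reader,
        "text/plain": read_plain_text,
        DOCX_MIME: docx_reader,
        "application/epub+zip": epub_reader,
    }


def extract_text(uploaded, extractors):
    """Pick the extractor that matches the upload's MIME type and run it."""
    try:
        extractor = extractors[uploaded.type]
    except KeyError:
        raise ValueError(f"Unsupported file type: {uploaded.type}") from None
    return extractor(uploaded)


class FakeUpload(io.BytesIO):
    """Stand-in for an uploaded file, used only for this demonstration."""

    def __init__(self, data, mime_type):
        super().__init__(data)
        self.type = mime_type


# The three readers are placeholders here; the app would pass its own functions.
extractors = build_extractors(
    pdf_reader=lambda f: "pdf text",
    docx_reader=lambda f: "docx text",
    epub_reader=lambda f: "epub text",
)
text = extract_text(FakeUpload(b"Call me Ishmael.", "text/plain"), extractors)
```

Supporting a new format then means adding one dictionary entry rather than another elif branch.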

Step 5: Convert the extracted text to speech

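tts = gTTS(all_text)
tts.save('audiobook.mp3')

Using the gTTS library, we convert the extracted text into speech. The gTTS class takes the text as input and returns a gTTS object. The save method of this object is then called to save the generated speech as an MP3 file named audiobook.mp3. Like the code in the remaining steps, these lines belong inside the if book: block from Step 4, so that they run only after a file has been uploaded.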

Once the code is run, you should see an audiobook.mp3 file in your project folder.
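Extracted text often contains long runs of spaces and line breaks, and a file with no extractable text (a scanned PDF, for instance) leaves nothing to read aloud. The following is a minimal sketch of tidying the text and failing early with a clear message before it is handed to gTTS. The prepare_text helper is a hypothetical name, not part of the original app.

```python
import re


def prepare_text(raw_text):
    """Hypothetical helper: collapse whitespace and reject empty text."""
    # Replace every run of spaces, tabs, and newlines with a single space
    cleaned = re.sub(r"\s+", " ", raw_text).strip()
    if not cleaned:
        raise ValueError("No readable text was extracted from the file.")
    return cleaned
```

In the app, the result of prepare_text(all_text) would be passed to gTTS, and the ValueError could be caught and shown to the user with st.error.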

Step 6: Open and read the content of the file

with open('audiobook.mp3', 'rb') as audio_file:
    audio_bytes = audio_file.read()

The audiobook.mp3 file is opened in binary mode and its contents are read into a variable named audio_bytes. This variable contains the binary data of the MP3 file.
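Saving to a fixed file name works well locally, but in a deployed app several users would write to the same audiobook.mp3. One possible alternative, sketched below, is to keep the audio in memory. It relies on the write_to_fp method of gTTS objects, which writes the MP3 data to any file-like object. The synthesize_to_bytes helper and its tts_factory parameter are hypothetical names introduced for this illustration.

```python
import io


def synthesize_to_bytes(text, tts_factory):
    """Hypothetical helper: return the generated audio as bytes, with no file on disk.

    tts_factory is any callable that takes the text and returns an object
    with a write_to_fp(file_object) method, such as the gTTS class.
    """
    buffer = io.BytesIO()
    tts_factory(text).write_to_fp(buffer)
    return buffer.getvalue()
```

In the app, audio_bytes = synthesize_to_bytes(all_text, gTTS) would replace Steps 5 and 6, and the resulting bytes could be passed to the audio player as before.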

Step 7: Create an audio player widget

st.audio(audio_bytes, format='audio/mpeg', start_time=0)

We use Streamlit’s audio function to create an audio player widget that allows the user to play back the generated speech. The audio_bytes variable is passed as the first argument to specify the audio data to be played.

The format parameter is set to 'audio/mpeg', the MIME type for MP3 data, to specify the format of the audio, and the start_time parameter is set to 0 to start playback from the beginning of the audio.


Congratulations! You just created an audiobook app in just seven steps. Check the full code on GitHub. The three dots on the audio widget give you the option to download the audio file if you want to listen later. Please note that the conversion may take some time depending on your internet connection and the size of the file.

Now, all you need to do is to deploy the app on Streamlit Cloud. Then, upload your favorite book and enjoy listening on the go.