Image to PDF Converter and PDF Merger | Python

Project Description

In my university days, I often came across scenarios where I needed to convert image files to PDF files and then merge all the PDF files together to submit my assignments. Now, you will find tons of online resources to convert images to PDFs and also merge PDFs. But the big question is – “Are they all safe?”

That is why I decided to take things into my hands and create a script that would not only convert my image files to PDFs but also merge those PDFs together. That is exactly what I will be demonstrating in this project.

So we will be performing a couple of tasks in this project –

  • Convert all the images to PDF files.
  • Merge all the converted PDF files into a single PDF file.

So, without further delay, let us dive into our project.

Step 1: Install and Import the Necessary Libraries

We need to install a couple of libraries to complete our task. The first is Pillow, the actively maintained fork of PIL (Python Imaging Library) and Python’s de facto image processing package. It is installed under the name pillow but imported as PIL. To install it, open your terminal and type the following command:

pip install pillow

The next library you need is PyPDF2, a free and open-source pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files, and retrieve text and metadata from them. This project uses its PdfMerger class, which older releases of the library called PdfFileMerger. To install it, open your terminal and type the following command:

pip install PyPDF2

Once you have installed the necessary libraries, go ahead and import them into your script. You will also need the os module to list the files in each folder and build their paths.

Code:

import os
from PIL import Image
from PyPDF2 import PdfMerger

Step 2: Fetch the Path of the Source and Destination Directories

You need to fetch the path of the source folder where you have stored the images and also the path of the destination folder where you will save the PDF files.

Code:

img_dir = './image_files'
pdf_dir = './pdf_files'

In my case, I have created two directories named ‘image_files’ and ‘pdf_files’ within my project folder and stored their paths in two variables that I will use later in the code.
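If you would rather not create the destination folder by hand, the setup can be done from the script itself. The following is a minimal sketch, assuming a hypothetical helper named prepare_dirs that is not part of the original script. It stops early when the image folder is missing and creates the PDF folder when needed.

```python
import os


def prepare_dirs(img_dir, pdf_dir):
    """Check the source folder and create the destination folder.

    prepare_dirs is a hypothetical helper added for illustration.
    """
    # Without a source folder there is nothing to convert.
    if not os.path.isdir(img_dir):
        raise FileNotFoundError("Image folder not found: " + img_dir)
    # exist_ok=True makes this safe to call when the folder already exists.
    os.makedirs(pdf_dir, exist_ok=True)
```

Calling prepare_dirs(img_dir, pdf_dir) once before the conversion step should be enough.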

Step 3: Converting Image to PDF

We are all set to create the function that converts each image into a PDF. The idea is to walk through the image folder with os.listdir and pick out the image files. The os.path.splitext function splits each file name into its base name and extension, and the extension is compared in lower case so that files such as photo.JPG are also picked up. Each matching file is opened with the Image module of the PIL package.

Next, convert the image to the RGB color mode using the convert method. Pillow cannot write images with an alpha channel (such as RGBA PNGs) to PDF, so this step avoids an error for transparent images. You can then save the converted image to the destination folder with the save method. Pillow infers the PDF format from the .pdf extension, so the output name is built as '{0}.pdf'.format(name). That’s it. This converts every image in the images folder to an individual PDF file.

Code:

def img_to_pdf_converter():
    for file in os.listdir(img_dir):
        # Split e.g. "photo.v2.JPG" into ("photo.v2", ".JPG")
        name, ext = os.path.splitext(file)
        if ext.lower() in ('.png', '.jpg', '.jpeg'):
            image = Image.open(os.path.join(img_dir, file))
            # Drop any alpha channel, which cannot be saved as PDF
            converted_image = image.convert('RGB')
            converted_image.save(os.path.join(pdf_dir, '{0}.pdf'.format(name)))

    print("PDF Created!")
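File names with more than one dot, or with an upper-case extension, deserve some care when the PDF name is derived from the image name. The following is a minimal sketch, assuming a hypothetical helper named pdf_name_for that is not part of the original script. It relies on os.path.splitext from the standard library, which only splits off the final extension.

```python
import os

# Extensions treated as images in this sketch (an assumption, adjust as needed).
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg')


def pdf_name_for(filename):
    """Return the PDF file name for an image file name, or None if it is not an image.

    pdf_name_for is a hypothetical helper added for illustration.
    """
    name, ext = os.path.splitext(filename)
    # Compare in lower case so "scan.JPG" is accepted as well.
    if ext.lower() not in IMAGE_EXTENSIONS:
        return None
    return name + '.pdf'


print(pdf_name_for('scan.01.JPG'))
print(pdf_name_for('notes.txt'))
```

Here 'scan.01.JPG' keeps its full base name 'scan.01', whereas splitting on every dot and taking the second-to-last piece would keep only '01'.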

Step 4: Merge the PDFs

Once you have all the PDF versions of the image files, you can merge them using the PyPDF2 library. Go ahead and create an empty list that will store the paths of all the PDFs that were created from the image files. Then create an instance of the PdfMerger class provided by the PyPDF2 module.

Then walk through the PDF folder, collect every file that ends in .pdf, and add each one to the merger with the append method inside a for loop. Because os.listdir returns names in arbitrary order, the names are sorted first so that the page order of the merged file is predictable. The result is written to merged_pdf.pdf in the current working directory. As simple as that!

Code:

def merger():
    pdfs = []
    merge = PdfMerger()
    # Sort the names so the merged pages come out in a predictable order
    for file in sorted(os.listdir(pdf_dir)):
        # Skip anything in the folder that is not a PDF
        if file.lower().endswith('.pdf'):
            pdfs.append(os.path.join(pdf_dir, file))

    for pdf in pdfs:
        merge.append(pdf)

    merge.write('merged_pdf.pdf')
    merge.close()
    print("PDFs Merged!")
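The order in which files are appended decides the page order of the merged document, and plain alphabetical order places page10.pdf before page2.pdf. The following is a minimal sketch of a natural sort, assuming numbered file names such as page1.pdf (hypothetical names) and a hypothetical helper named natural_key that is not part of the original script.

```python
import re


def natural_key(name):
    """Sort key that compares runs of digits as numbers.

    natural_key is a hypothetical helper added for illustration.
    """
    # re.split with a capturing group keeps the digit runs:
    # "page10.pdf" -> ["page", "10", ".pdf"]
    parts = re.split(r'(\d+)', name)
    return [int(part) if part.isdigit() else part.lower() for part in parts]


names = ['page10.pdf', 'page2.pdf', 'page1.pdf']
print(sorted(names))
print(sorted(names, key=natural_key))
```

Passing key=natural_key to sorted when listing the PDF folder should keep the assignment pages in their numbered order.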

Putting It All Together

We have successfully created both functions to convert images to PDFs and then merge them. All that remains to be done is to call these functions and your script should work like a charm. 😉

Finally, when you put everything together, this is what the complete script looks like –

import os
from PIL import Image
from PyPDF2 import PdfMerger

img_dir = './image_files'
pdf_dir = './pdf_files'

def img_to_pdf_converter():
    for file in os.listdir(img_dir):
        # Split e.g. "photo.v2.JPG" into ("photo.v2", ".JPG")
        name, ext = os.path.splitext(file)
        if ext.lower() in ('.png', '.jpg', '.jpeg'):
            image = Image.open(os.path.join(img_dir, file))
            # Drop any alpha channel, which cannot be saved as PDF
            converted_image = image.convert('RGB')
            converted_image.save(os.path.join(pdf_dir, '{0}.pdf'.format(name)))

    print("PDF Created!")

def merger():
    pdfs = []
    merge = PdfMerger()
    # Sort the names so the merged pages come out in a predictable order
    for file in sorted(os.listdir(pdf_dir)):
        # Skip anything in the folder that is not a PDF
        if file.lower().endswith('.pdf'):
            pdfs.append(os.path.join(pdf_dir, file))

    for pdf in pdfs:
        merge.append(pdf)

    merge.write('merged_pdf.pdf')
    merge.close()
    print("PDFs Merged!")

img_to_pdf_converter()
merger()

Conclusion

Woohoo!!! We have successfully completed our fun project, and now we do not need to upload our files to any third-party online tool to convert our images to PDFs or merge our PDFs. I hope this project added some value and helped you in your coding quest. Stay tuned and subscribe for more interesting projects and tutorials.