5 Best Ways to Save HTML String to PDF in Python

💡 Problem Formulation: You’re a Python developer seeking ways to convert an HTML string into a PDF file. For instance, you may want to generate PDF reports from HTML templates on the fly. Imagine an input HTML string containing markup for an invoice and the desired output as a formatted PDF document. This article explores solutions to this common coding task.
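
As a concrete starting point, the HTML string might be produced from a template. The sketch below is one way to do that with the standard library only. The `render_invoice` helper, the template text, and the sample data are hypothetical and are not part of any library discussed in this article.

```python
from html import escape
from string import Template

# Hypothetical invoice template; $customer and $rows are filled in at runtime.
INVOICE_TEMPLATE = Template(
    "<h1>Invoice for $customer</h1>"
    "<table>$rows</table>"
)


def render_invoice(customer, items):
    """Return an HTML string for a list of (description, amount) pairs."""
    rows = "".join(
        f"<tr><td>{escape(desc)}</td><td>{amount:.2f}</td></tr>"
        for desc, amount in items
    )
    # escape() keeps user-supplied text from being interpreted as markup.
    return INVOICE_TEMPLATE.substitute(customer=escape(customer), rows=rows)


html_string = render_invoice("Acme & Co", [("Widget", 9.5)])
```

The resulting `html_string` can then be passed to any of the methods below.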

Method 1: Using WeasyPrint

WeasyPrint is a visual rendering engine for HTML and CSS that can export to PDF. It is designed for web developers who need to generate attractive PDFs from Python. It can handle complex document layouts. WeasyPrint itself is written in Python, but it relies on native libraries such as Pango for text layout, which may need to be installed separately on some platforms.

Here’s an example:

from weasyprint import HTML

html_string = "<p>Hello, WeasyPrint!</p>"
HTML(string=html_string).write_pdf("output.pdf")

The output of this code is a PDF file named “output.pdf” containing “Hello, WeasyPrint!”.

WeasyPrint takes the raw HTML string and converts it into a PDF document. This method is direct and functional for users who need reliable rendering, including CSS and web fonts. It’s suitable for complex layouts and high-quality printouts.
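
When the PDF should stay in memory (for example, to return it from a web view) or needs page-level CSS, something like the following sketch may help. The `build_document` and `render_pdf_bytes` helpers are hypothetical names introduced here. The sketch assumes that calling `write_pdf()` without a target returns the PDF as bytes, which is how WeasyPrint documents it.

```python
def build_document(body_html, css):
    """Wrap an HTML fragment and a stylesheet in a complete document."""
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<style>{css}</style></head><body>{body_html}</body></html>"
    )


def render_pdf_bytes(body_html, css="@page { size: A4; margin: 2cm }"):
    """Return the rendered PDF as bytes instead of writing a file."""
    # Imported lazily so build_document() works without WeasyPrint installed.
    from weasyprint import HTML

    # write_pdf() with no target returns the PDF content as bytes.
    return HTML(string=build_document(body_html, css)).write_pdf()


document = build_document("<p>Hello, WeasyPrint!</p>", "p { color: navy }")
```

Calling `render_pdf_bytes("<p>Hello, WeasyPrint!</p>")` should then yield bytes that can be written to a file or sent over the network.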

Method 2: Using pdfkit and wkhtmltopdf

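pdfkit is a Python wrapper for the wkhtmltopdf command-line tool, which converts HTML to PDF using the Qt WebKit rendering engine. This tool is effective when you need output that closely matches what a WebKit-based browser would display. Note that wkhtmltopdf is a separate binary that must be installed and reachable on your PATH.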

Here’s an example:

import pdfkit

html_string = "<p>Welcome to pdfkit!</p>"
pdfkit.from_string(html_string, 'output.pdf')

The output is a PDF file “output.pdf” featuring “Welcome to pdfkit!”.

This code snippet takes the HTML string and, through pdfkit, interfaces with wkhtmltopdf to create a PDF. This method is noteworthy for its accurate rendering that mirrors how content would appear in a web browser.
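
Because pdfkit only shells out to wkhtmltopdf, it can be useful to locate the binary explicitly and pass rendering options. The following is a minimal sketch under that assumption. The `wkhtmltopdf_options` and `html_to_pdf` helpers are hypothetical, and the option keys mirror wkhtmltopdf command-line flags.

```python
import shutil


def wkhtmltopdf_options(page_size="A4", margin="15mm"):
    """Build an options dict; keys mirror wkhtmltopdf command-line flags."""
    return {
        "page-size": page_size,
        "encoding": "UTF-8",
        "margin-top": margin,
        "margin-bottom": margin,
    }


def html_to_pdf(html_string, output_path):
    """Convert an HTML string to a PDF file, failing early if the binary is missing."""
    import pdfkit  # imported lazily; requires the pdfkit package

    binary = shutil.which("wkhtmltopdf")
    if binary is None:
        raise RuntimeError("wkhtmltopdf was not found on the PATH")
    config = pdfkit.configuration(wkhtmltopdf=binary)
    pdfkit.from_string(
        html_string,
        output_path,
        options=wkhtmltopdf_options(),
        configuration=config,
    )


options = wkhtmltopdf_options(margin="10mm")
```

Checking for the binary up front tends to give a clearer error message than letting the conversion call fail.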

Method 3: Using ReportLab

ReportLab is a robust library for generating PDFs in Python. It’s known for its speed and ability to handle complex PDFs. However, it exposes a lower-level drawing API rather than an HTML renderer, so it has a steeper learning curve and does not interpret HTML markup on its own.

Here’s an example:

from reportlab.pdfgen import canvas

def save_html_to_pdf(html_string, filename):
    c = canvas.Canvas(filename)
    c.drawString(72, 72, html_string)
    c.save()

html_string = "Sample HTML content"
save_html_to_pdf(html_string, "output.pdf")

An “output.pdf” file containing the text “Sample HTML content” will be created.

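In this snippet, ReportLab creates a PDF canvas, places the string at a specified location, and saves it to a file. Note that drawString writes its argument literally: HTML tags are not interpreted and would appear as plain text in the PDF. ReportLab offers extensive control for custom layouts, but rendering real HTML requires parsing the markup yourself or using a library built on top of ReportLab.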
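
One simple workaround, if only the text matters, is to strip the tags before drawing. The sketch below uses the standard library's `html.parser` for that. The `html_to_text` and `save_text_pdf` helpers are hypothetical, and all styling in the markup is discarded.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML string."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html_string):
    """Return the text content of html_string with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html_string)
    parser.close()  # flush any buffered text
    return " ".join("".join(parser.parts).split())


def save_text_pdf(html_string, filename):
    """Draw the tag-free text on a single line of a ReportLab canvas."""
    from reportlab.pdfgen import canvas  # imported lazily; requires reportlab

    c = canvas.Canvas(filename)
    # The origin is the bottom-left corner, so y=720 is near the top of the page.
    c.drawString(72, 720, html_to_text(html_string))
    c.save()


text = html_to_text("<p>Hello <b>world</b></p>")
```

This keeps the example short, but long text will not wrap, since drawString places a single line.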

Method 4: Using xhtml2pdf

xhtml2pdf is another Python library for converting HTML and CSS to PDF. It is built on top of ReportLab and supports HTML with a subset of CSS, which makes it a good fit for documents such as invoices and reports rather than arbitrary web pages.

Here’s an example:

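from xhtml2pdf import pisa

html_string = "<p>Hello, xhtml2pdf!</p>"
# pisa.CreatePDF writes the rendered PDF into a binary file object
with open("output.pdf", "wb") as pdf_file:
    status = pisa.CreatePDF(html_string, dest=pdf_file)
print("errors:", status.err)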

The output is a PDF file “output.pdf” that contains “Hello, xhtml2pdf!” text.

This example demonstrates xhtml2pdf’s ability to parse an HTML string and save it as a PDF. This method is user-friendly for those who are familiar with HTML/CSS.
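
xhtml2pdf is driven through its `pisa` module. A minimal sketch of an in-memory conversion with an error check might look like this. The `html_to_pdf_bytes` and `looks_like_pdf` helpers are hypothetical, and the sketch assumes the object returned by `pisa.CreatePDF` exposes an `err` attribute that is non-zero on failure.

```python
import io


def looks_like_pdf(data):
    """Cheap sanity check: PDF files start with the bytes %PDF-."""
    return data.startswith(b"%PDF-")


def html_to_pdf_bytes(html_string):
    """Render html_string with xhtml2pdf and return the PDF as bytes."""
    from xhtml2pdf import pisa  # imported lazily; requires xhtml2pdf

    buffer = io.BytesIO()
    status = pisa.CreatePDF(html_string, dest=buffer)
    if status.err:
        raise ValueError("xhtml2pdf reported errors during conversion")
    return buffer.getvalue()
```

A caller could then run `looks_like_pdf(html_to_pdf_bytes("<p>Hello, xhtml2pdf!</p>"))` as a quick smoke test before saving or sending the result.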

Bonus Method 5: Using PyFPDF

PyFPDF is a simple PDF generation library for Python that allows adding text, images, and more to a document. This method is best for quick and simple PDF generation without needing heavy HTML rendering support.

Here’s an example:

from fpdf import FPDF

pdf = FPDF()
pdf.add_page()
pdf.set_font("Arial", size=12)
pdf.cell(200, 10, txt="Hello, PyFPDF!", ln=True)
pdf.output("output.pdf")

A PDF file “output.pdf” with the text “Hello, PyFPDF!” will be created.

The FPDF class is instantiated, a page is added, and the text is written into a cell at the current position before the document is saved as a PDF. Note that the text is passed as a plain string rather than as HTML. PyFPDF excels in simplicity and is suitable for quick PDF generation when dealing with plain text or images.
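
For multi-line plain text, a sketch along the following lines may be useful. It assumes the built-in core fonts, which only cover Latin-1 characters, so anything else is replaced before writing. The `to_latin1` and `save_text_pdf` helpers are hypothetical names.

```python
def to_latin1(text):
    """Replace characters the built-in core fonts cannot encode with '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


def save_text_pdf(lines, filename):
    """Write each line of text into the PDF, wrapping long lines."""
    from fpdf import FPDF  # imported lazily; requires the fpdf package

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Arial", size=12)
    for line in lines:
        # A width of 0 lets the cell extend to the right margin and wrap.
        pdf.multi_cell(0, 10, to_latin1(line))
    pdf.output(filename)


safe = to_latin1("Total: 5€")
```

Replacing unsupported characters loses information, so embedding a Unicode font would be the better choice when the text is not limited to Latin-1.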

Summary/Discussion

  • Method 1: WeasyPrint. Handles complex layouts with CSS. Written in Python, but depends on native libraries such as Pango.
  • Method 2: pdfkit/wkhtmltopdf. Accurate web rendering. Needs separate installation of wkhtmltopdf. Good for web-focused content.
  • Method 3: ReportLab. Highly customizable. Fast and robust. Complex usage for HTML.
  • Method 4: xhtml2pdf. Easy for HTML/CSS users. Driven through its pisa module and supports only a subset of CSS.
  • Bonus Method 5: PyFPDF. Quick and simple, but no HTML rendering. Limited to plain text and images.