Generating HTML Documents in Python

Problem Statement: How to generate HTML documents in Python?

One advantage of choosing Python is its versatility: the language emphasizes code readability through significant whitespace, and it offers a large collection of libraries that serve many purposes, including generating HTML documents. Before we dive into the libraries, let us learn how to write to an HTML file in Python.

How to write to an HTML file in Python?

You can create and save an HTML file in a few simple steps, as shown below.

  1. Use the built-in open() function to create the HTML file.
  2. Write the HTML markup into the file with the write() method.
  3. Finally, close the file, which saves the data.

Example:

# Creating the HTML file
file_html = open("demo.html", "w")

# Adding the input data to the HTML file
file_html.write('''<html>
<head>
<title>HTML File</title>
</head> 
<body>
<h1>Welcome Finxters</h1>       	
<p>Example demonstrating How to generate HTML Files in Python</p> 
</body>
</html>''')

# Saving the data into the HTML file
file_html.close()

Output: Here’s what the demo.html file looks like.

<html>
<head>
<title>HTML File</title>
</head> 
<body>
<h1>Welcome Finxters</h1>       	
<p>Example demonstrating How to generate HTML Files in Python</p> 
</body>
</html>

When you open it in a browser, the page shows the heading "Welcome Finxters" followed by the paragraph text.
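
The example above writes fixed markup. When a page includes values that come from variables, it is usually safer to escape them before writing. The following is a minimal sketch using only the standard library; the helper name build_page and the file name report.html are illustrative assumptions, not part of the original example.

```python
import html


def build_page(title, paragraphs):
    """Return a small HTML document as a string (hypothetical helper)."""
    # html.escape converts &, < and > (and quotes) so data cannot break the markup
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{html.escape(title)}</title></head>\n"
        "<body>\n"
        f"<h1>{html.escape(title)}</h1>\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )


page = build_page("Welcome Finxters", ["Tom & Jerry", "1 < 2"])

# The with block closes the file automatically, even if write() raises
with open("report.html", "w", encoding="utf-8") as file_html:
    file_html.write(page)
```

Using a with block replaces the explicit close() call from step 3, so the file is closed even when an error occurs halfway through.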

Method 1- Using The Airium Library

Airium is a bidirectional HTML-Python translator in which the DOM structure is expressed through Python indentation with context managers. Install the airium module with the Python package installer by running the following command in the terminal: pip install airium==0.2.3

A notable advantage of the Airium library is that it also has a reverse translator, which builds Python code from an HTML string.

Example: The following example demonstrates how we can generate HTML docs using Airium.

# Importing the airium library
from airium import Airium

a = Airium()
# Generating HTML file
a('<!DOCTYPE html>')
with a.html(lang="pl"):
    with a.head():
        a.meta(charset="utf-8")
        a.title(_t="Example: How to use Airium library")
    with a.body():
        with a.h1(id="id23345225", klass='main_header'):
            a("Hello Finxters")
# Converting the document to a string
html = str(a)
# Converting the document to UTF-8 encoded bytes
html_bytes = bytes(a)
print(html)

Output:

<!DOCTYPE html>
<html lang="pl">
  <head>
    <meta charset="utf-8" />
    <title>Example: How to use Airium library</title>
  </head>
  <body>
    <h1 id="id23345225" class="main_header">
      Hello Finxters
    </h1>
  </body>
</html>

You can also store this document as a file using the following code:

with open('file.html', 'wb') as f:
    f.write(bytes(html, encoding='utf8'))

Method 2- Using Yattag Library

Yattag is a Python library for generating HTML or XML documents in a Pythonic way. With Yattag, we don’t have to write closing tags by hand, and templates are ordinary Python code. It can also render HTML forms with default values and error messages. Before we dive into the solution, let us have a quick look at a few basics.

How does yattag.Doc class work?

The yattag.Doc class works similarly to the join method of strings: a Doc instance accumulates content as you call its methods. The text method appends text, whereas the tag method appends an HTML tag. Lastly, the getvalue method returns the whole HTML content as one large string.

What is the tag method?

The tag method returns a context manager, which is an object designed to be used in a with statement. Context managers have __enter__ and __exit__ methods: __enter__ is called at the start of the with block, and __exit__ is called when leaving it. For example, the line with tag('h1'): creates an <h1> element, and everything added inside the block becomes its content.
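
To make the mechanism concrete, here is a toy sketch of how a context manager can emit an opening tag on entry and a closing tag on exit. The TinyDoc and _Tag classes are hypothetical and written only for illustration; they are not Yattag's actual implementation.

```python
class TinyDoc:
    """Toy document builder (hypothetical; NOT Yattag's implementation)."""

    def __init__(self):
        self.parts = []  # fragments that are joined at the end

    def tag(self, name):
        # Return a context manager responsible for one element
        return _Tag(self, name)

    def text(self, content):
        self.parts.append(content)

    def getvalue(self):
        # Similar in spirit to "".join(...) over everything appended so far
        return "".join(self.parts)


class _Tag:
    def __init__(self, doc, name):
        self.doc = doc
        self.name = name

    def __enter__(self):
        # Called when the with block starts: emit the opening tag
        self.doc.parts.append(f"<{self.name}>")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with block ends: emit the closing tag
        self.doc.parts.append(f"</{self.name}>")
        return False  # do not suppress exceptions


doc = TinyDoc()
with doc.tag("h1"):
    doc.text("Hello Finxters")
result = doc.getvalue()
print(result)  # <h1>Hello Finxters</h1>
```

Because the closing tag is written in __exit__, nested with blocks produce correctly nested elements without any manual bookkeeping.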

Example:

# Importing the Yattag library
from yattag import Doc
doc, tag, text = Doc().tagtext()
with tag('html'):
    with tag('body'):
        with tag('p', id = 'main'):
            text('We can write any text here')
        with tag('a', href = '/my-link'):
            text('We can insert any link here')
result = doc.getvalue()
print(result)

Output:

<html><body><p id="main">We can write any text here</p><a href="/my-link">We can insert any link here</a></body></html>

Generating dynamic HTML documents with the Yattag library is easier and more readable than assembling HTML strings by hand.

However, in most HTML documents, many of the tag nodes contain only text. For these, we can use the line method, which writes a tag and its text in a single, terser call.

Example:

from yattag import Doc

doc, tag, text, line = Doc().ttl()
with tag('ul', id = 'To-dos'):
    line('li', 'Clean up the dishes', klass = "priority")
    line('li', 'Call for appointment')
    line('li', 'Complete the paper')
print(doc.getvalue())

Output:

<ul id="To-dos"><li class="priority">Clean up the dishes</li><li>Call for appointment</li><li>Complete the paper</li></ul>

Method 3- Using xml.etree

We can use the xml.etree package to generate low-level HTML documents in Python. The xml.etree package is part of the Python standard library, so there is nothing to install; we only need to import it into the program before using it.

XML is a hierarchical data format and is usually represented as an element tree. The package provides two classes for this purpose. The first is ElementTree, which represents the whole XML document as a tree and handles document-level operations such as reading from and writing to files. The second is Element, which represents a single node in this tree and handles a single XML element and its sub-elements.

Example:

# Importing the XML package and the sys module
import sys
from xml.etree import ElementTree as ET

html = ET.Element('html')
body = ET.Element('body')
html.append(body)
div = ET.Element('div', attrib={'class': 'foo'})
body.append(div)
span = ET.Element('span', attrib={'class': 'bar'})
div.append(span)
span.text = "Hello Finxters. This article explains how to generate HTML documents in Python."
# Here, the code checks the Python version.
if sys.version_info < (3, 0, 0):
    # Python 2: write UTF-8 encoded bytes
    ET.ElementTree(html).write(sys.stdout, encoding='utf-8', method='html')
else:
    # Python 3 and above: write text (str)
    ET.ElementTree(html).write(sys.stdout, encoding='unicode', method='html')

Output:

<html><body><div class="foo"><span class="bar">Hello Finxters. This article explains how to generate HTML documents in Python.</span></div></body></html>
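
The same tree can be built a little more compactly. The sketch below assumes Python 3 and uses ET.SubElement, which creates a child and attaches it to its parent in one call, and ET.tostring, which returns the markup as a string so it can be saved to a file. The file name etree_demo.html is an arbitrary choice for illustration.

```python
from xml.etree import ElementTree as ET

# SubElement creates the child element and appends it to the parent
html = ET.Element("html")
body = ET.SubElement(html, "body")
div = ET.SubElement(body, "div", attrib={"class": "foo"})
span = ET.SubElement(div, "span", attrib={"class": "bar"})
span.text = "Hello Finxters"

# encoding="unicode" makes tostring return str instead of bytes
markup = ET.tostring(html, encoding="unicode", method="html")
print(markup)

# ElementTree does not emit a doctype, so it is added by hand here
with open("etree_demo.html", "w", encoding="utf-8") as f:
    f.write("<!DOCTYPE html>\n" + markup)
```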

Conclusion

That’s all about generating HTML documents in Python. I hope you found this article helpful. Please stay tuned and subscribe for more such interesting articles. Happy learning!

Authors: Rashi Agarwal and Shubham Sayon

Recommended Read: How to Get an HTML Page from a URL in Python?

