Installing Beautiful Soup

Summary: To install BeautifulSoup on Windows, use the command: pip install beautifulsoup4. To install it on Debian or Ubuntu Linux, use the command: sudo apt-get install python3-bs4.

Aim: In this tutorial, we will discuss how to install BeautifulSoup.

Since BeautifulSoup is not part of the Python standard library, we need to install it before we can use it to scrape websites. Hence, we will look at the steps to install the BeautifulSoup 4 package (also called bs4) and discuss some of the problems that can come up after installation.

Note: At the time of writing, the current release is Beautiful Soup 4.9.3 (October 3, 2020).

❂ Installing BeautifulSoup On A Linux Machine

❖ Platform: Debian or Ubuntu

If you are using Python on a Debian or Ubuntu-based OS, you can install Beautiful Soup with the system package manager using one of the following commands:

✻ For Python 2.x

$sudo apt-get install python-bs4

✻ For Python 3.x

$sudo apt-get install python3-bs4

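If you do not want to use the system package manager, you can use easy_install or pip to install bs4. Note that easy_install is deprecated in recent versions of setuptools, so pip is usually the safer choice.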

✻ Command to install bs4 using easy_install:

$easy_install beautifulsoup4

✻ Command to install bs4 using pip:

$pip install beautifulsoup4

# Note: If you are using Python 3, the commands may be named easy_install3 and pip3, and you might need to install them before you can use them.
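
When several Python versions are installed side by side, it can be unclear which interpreter a bare pip command installs into. One common way around this is to run pip as a module of the interpreter you intend to use. The sketch below only builds and prints such a command; the helper name pip_install_command is made up for illustration and is not part of pip or Beautiful Soup.

```python
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter running this script.

    Hypothetical helper for illustration: it returns the command as a list
    and does not execute anything.
    """
    return [sys.executable, "-m", "pip", "install", package]


# Show the command that would install bs4 for this exact interpreter.
print(" ".join(pip_install_command("beautifulsoup4")))
```

Running the printed command in a terminal installs the package for the same Python that produced it, which sidesteps the pip versus pip3 question.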

❖ Platform: Windows

Installing bs4 on Windows is a one-step process. Use the following command in your command-line interface:

pip install beautifulsoup4
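
Once the install command finishes, a quick check from Python can confirm that the package is importable on either platform. The following is a minimal sketch rather than an official verification step; the function name smoke_test and the sample markup are assumptions made for illustration.

```python
def smoke_test():
    """Return the parsed text if bs4 is importable, otherwise None."""
    try:
        from bs4 import BeautifulSoup
    except ImportError:
        # bs4 is not installed for the interpreter running this script.
        return None
    # Parse a tiny made-up document with the built-in html.parser.
    soup = BeautifulSoup("<p>Hello, <b>soup</b>!</p>", "html.parser")
    return soup.get_text()


result = smoke_test()
if result is None:
    print("bs4 is not installed for this interpreter")
else:
    print("bs4 works:", result)
```

If the script reports that bs4 is missing even though the install succeeded, the package was most likely installed for a different Python version than the one running the script.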

❂ Some Common Problems After Installation

You might encounter an error if the wrong version gets installed. Let us look at the reasons behind these errors!

Error: ImportError “No module named HTMLParser”

Reason: The error occurs because you are running the Python 2 version of the code under Python 3.

Error: ImportError “No module named html.parser”

Reason: The error occurs because you are running the Python 3 version of the code under Python 2.

Solution: In both cases, remove the existing installation and re-install BeautifulSoup.
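
Both ImportErrors come from the same rename: the standard-library parser lives in the HTMLParser module in Python 2 and in html.parser in Python 3. The sketch below illustrates that difference with a version-tolerant import. It is not Beautiful Soup's own code, and the TagCollector class is a hypothetical helper written only for this example.

```python
try:
    # Python 3 location of the standard-library HTML parser.
    from html.parser import HTMLParser
except ImportError:
    # Python 2 location of the same class.
    from HTMLParser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper that records every opening tag it sees."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed("<p>Hello, <b>soup</b>!</p>")
print(collector.tags)  # ['p', 'b']
```

A copy of bs4 built for one Python version contains only one of the two imports, which is why running it under the other version fails at import time.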

Error: SyntaxError “Invalid syntax” on the line ROOT_TAG_NAME = u'[document]'

Reason: The error occurs because you are running the Python 2 version of the code under Python 3.

Solution: Convert the Python 2 version of the code to Python 3 by either:

  • Installing the package:
    • python3 setup.py install
  • Manually running the Python 2 to Python 3 conversion script (in the bs4 directory):
    • 2to3-3.2 -w bs4

❂ Installing The Parser

By default, Beautiful Soup supports the HTML parser included in Python’s standard library. However, it also supports several third-party Python parsers, as shown in the table below:

| Parser | Typical usage | Advantages | Disadvantages |
|---|---|---|---|
| Python’s html.parser | BeautifulSoup(markup, "html.parser") | Batteries included; decent speed; lenient (as of Python 2.7.3 and 3.2) | Not as fast as lxml; less lenient than html5lib |
| lxml’s HTML parser | BeautifulSoup(markup, "lxml") | Very fast; lenient | External C dependency |
| lxml’s XML parser | BeautifulSoup(markup, "lxml-xml") or BeautifulSoup(markup, "xml") | Very fast; the only currently supported XML parser | External C dependency |
| html5lib | BeautifulSoup(markup, "html5lib") | Extremely lenient; parses pages the same way a web browser does; creates valid HTML5 | Very slow; external Python dependency |

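Use the following commands to install the lxml or the html5lib parser:

Linux (Debian or Ubuntu; for Python 3, use the python3-lxml and python3-html5lib packages instead):

$sudo apt-get install python-lxml
$sudo apt-get install python-html5lib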

Windows:

$pip install lxml
$pip install html5lib
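
After installing an optional parser, it can help to check which ones the current interpreter can actually see before passing a parser name to BeautifulSoup. The sketch below does that with the standard library only. The function names and the preference order (lxml first, then html5lib, then html.parser) are assumptions made for this example, not a rule set by Beautiful Soup.

```python
import importlib.util

# (module to look for, parser name to pass to BeautifulSoup)
# The ordering is an assumption for this sketch.
OPTIONAL_PARSERS = [("lxml", "lxml"), ("html5lib", "html5lib")]


def module_exists(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_parser(exists=module_exists):
    """Return the first available optional parser, else the built-in one."""
    for module_name, parser_name in OPTIONAL_PARSERS:
        if exists(module_name):
            return parser_name
    # html.parser ships with Python, so it is always a valid fallback.
    return "html.parser"


print("Parser to use:", pick_parser())
```

The returned string can then be passed as the second argument, as in BeautifulSoup(markup, pick_parser()).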

Conclusion

With that, we come to the end of this short tutorial on how to install the BeautifulSoup library. Please feel free to follow the steps and install it on your system. If you want to learn how to use the BeautifulSoup library to scrape a webpage, please follow this TUTORIAL for a step-by-step guide.

