Story: This series of articles assumes you work in the IT Department of Mason Books. The owner asks you to scrape the website of a competitor. He would like this information to gain insight into the competitor's pricing structure.
💡 Note: Before continuing, we recommend you possess, at minimum, a basic knowledge of HTML and CSS and have reviewed our articles on How to Scrape HTML tables.
What You’ll Build in This Project
Let’s navigate to Books to Scrape and review the format.

At first glance, you will notice:
- Book categories display on the left-hand side.
- There are, in total, 1,000 books listed on the website.
- Each web page shows 20 Books.
- Each price is in £ (in this instance, the UK pound).
- Each Book displays minimal details.
- To view complete details for a book, click on the image or the Book Title hyperlink. This hyperlink forwards to a page containing additional book details for the selected item.
- The total number of website pages displays in the footer (Page 1 of 50).
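The figures above agree with each other: 1,000 books at 20 books per page gives the 50 pages shown in the footer. As a minimal sketch of that arithmetic (the helper name pages_needed is hypothetical and not part of the original series):

```python
import math


def pages_needed(total_books, books_per_page):
    """Return how many listing pages are needed to show every book."""
    # Round up so a partially filled last page still counts as a page.
    return math.ceil(total_books / books_per_page)


print(pages_needed(1000, 20))  # prints 50
```

A check like this can be compared against the footer value once scraping begins, to catch a page count that was parsed incorrectly.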
Step 1: Install and Import Libraries for Project
- The Pandas library enables access to/from a DataFrame.
- The Requests library provides a way to send HTTP requests in Python.
- The Beautiful Soup library enables data extraction from HTML and XML files.
To install these libraries, navigate to an IDE terminal and execute the commands below at the command prompt. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.
$ pip install pandas
Hit the <Enter> key on the keyboard to start the installation process.
$ pip install requests
Hit the <Enter> key on the keyboard to start the installation process.
$ pip install beautifulsoup4
Hit the <Enter> key on the keyboard to start the installation process.
If the installations were successful, a message displays in the terminal indicating the same.
Feel free to view the PyCharm installation guides for the required libraries.
- How to install Pandas on PyCharm
- How to install Requests on PyCharm
- How to install BeautifulSoup4 on PyCharm
Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.
import pandas as pd
import requests
from bs4 import BeautifulSoup
import time
import urllib.request
from csv import reader, writer
- The time library is built-in with Python and does not require installation. This library contains time.sleep() and is used to set a delay between page scrapes.
- The urllib library is built-in with Python and does not require installation. This library contains urllib.request and is used to save images.
- The csv library is built-in with Python and does not require installation. This library contains the reader and writer functions used to read and save data in CSV files.
Step 2: Understand Basics and Scrape Your First Results

In this step, you’ll perform the following tasks:
- Reviewing the website to scrape.
- Understanding HTTP Status Codes.
- Connecting to the Books to Scrape website using the requests library.
- Retrieving Total Pages to Scrape.
- Closing the Open Connection.
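The tasks listed for this step can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the linked tutorial: the function names are hypothetical, the footer is assumed to contain text of the form "Page 1 of 50", and requests is imported inside the fetch function only so that the parsing helper can be tried without a network connection.

```python
import re


def total_pages(page_text):
    """Return N from footer text such as 'Page 1 of 50', or None if absent."""
    match = re.search(r"Page\s+\d+\s+of\s+(\d+)", page_text)
    return int(match.group(1)) if match else None


def fetch_total_pages(url):
    """Request the page, check the HTTP status code, and read the page count."""
    import requests  # imported here so total_pages() works offline

    response = requests.get(url, timeout=10)
    try:
        # 200 means the request succeeded; anything else is treated as failure.
        if response.status_code != 200:
            return None
        return total_pages(response.text)
    finally:
        # Release the connection once the page has been read.
        response.close()
```

Assuming the site is reachable at https://books.toscrape.com, a call such as fetch_total_pages("https://books.toscrape.com") would be expected to return the number shown in the footer.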
Learn More: Learn everything you need to know to reproduce this step in the in-depth Finxter blog tutorial.
Step 3: Configure URL to Scrape and Avoid Spamming the Server

Rule: Don’t Spam the Server!
In this step, you’ll perform the following tasks:
- Configuring a page URL for scraping
- Setting a delay: time.sleep() to pause between page scrapes.
- Looping through two (2) pages for testing purposes.
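One way these tasks might fit together is sketched below. The URL pattern, the two-second default delay, and the names page_urls and scrape_pages are assumptions made for illustration; the fetch argument stands in for whatever function actually downloads a page.

```python
import time

# Assumed URL pattern for the paginated catalogue; adjust if the site differs.
BASE_URL = "https://books.toscrape.com/catalogue/page-{}.html"


def page_urls(first_page, last_page):
    """Build the list of page URLs from first_page to last_page inclusive."""
    return [BASE_URL.format(number) for number in range(first_page, last_page + 1)]


def scrape_pages(urls, fetch, delay_seconds=2.0):
    """Call fetch(url) for each URL, pausing between requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request after the first to avoid spamming the server.
            time.sleep(delay_seconds)
        results.append(fetch(url))
    return results


# Two pages only, for testing purposes.
test_urls = page_urls(1, 2)
```

Passing the downloader in as fetch keeps the loop easy to test with a stand-in function before pointing it at the live site.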
Learn More: Learn everything you need to know to reproduce this step in the in-depth Finxter blog tutorial.
Step 4: Save Book Details in a Python List

In this step, you’ll perform the following tasks:
- Locating Book details.
- Writing code to retrieve this information for all Books.
- Saving Book details to a List.
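A minimal sketch of this step, assuming each book sits in an article element with the class product_pod and its price in a p element with the class price_color. The sample HTML, the book titles, and the function name extract_books are made up for illustration, so inspect the live page before relying on these selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup shaped like one listing page (not copied from the site).
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="book-one_1/index.html" title="Book One">Book One</a></h3>
  <p class="price_color">£51.77</p>
</article>
<article class="product_pod">
  <h3><a href="book-two_2/index.html" title="Book Two">Book Two</a></h3>
  <p class="price_color">£53.74</p>
</article>
"""


def extract_books(html):
    """Return a list of [title, price, link] lists, one per book found."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for article in soup.find_all("article", class_="product_pod"):
        link = article.h3.a  # the title hyperlink inside the heading
        price = article.find("p", class_="price_color")
        books.append([link["title"], price.get_text(strip=True), link["href"]])
    return books


all_books = extract_books(SAMPLE_HTML)
```

Each inner list holds one book's details, which keeps the result easy to pass to a CSV writer or a DataFrame later.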
Learn More: Learn everything you need to know to reproduce this step in the in-depth Finxter blog tutorial.
Step 5: Clean and Save the Scraped Output

In this step, you’ll perform the following tasks:
- Cleaning up the scraped output.
- Saving the output to a CSV file.
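As a rough sketch of these two tasks, the example below strips currency symbols from a price and writes the rows with the built-in csv module. The helper names, the column headers, and the sample row are assumptions for illustration; the linked tutorial may clean different fields or use Pandas instead.

```python
import csv
import io


def clean_price(raw_price):
    """Keep only digits and the decimal point, then convert to a float."""
    kept = "".join(ch for ch in raw_price if ch in "0123456789.")
    return float(kept)


def write_books_csv(rows, out_file):
    """Write (title, raw_price) pairs to out_file as CSV with a header row."""
    csv_writer = csv.writer(out_file)
    csv_writer.writerow(["title", "price_gbp"])  # hypothetical column names
    for title, raw_price in rows:
        csv_writer.writerow([title.strip(), clean_price(raw_price)])


# Demonstration using an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
write_books_csv([("  Book One ", "£51.77")], buffer)
print(buffer.getvalue())
```

To write to disk instead, open the target with open("books.csv", "w", newline="", encoding="utf-8") and pass that file object in place of the buffer; the file name here is only an example.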
Learn More: Learn everything you need to know to reproduce this step in the in-depth Finxter blog tutorial.
Conclusion
This tutorial has guided you through the steps to create your first practical web scraping project: scraping the contents of a book store!
Now, go out and use your skills wisely and to the benefit of humanity, my friend!