Disclaimer: This tutorial assumes that you have basic knowledge of web scraping. Its purpose is to teach you how to scrape content from websites that use pagination. The examples and techniques in this tutorial are provided solely for educational purposes, and it is assumed that you will not misuse them. Any misuse is solely your responsibility, and we are not responsible for it. If you would like to learn the basic concepts of web scraping before diving into this tutorial, please follow the lectures at this link.
What is Pagination in a Website?
Pagination refers to splitting a website's content across multiple web pages and displaying it page by page, both for cleaner presentation and for a better user experience. Pagination can be handled either on the client side or on the server side.
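For example, with server-side pagination the page number is often part of the URL itself, so a scraper can construct each page's address directly. Below is a minimal, illustrative sketch; the domain and the page parameter are hypothetical:

base_url = 'https://example.com/products?page={}'  # hypothetical paginated endpoint

# Build the URL of each page directly from the page number
for page_number in range(1, 4):
    print(base_url.format(page_number))
    # https://example.com/products?page=1
    # https://example.com/products?page=2
    # https://example.com/products?page=3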
While building a web scraper, it can be extremely challenging to scrape content if the website has implemented pagination. In this tutorial, we will learn about the different types of pagination in websites and how to scrape content from them.
Pagination Types
Pagination can be implemented in numerous ways, but most websites implement one of these types of pagination:
- Pagination with a Next button
- Pagination without a Next button
- Infinite scroll
- A Load More button
Pagination with Next Button
The following example demonstrates a website that uses a Next button. Clicking the Next button loads the next page, so the scraper simply follows the link behind that button until it no longer appears.

Approach: The following video demonstrates how to scrape the above website.
Code:
# 1. Import the necessary LIBRARIES
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# 2. Create a User Agent (Optional)
headers = {"User-Agent": "Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 ("
                         "KHTML, like Gecko) Version/4.0 Safari/534.30"}

# 3. Define Base URL
url = 'http://books.toscrape.com/catalogue/category/books/default_15/index.html'

# 4. Iterate as long as pages exist
while True:
    # 5. Send get() Request and fetch the webpage contents
    response = requests.get(url, headers=headers)
    # Check Status Code (Optional)
    # print(response.status_code)

    # 6. Create a Beautiful Soup Object
    soup = BeautifulSoup(response.content, "html.parser")

    # 7. Implement the Logic.
    # (extract the footer, e.g. "Page 1 of 8")
    footer = soup.select_one('li.current')
    print(footer.text.strip())

    # Find the next page element, if present.
    next_page = soup.select_one('li.next > a')
    if next_page:
        next_url = next_page.get('href')
        url = urljoin(url, next_url)
    else:
        # Break out if no next page element is present
        break
Output:
Page 1 of 8
Page 2 of 8
Page 3 of 8
Page 4 of 8
Page 5 of 8
Page 6 of 8
Page 7 of 8
Page 8 of 8
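The example above only prints the pager footer. The same loop can be extended to collect data from every page. Here is a minimal sketch that gathers the book titles while paginating; the article.product_pod and h3 > a selectors are assumptions about the usual books.toscrape.com markup:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

url = 'http://books.toscrape.com/catalogue/category/books/default_15/index.html'
titles = []

while True:
    soup = BeautifulSoup(requests.get(url).content, "html.parser")
    # Each book card is assumed to sit in an <article class="product_pod">
    for link in soup.select('article.product_pod h3 > a'):
        titles.append(link.get('title'))
    # Follow the Next button until it disappears
    next_page = soup.select_one('li.next > a')
    if not next_page:
        break
    url = urljoin(url, next_page.get('href'))

print(len(titles), 'titles collected')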
Pagination Without Next Button
The following example demonstrates a website that has no next button. Instead, it uses page numbers to allow navigation. Once a particular page number is clicked, it loads the corresponding page.
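Since the page links are just bare numbers, urljoin from the standard library can turn them into absolute URLs relative to the article's address, which is what the code below relies on. A quick, self-contained illustration:

from urllib.parse import urljoin

base = 'https://www.gosc.pl/doc/791526.Zaloz-zbroje/'
print(urljoin(base, '2'))        # https://www.gosc.pl/doc/791526.Zaloz-zbroje/2
print(urljoin(base + '2', '3'))  # https://www.gosc.pl/doc/791526.Zaloz-zbroje/3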

Approach: The following video demonstrates how to scrape the above website.
Code:
# 1. Import the necessary LIBRARIES
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# 2. Create a User Agent (Optional)
headers = {"User-Agent": "Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 ("
                         "KHTML, like Gecko) Version/4.0 Safari/534.30"}

# 3. Define Base URL
url = 'https://www.gosc.pl/doc/791526.Zaloz-zbroje/'

# 4. Send get() Request and fetch the webpage contents
response = requests.get(url, headers=headers)
# Check Status Code (Optional)
# print(response.status_code)

# 5. Create a Beautiful Soup Object
soup = BeautifulSoup(response.content, 'html.parser')

# 6. Implement the Logic.
# Extract the image from the first page
img_src = [img['src'] for img in soup.select('.txt__rich-area img')]
print('https://www.gosc.pl/' + img_src[0])

# Collect the page-number links and visit each page in turn
pages = soup.select('span.pgr_nrs a')
for page_link in pages:
    next_page = page_link.text
    url = urljoin(url, next_page)  # iteration 1: https://www.gosc.pl/doc/791526.Zaloz-zbroje/2
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, "html.parser")
    img_src = [img['src'] for img in soup.select('.txt__rich-area img')]
    for src in img_src:
        if src.endswith('jpg'):
            print('https://www.gosc.pl/' + src)
Output:
https://www.gosc.pl//files/old/gosc.pl/elementy/gn23s18_kolumbA.jpg
https://www.gosc.pl//files/old/gosc.pl/elementy/gn23s18_kolumbB.jpg
https://www.gosc.pl//files/old/gosc.pl/elementy/gn23s18_kolumbC.jpg
https://www.gosc.pl//files/old/gosc.pl/elementy/gn23s18_kolumbD.jpg
https://www.gosc.pl//files/old/gosc.pl/elementy/gn23s18_kolumbE.jpg
https://www.gosc.pl//files/old/gosc.pl/elementy/gn23s18_kolumbF.jpg
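If you want to save the scraped images rather than only print their URLs, a small follow-up step can download each one. This is a minimal sketch; the images folder name is illustrative and the short URL list stands in for the links printed above:

import os
import requests

image_urls = [
    'https://www.gosc.pl//files/old/gosc.pl/elementy/gn23s18_kolumbA.jpg',
    'https://www.gosc.pl//files/old/gosc.pl/elementy/gn23s18_kolumbB.jpg',
    # ...the remaining URLs collected by the scraper above
]

os.makedirs('images', exist_ok=True)
for img_url in image_urls:
    filename = os.path.join('images', img_url.rsplit('/', 1)[-1])
    response = requests.get(img_url)  # you may want to reuse the headers defined earlier
    if response.status_code == 200:
        with open(filename, 'wb') as f:
            f.write(response.content)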
Infinite Scroll
The following example demonstrates a website that loads more content automatically as you scroll down. Behind the scenes, each scroll triggers a request to an API endpoint with an incrementing page number, so we can query that endpoint directly instead of rendering the page.

Approach: The following video demonstrates how to scrape the above website.
Code:
# 1. Import the necessary LIBRARIES
import requests
# 2. Create a User Agent (Optional)
headers = {"User-Agent": "Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 ("
"KHTML, like Gecko) Version/4.0 Safari/534.30"}
# 3. Define Base URL
url = 'https://pharmeasy.in/api/otc/getCategoryProducts?categoryId=877&page='
page_number = 1
try:
    while True:
        # 4. Send get() Request and fetch the webpage contents
        response = requests.get(url + str(page_number), headers=headers)
        # 5. Extract the JSON data from the page
        data = response.json()
        # 6. The Logic: stop once a page returns no products
        if len(data['data']['products']) == 0:
            break
        for d in data['data']['products']:
            print(d['name'])
        page_number += 1
except Exception:
    # Stop gracefully if the request fails or the response is not the expected JSON
    pass
Pagination with Load More Button
The following example demonstrates a website that reveals additional results each time a Load More button is clicked. Each click fetches the next page of results from an API endpoint, so the scraper can call that endpoint directly and keep incrementing the page number until nothing remains.

Approach: Please follow the video lecture below, which explains in full how to scrape data from websites that implement pagination with a Load More button.
Code:
# 1. Import the necessary LIBRARIES
import requests
# 2. Create a User Agent (Optional)
headers = {"User-Agent": "Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 ("
"KHTML, like Gecko) Version/4.0 Safari/534.30"}
# 3. Define Base URL
url = 'https://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page={}'
# 4. The Logic
pg_num = 1
title = []
while True:
    response = requests.get(url.format(pg_num), headers=headers)
    data = response.json()
    # Collect the title of every post on the current page
    for post in data['posts']:
        if 'title' in post:
            title.append(post['title'].strip())
    # Keep requesting pages while the API reports remaining items
    if data.get('remaining') and int(data.get('remaining')) > 0:
        pg_num += 1
    else:
        break

# print extracted data
for t in title:
    print(t)
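Instead of only printing the titles, you may want to persist them. Here is a minimal sketch that writes the collected title list to a CSV file; the filename titles.csv is illustrative:

import csv

# `title` is the list built by the loop above
with open('titles.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['title'])
    for t in title:
        writer.writerow([t])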
One of the most sought-after skills on Fiverr and Upwork is web scraping.
Make no mistake: extracting data programmatically from websites is a critical life skill in today's world that's shaped by the web and remote work.
This course teaches you the ins and outs of Python's BeautifulSoup library for web scraping.