How to Scrape Google Search Results?


Problem Formulation

💬 Given a text query/keyword such as "History of Chess", how can you scrape the top Google results for that search query (=keyword) in Python?

Disclaimer: Have a look at the important question: Is Web Scraping Legal?

Method Summary

You can get the top Google search results for a given keyword string by installing the ecommercetools library, importing its seo module, and calling seo.get_serps(keyword). The function returns a Pandas DataFrame with the first couple of search results.

Install ecommercetools

First, install the library using the following command in your programming environment:

pip3 install --upgrade ecommercetools

Alternatively, run the following cell in your Jupyter notebook (e.g., using Google Colab):

!pip3 install --upgrade ecommercetools

Get the SERP Data into a Pandas DataFrame

Second, these four lines of code will scrape the first few Google results from the search engine result page (SERP):

  1. from ecommercetools import seo
  2. keyword = "history of chess"
  3. results = seo.get_serps(keyword)
  4. print(results)

Here’s what that looks like in Python code:

from ecommercetools import seo
keyword = "history of chess"
results = seo.get_serps(keyword)
print(results)

If you print the DataFrame, it looks a bit confusing. But don’t worry, I’ll show you how to access its content easily in the rest of this article!

   position                                              title  \
0         1  History of chess - Simple English Wikipedia, t...   
1         2                       History of chess - Wikipedia   
2         3     History of Chess | From Early Stages to Magnus   
3         4          chess - History - Encyclopedia Britannica   
4         5  The History of Chess - Who Invented the Game o...   
5         6  Chess 101: Who Invented Chess? Learn About the...   
6         7                               The History of Chess   
7         8                           A Brief History of Chess   
8         9  Explore the History of Chess From Ancient Indi...   

                                                link  \
0  https://simple.wikipedia.org/wiki/History_of_c...   
1     https://en.wikipedia.org/wiki/History_of_chess   
2  https://www.chess.com/article/view/history-of-...   
3     https://www.britannica.com/topic/chess/History   
4      https://www.ichess.net/blog/history-of-chess/   
5  https://www.masterclass.com/articles/chess-101...   
6  https://saintlouischessclub.org/Media/08-The-H...   
7  https://premierchess.com/chess-culture/a-brief...   
8          https://mymodernmet.com/history-of-chess/   

                                                text              bold  
0                                                                       
1  The history of chess can be traced back nearly...  history of chess  
2  2019年9月30日 — Chess, as we know it today, was b...             chess  
3  The origin of chess remains a matter of contro...             chess  
4  The history of chess can be traced back around...  history of chess  
5  2021年8月24日 — Where Did Chess Originate? The ea...             chess  
6  If you look at the way a chessboard is set up ...             chess  
7  Chess is one of the oldest games that is still...             chess  
8  2020年12月26日 — Chess likely arrived in early me...             chess  

Doesn’t look too pretty, does it? Let’s make some sense out of the data and explore the DataFrame a bit further! 🔎
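One reason the printout above is hard to read is that pandas wraps wide DataFrames and truncates long cells. Here is a minimal sketch of two ways to print it more readably. It assumes a small stand-in DataFrame built by hand from two rows of the output above (so it runs without a network request); with real data you would use the results variable returned by seo.get_serps(keyword) instead.

```python
import pandas as pd

# Stand-in for the DataFrame returned by seo.get_serps(keyword).
# The two rows are copied from the output shown above.
results = pd.DataFrame({
    "position": [2, 4],
    "title": [
        "History of chess - Wikipedia",
        "chess - History - Encyclopedia Britannica",
    ],
    "link": [
        "https://en.wikipedia.org/wiki/History_of_chess",
        "https://www.britannica.com/topic/chess/History",
    ],
})

# Option 1: print selected columns as one untruncated table, without the index
print(results[["position", "title", "link"]].to_string(index=False))

# Option 2: temporarily lift the cell-width limit and widen the display
with pd.option_context("display.max_colwidth", None, "display.width", 200):
    print(results)
```

Both options only change how the data is displayed. The DataFrame itself is not modified.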

Getting the Titles of the Top Search Results

You can get the titles of the first search results for the given keyword by using the DataFrame’s column access results.title or results['title']. Both return a Series of search result titles.

Here’s our example:

print(results.title) 
# or: 
print(results['title'])

Output:

0    History of chess - Simple English Wikipedia, t...
1                         History of chess - Wikipedia
2       History of Chess | From Early Stages to Magnus
3            chess - History - Encyclopedia Britannica
4    The History of Chess - Who Invented the Game o...
5    Chess 101: Who Invented Chess? Learn About the...
6                                 The History of Chess
7                             A Brief History of Chess
8    Explore the History of Chess From Ancient Indi...
Name: title, dtype: object

Looks a bit cleaner. But let’s move on to exploring the DataFrame further!
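If you would rather work with a plain Python list than a Series, the column can be converted with tolist(). The following is a small sketch that again assumes a hand-built stand-in DataFrame with two titles taken from the output above, rather than a live call to seo.get_serps().

```python
import pandas as pd

# Stand-in for the DataFrame returned by seo.get_serps(keyword)
results = pd.DataFrame({
    "title": [
        "History of chess - Wikipedia",
        "chess - History - Encyclopedia Britannica",
    ],
})

# Convert the 'title' column (a Series) into a regular Python list
titles = results["title"].tolist()

# Print the titles as a numbered list
for rank, title in enumerate(titles, start=1):
    print(f"{rank}. {title}")
```

Bracket access such as results["title"] is generally the safer choice: attribute access such as results.title only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method.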

Scraping the i-th Search Result’s Title

You can get the i-th search result’s title by using the row indexing approach. For example, the expression results['title'][0] yields the title string of the first search result, whereas the expression results['title'][3] yields the title string of the fourth result.

Here’s an example:

print(results['title'][0])
# History of chess - Simple English Wikipedia, the free ...

print(results['title'][3])
# chess - History - Encyclopedia Britannica

Of course, this works only for the first couple of search results because only the first SERP is extracted from Google.
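Because only a handful of rows are available, it can help to guard against out-of-range indices. The sketch below uses a hypothetical helper named title_at (not part of ecommercetools or pandas) and a hand-built stand-in DataFrame. It also uses .iloc, which selects by row position, whereas results['title'][i] selects by index label. The two only coincide as long as the DataFrame still has its default 0, 1, 2, ... index.

```python
import pandas as pd

# Stand-in for the DataFrame returned by seo.get_serps(keyword)
results = pd.DataFrame({
    "position": [2, 4],
    "title": [
        "History of chess - Wikipedia",
        "chess - History - Encyclopedia Britannica",
    ],
})

def title_at(df, i):
    """Hypothetical helper: title at 0-based row position i, or None if out of range."""
    if 0 <= i < len(df):
        return df["title"].iloc[i]
    return None

print(title_at(results, 0))   # first row
print(title_at(results, 50))  # no such row, so None is returned

# After filtering, the remaining row keeps its old index label (1 here),
# so filtered["title"][0] would raise a KeyError, while .iloc[0] still works.
filtered = results[results["position"] > 2]
print(filtered["title"].iloc[0])
```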

Scraping the i-th Search Result’s URL

You can get the i-th search result’s URL by using the row indexing approach. For example, the expression results['link'][0] yields the URL string of the first search result, whereas the expression results['link'][3] yields the URL string of the fourth result.

Here’s an example:

print(results['link'][0])
# https://simple.wikipedia.org/wiki/History_of_chess

print(results['link'][3])
# https://www.britannica.com/topic/chess/History

As with the titles, this works only for the first couple of search results because only the first SERP is extracted from Google.
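Once you have the URLs, a common next step is to look at which websites rank for the keyword. Here is a minimal sketch that derives the host name of each result with urlparse from Python’s standard library. The domain column name is an assumption chosen for this illustration, and the DataFrame is again a hand-built stand-in for the output of seo.get_serps().

```python
from urllib.parse import urlparse

import pandas as pd

# Stand-in for the DataFrame returned by seo.get_serps(keyword)
results = pd.DataFrame({
    "title": [
        "History of chess - Wikipedia",
        "chess - History - Encyclopedia Britannica",
    ],
    "link": [
        "https://en.wikipedia.org/wiki/History_of_chess",
        "https://www.britannica.com/topic/chess/History",
    ],
})

# Add a column holding the host name of each result URL
results["domain"] = results["link"].map(lambda url: urlparse(url).netloc)

# Walk over the rows and print each title next to its domain
for row in results.itertuples(index=False):
    print(f"{row.title} -> {row.domain}")
```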

More Advanced Ways to Scrape the Google SERP

Here’s an example excerpt from the SERP:

[Figure: example excerpt from the Google SERP]

And here’s the overall process for extracting various information from the SERP:

[Figure: Procedure to Automate Google Search]

Feel free to have a look at our in-depth resource on how to scrape various information from the Google SERP such as “People also ask”, the Google Knowledge Graph, Videos, Related Searches, and other data using BeautifulSoup, Selenium, and other Python libraries.
