Python Requests Library - Your First HTTP Request in Python

This is the first part of a 3-part series on the Python Requests library.

Syntax

requests.nameofmethod(parameters)
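
As a minimal illustration of this syntax, the sketch below lists the helper functions that Requests exposes for the common HTTP verbs. The tuple of names is chosen here for illustration and is not taken from the article. No request is sent.

```python
import requests

# Helper names chosen for illustration; each maps to one HTTP verb.
METHOD_NAMES = ('get', 'post', 'put', 'patch', 'delete', 'head', 'options')

for name in METHOD_NAMES:
    # getattr looks up requests.get, requests.post, and so on.
    helper = getattr(requests, name)
    print(f'requests.{name}(url, ...) -> callable: {callable(helper)}')
```

This article only uses requests.get; the other helpers follow the same calling pattern.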

Background

There are many Python libraries for making HTTP requests. However, the Requests library is one of the most widely used.

When the Requests library sends a request to a URL, the following occurs:

A DNS lookup converts the hostname in the URL to an IP address (example: 203.0.113.21),
The Requests library sends a request to this IP address,
The server attempts to validate this request,
The server returns a status code as shown below.
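
The first step above can be sketched with the standard library alone. Requests performs this lookup internally, so the code below is only an illustration, and the helper name resolve_host is hypothetical.

```python
import socket
from urllib.parse import urlsplit

def resolve_host(url):
    """Return the IPv4 address for the hostname part of `url`."""
    hostname = urlsplit(url).hostname      # 'https://a.b/c' -> 'a.b'
    return socket.gethostbyname(hostname)  # DNS lookup (IPv4 only)

# An IP literal needs no DNS query, so this line works offline.
print(resolve_host('http://127.0.0.1:8000/index.html'))  # 127.0.0.1

# With network access, a real hostname could be resolved the same way:
# print(resolve_host('https://books.toscrape.com'))
```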

💡Note: The URL https://books.toscrape.com used for some examples in this article welcomes coders and encourages scraping.

Preparation

Before any requests can occur, one (1) new library will require installation.

The Requests library allows access to its many methods and makes data manipulation a breeze!

To install this library, open a terminal in your IDE and execute the command below at the command prompt. In this example, the prompt is a dollar sign ($); your terminal prompt may differ.

$ pip install requests

Hit the <Enter> key on the keyboard to start the installation process.

If the installation was successful, a message displays in the terminal indicating the same.

Feel free to view the PyCharm installation guide for the required library.

How to install Requests on PyCharm

Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

import requests
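
One way to confirm that the installation worked, as a minimal sketch, is to import the library and print its version. If the import fails, the library is not installed in the active Python environment.

```python
import requests

# __version__ holds the installed release as a string.
print('Requests version:', requests.__version__)
```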

Status Codes

Direct quote from Wikipedia:

HTTP response status codes separate into five classes or categories. The first digit of the status code defines the class of response. The last two digits do not have any classifying or categorization role. These five classes are:

1XX	Informational Response	The request was received, continuing process.
2XX	Success	The request was successfully received, understood & accepted.
3XX	Redirection	Further action is needed to complete the request.
4XX	Client Error	The request contains bad syntax or cannot be fulfilled.
5XX	Server Error	The server failed to fulfill a valid request.
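
Because only the first digit determines the class, the table above can be expressed as a small lookup. This is an illustrative sketch; the names STATUS_CLASSES and status_class are hypothetical and not part of the Requests library.

```python
# Class names copied from the table above, keyed by first digit.
STATUS_CLASSES = {
    1: 'Informational Response',
    2: 'Success',
    3: 'Redirection',
    4: 'Client Error',
    5: 'Server Error',
}

def status_class(code):
    """Map a status code to its class using only the first digit."""
    return STATUS_CLASSES.get(code // 100, 'Unknown')

for code in (200, 301, 404, 503):
    print(code, status_class(code))
# 200 Success
# 301 Redirection
# 404 Client Error
# 503 Server Error
```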

The “get” Request: Making a Request

This method uses a GET request to retrieve a web page. The get() function takes a URL as an argument and returns a response object whose status_code attribute indicates whether the server fulfilled the request. If the URL is invalid or the server cannot be reached, an exception is raised and the script ends abruptly.

Run this script. If successful, a status code in the 2XX range (here, 200) outputs to the terminal.

response = requests.get('https://books.toscrape.com')
print(response.status_code)
response.close()

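Line [1] attempts to connect to the URL.
Line [2] outputs the status code of the response.
Line [3] closes the open connection.

The second snippet below prints requests.codes.ok, a named constant that always equals 200 regardless of the response. It is intended for comparisons such as response.status_code == requests.codes.ok.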

response = requests.get('https://books.toscrape.com')
print(requests.codes.ok)
response.close()

Output

200
200
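
A more typical use of requests.codes is to compare it with the status code of a response rather than print it. The sketch below assumes a hypothetical helper named is_success; it runs without sending a request.

```python
import requests

def is_success(response):
    """Return True only when the status code is exactly 200."""
    return response.status_code == requests.codes.ok

# requests.codes maps readable names to numeric status codes.
print(requests.codes.ok)         # 200
print(requests.codes.not_found)  # 404

# Usage with a live response (needs network access):
# response = requests.get('https://books.toscrape.com')
# print(is_success(response))
# response.close()
```

For a looser check, the response.ok attribute is True for any status code below 400.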

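A status code other than 200 does not raise an exception by itself. However, requests.get() does raise an exception when the request cannot be completed (for example, a malformed URL, a failed DNS lookup, or a refused connection), and an uncaught exception ends the script abruptly. To prevent this, wrap the code in a try/except statement.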

try:
    response = requests.get('https://books.toscrape.com')
    print('OK')
    response.close()
except requests.exceptions.RequestException:  # base class for all Requests errors
    print('Error')

Line [1] initializes the try statement. The code inside here will run first.
- Line [2] performs a GET request to connect to the URL.
- Line [3] if successful, OK is output to the terminal.
- Line [4] closes the open connection.
Line [5] is the except statement. If the try statement fails, the code falls to here.
- Line [6] outputs the message Error to the terminal. The script terminates.
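
The try/except pattern above reports OK for any response the server returns, including 4XX and 5XX responses. A possible extension, sketched below under the assumption that such responses should count as failures, calls raise_for_status() and catches the resulting exception. The helper name check_url and its return strings are hypothetical.

```python
import requests

def check_url(url):
    """Return 'OK' unless the request fails or returns a 4XX/5XX status."""
    try:
        # The with-statement closes the connection automatically.
        with requests.get(url, timeout=5) as response:
            response.raise_for_status()  # raises HTTPError on 4XX/5XX
    except requests.exceptions.HTTPError as err:
        return f'HTTP error: {err.response.status_code}'
    except requests.exceptions.RequestException:
        # Covers malformed URLs, DNS failures, refused connections, timeouts.
        return 'Request failed'
    return 'OK'

# A URL without a scheme is rejected before anything is sent.
print(check_url('not-a-url'))  # Request failed

# With network access:
# print(check_url('https://books.toscrape.com'))
```

HTTPError is listed first because it is a subclass of RequestException; reversing the order would hide the more specific message.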

The “get” Request: Response Content

When the code shown below runs, the HTML code on the requested web page is output to the terminal.

try:
    response = requests.get('https://books.toscrape.com')
    print(response.text)
    response.close()
except requests.exceptions.RequestException:
    print('Error')

Line [1] initializes the try statement. The code inside here will run first.
- Line [2] performs a GET request to connect to the URL.
- Line [3] if successful, the HTML code of the web page is output to the terminal.
- Line [4] closes the open connection.
Line [5] is the except statement. If the try statement fails, the code falls to here.
- Line [6] outputs Error to the terminal. The script terminates.

Output

A small portion of the HTML code displays below.

<article class="product_pod">
<div class="image_container">
<a href="catalogue/the-boys-in-the-boat-nine-americans-and-their-epic-quest-for-gold-at-the-1936-berlin-olympics_992/index.html"><img src="media/cache/66/88/66883b91f6804b2323c8369331cb7dd1.jpg" alt="The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics" class="thumbnail"></a>
</div>
...
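
Printing a whole page can flood the terminal. As a rough sketch, the hypothetical helper below returns a short summary of a response instead: the content type, the encoding, the body size in bytes, and the first few characters of the text.

```python
import requests

def summarize(response, limit=200):
    """Return a short summary of a response body instead of all of it."""
    return {
        'content_type': response.headers.get('Content-Type'),
        'encoding': response.encoding,
        'length': len(response.content),   # body as raw bytes
        'preview': response.text[:limit],  # body decoded to str
    }

# Usage with a live response (needs network access):
# response = requests.get('https://books.toscrape.com')
# print(summarize(response))
# response.close()
```

Here response.content holds the raw bytes, while response.text is the same body decoded to a string using response.encoding.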

Using “timeout”

The timeout parameter sets how long, in seconds, the code will wait before timing out for:

a connection
a response

In the example below, the timeout is passed as a (connect, read) tuple. The connect timeout equals 2 seconds. The read timeout equals 4 seconds.

The best practice is to add the timeout parameter to every request made.

💡Note: If no timeout is set, Requests does not time out on its own, and the code can hang indefinitely while waiting for the server.

try:
    response = requests.get('https://books.toscrape.com', timeout=(2, 4))
    print(response.text)
    response.close()
except requests.exceptions.RequestException:
    print('Error')

Line [1] initializes the try statement. The code inside here will run first.
- Line [2] performs a GET request to connect to the URL and sets a timeout.
- Line [3] if the response is successful, the HTML code from the URL outputs to the terminal.
- Line [4] closes the open connection.
Line [5] is the except statement. If the try statement fails, the code falls to here.
- Line [6] outputs Error to the terminal. The script automatically terminates.

Output

See above.
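
To tell the two kinds of timeout apart, the exceptions can be caught separately. The sketch below is one possible approach; the helper name fetch_with_timeout and its return strings are hypothetical, while ConnectTimeout and ReadTimeout are exception classes provided by Requests.

```python
import requests

def fetch_with_timeout(url, connect=2, read=4):
    """Send a GET request and report which timeout, if any, was hit."""
    try:
        # timeout=(connect, read): seconds allowed for each phase.
        with requests.get(url, timeout=(connect, read)) as response:
            return f'status {response.status_code}'
    except requests.exceptions.ConnectTimeout:
        return 'connect timeout'   # no connection within `connect` seconds
    except requests.exceptions.ReadTimeout:
        return 'read timeout'      # server too slow to send data
    except requests.exceptions.RequestException:
        return 'request failed'    # any other Requests error

# Usage (needs network access):
# print(fetch_with_timeout('https://books.toscrape.com'))
```

A single number, such as timeout=5, applies the same limit to both the connect and the read phase.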

Summary

In this article, we learned how to:

Connect to a URL
Retrieve and display status codes
Output the HTML code to the terminal
Use the try/except statement to catch errors
Set a timeout
Close any open connections

Next Up

Part 2 will continue to focus on GET as follows:

The “get” Request: “params”
The “get” Request: “allow_redirects”
The “get” Request: “auth”
The “get” Request: “cert” and “verify”
The “get” Request: “cookies”