5 Ways to Read a Text File from a URL

Problem Formulation and Solution Overview

In this article, you’ll learn how to read a text file from a URL in Python.

To make it more fun, we have the following running scenario:

Let’s assume you are a student and have been asked to write an essay on the Northern Lights. The data you require is saved as a text file at a specified URL. You will need to write code to access this URL and read the file contents.

💬 Question: How would we write Python code to read a text file from a URL?

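We can accomplish this task using one of the following options:

  • Method 1: Use urllib.request.urlopen()
  • Method 2: Use requests.get()
  • Method 3: Use urllib3.PoolManager()
  • Method 4: Use urllib.request.urlopen().read(n)
  • Method 5: Use urllib.request.urlopen().read()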


Preparation

Before running the code, one (1) new library requires installation.

  • The Requests library provides a simple interface for sending HTTP requests and reading the responses. Installing it also installs urllib3, which Method 3 uses.

To install this library, navigate to an IDE terminal. At the command prompt ($), execute the code below. Your terminal prompt may be different.

$ pip install requests

Hit the <Enter> key on the keyboard to start the installation process.

If the installation was successful, a message displays in the terminal indicating the same.


Feel free to view the PyCharm installation guide for the required library.


Add the following code to the top of each code snippet. These imports allow the code in this article to run error-free.

import urllib.request                 # Method 1: urllib.request.urlopen()
from urllib.request import urlopen    # Methods 4 and 5: urlopen()
import requests                       # Method 2
import urllib3                        # Method 3

Method 1: Use urllib.request.urlopen()

This method uses the urlopen() function from the urllib.request module to open a specified URL, read the contents one line at a time, and decode each line with decode('utf-8').

file_url = 'https://raw.githubusercontent.com/finxter/FinxterTutorials/main/nlights.txt'

# Each iteration yields one line of the file as bytes.
for line in urllib.request.urlopen(file_url):
    print(line.decode('utf-8'))

This code declares a URL where nlights.txt is located and saves this location to file_url.

Next, a for loop opens the specified URL and reads the contents one line at a time. Each line is output to the terminal after applying decode('utf-8').

💡 Note: urlopen() returns raw bytes. Calling decode('utf-8') converts them to a Python string, assuming the file is UTF-8 encoded.

Output

Each decoded line is a string. Blank lines separate the paragraphs because every line keeps its trailing newline and print() adds another.

The northern lights or the aurora borealis are beautiful dancing waves of light that have captivated people for millennia. But for all its beauty, this spectacular light show is a rather violent event.

Energized particles from the sun slam into Earth's upper atmosphere at speeds of up to 45 million mph (72 million kph), but our magnetic field protects us from the onslaught.

Earth's magnetic field redirects the particles toward the poles that transform into a cinematic atmospheric phenomenon that dazzles and fascinates scientists and skywatchers alike.
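
The blank lines in the output above can be avoided by stripping the line ending before printing. The sketch below shows one possible way to do this, and it also closes the connection with a with block. The helper name read_lines and the timeout value are assumptions made for illustration; they are not part of the original code.

```python
from urllib.request import urlopen


def read_lines(url, timeout=10):
    """Return the decoded lines found at url, without trailing line endings.

    read_lines and the 10-second timeout are illustrative choices.
    """
    # The with block closes the connection even if decoding fails.
    with urlopen(url, timeout=timeout) as response:
        # Iterating over the response yields one bytes object per line.
        return [raw.decode('utf-8').rstrip('\r\n') for raw in response]
```

Calling read_lines(file_url) and printing each item should display the paragraphs without the extra blank lines.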

Method 2: Use requests.get()

This method uses the requests library and its get() function to access the text file located at the specified URL, split the contents into lines, and output each line as a tuple.

file_url = 'https://raw.githubusercontent.com/finxter/FinxterTutorials/main/nlights.txt'
response = requests.get(file_url)

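# Continue only if the server reports success (HTTP 200).
if response.status_code == 200:
    data = response.text
    # enumerate() pairs each line with its index, starting at 0.
    for line in enumerate(data.split('\n')):
        print(line)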

This code declares a URL where nlights.txt is located and saves this location to file_url.

Then response is declared. This line attempts to connect to the URL shown above and returns a response object. If successful, printing response displays the following.

<Response [200]>

Next, the code tests to see if the response.status_code is 200 (successful connection). If true, the code inside the if statement executes as follows.

  • The variable data retrieves and saves all the text inside the nlights.txt file.
  • A for loop splits the text on the newline character ('\n'), pairs each line with its index using enumerate(), and outputs each resulting tuple using print().
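
The status check described above can also be handed to Requests itself. The following is a minimal sketch, assuming you would rather get an exception than have the code silently do nothing when the download fails. The helper name fetch_text and the timeout value are assumptions made for illustration.

```python
import requests


def fetch_text(url, timeout=10):
    """Download url and return its body as a string.

    fetch_text and the 10-second timeout are illustrative choices.
    """
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()
    return response.text
```

With this helper, fetch_text(file_url) should return the same text that response.text holds in the code above.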

Output

The output for this method is four (4) tuples, each containing an index (starting at 0) and the contents of the corresponding line.

(0, 'The northern lights or the aurora borealis are beautiful dancing waves of light that have captivated people for millennia. But for all its beauty, this spectacular light show is a rather violent event. ')
(1, "Energized particles from the sun slam into Earth's upper atmosphere at speeds of up to 45 million mph (72 million kph), but our magnetic field protects us from the onslaught. ")
(2, "Earth's magnetic field redirects the particles toward the poles that transform into a cinematic atmospheric phenomenon that dazzles and fascinates scientists and skywatchers alike.")
(3, '')

💡 Note: The final tuple, (3, ''), holds an empty string. It appears because the text ends with a newline character, so split('\n') returns one extra, empty item.
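
As a sketch of one way to avoid the trailing empty tuple, str.splitlines() can replace split('\n'). The helper name numbered_lines and the sample text are made up for illustration; the sample stands in for response.text.

```python
def numbered_lines(text):
    """Pair each line of text with its index, starting at 0."""
    # splitlines() does not return an empty final item
    # when the text ends with a newline.
    return list(enumerate(text.splitlines()))


sample = "first paragraph\nsecond paragraph\n"  # made-up stand-in for response.text

print(list(enumerate(sample.split('\n'))))
# [(0, 'first paragraph'), (1, 'second paragraph'), (2, '')]

print(numbered_lines(sample))
# [(0, 'first paragraph'), (1, 'second paragraph')]
```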


Method 3: Use urllib3.PoolManager()

This method uses the urllib3 library to create a urllib3.PoolManager() object. From this object, the code requests the contents (http.request('GET', file_url)) and decodes them with decode('utf-8').

file_url = 'https://raw.githubusercontent.com/finxter/FinxterTutorials/main/nlights.txt'
http     = urllib3.PoolManager()
response = http.request('GET', file_url)
data     = response.data.decode('utf-8')
print(data)

This code declares a URL where nlights.txt is located and saves this location to file_url.

Then a urllib3.PoolManager object is created and saved to http. If printed, this object looks similar to the following.

<urllib3.poolmanager.PoolManager object at 0x0000020CC37071F0>

Next, an HTTP request is sent to get ('GET') the contents from the specified URL and save the results to response.

Finally, the data from response is decoded using 'utf-8' decoding and output to the terminal.
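
The code above decodes whatever the server sends back, including the body of an error page, because urllib3 returns responses such as 404 as ordinary response objects. The sketch below adds a status check before decoding. The helper name fetch_with_urllib3 and the choice of RuntimeError are assumptions made for illustration.

```python
import urllib3


def fetch_with_urllib3(url, encoding='utf-8'):
    """Return the decoded body at url, or raise if the status is not 200.

    fetch_with_urllib3 and the RuntimeError are illustrative choices.
    """
    http = urllib3.PoolManager()
    response = http.request('GET', url)
    # response.status holds the numeric HTTP status code.
    if response.status != 200:
        raise RuntimeError(f'GET {url} returned HTTP {response.status}')
    # response.data holds the raw body as bytes.
    return response.data.decode(encoding)
```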

Output

The output from this method is a single string with no blank lines separating the paragraphs.

The northern lights or the aurora borealis are beautiful dancing waves of light that have captivated people for millennia. But for all its beauty, this spectacular light show is a rather violent event.
Energized particles from the sun slam into Earth's upper atmosphere at speeds of up to 45 million mph (72 million kph), but our magnetic field protects us from the onslaught.
Earth's magnetic field redirects the particles toward the poles that transform into a cinematic atmospheric phenomenon that dazzles and fascinates scientists and skywatchers alike.

Method 4: Use urllib.request.urlopen().read(n)

This method uses the urllib.request module and a one-liner to connect to the specified URL, read a specified number of bytes, and decode them with decode('utf-8').

file_url = 'https://raw.githubusercontent.com/finxter/FinxterTutorials/main/nlights.txt'
data = urlopen(file_url).read(203).decode('utf-8')
print(data)

This code accesses the specified URL, file_url, and reads in the first 203 bytes (read(n) counts bytes, not characters). In this case, this is the first paragraph of the file. The contents are then decoded ('utf-8'), saved to data, and output to the terminal.

Output

The output from this method is a String Data Type containing the first paragraph from the file.

The northern lights or the aurora borealis are beautiful dancing waves of light that have captivated people for millennia. But for all its beauty, this spectacular light show is a rather violent event.
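
Because read(n) counts bytes, a cut-off point can land in the middle of a multi-byte UTF-8 character, and a plain decode('utf-8') would then raise a UnicodeDecodeError. The sketch below shows one possible way to tolerate that. The helper name read_prefix, the timeout value, and the sample text are assumptions made for illustration.

```python
from urllib.request import urlopen

# A character is not always one byte in UTF-8: "é" takes two.
text = "Aurora caf\u00e9"
print(len(text), len(text.encode('utf-8')))  # 11 12


def read_prefix(url, n_bytes, timeout=10):
    """Return the first n_bytes of url decoded as UTF-8.

    read_prefix and the 10-second timeout are illustrative choices.
    """
    with urlopen(url, timeout=timeout) as response:
        raw = response.read(n_bytes)
    # errors='ignore' drops a character that was cut in half at the end.
    # It also silently drops any other invalid bytes, so use it with care.
    return raw.decode('utf-8', errors='ignore')
```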

Method 5: Use urllib.request.urlopen().read()

This method uses the urllib.request module and a one-liner to connect to the specified URL, read the entire contents, and decode them with decode('utf-8').

file_url = 'https://raw.githubusercontent.com/finxter/FinxterTutorials/1b754ac4eb0c9ee59fefa5008baf1ee6bfb9cc26/nlights.txt'
data = urlopen(file_url).read().decode('utf-8')
print(data)

This code declares a URL where nlights.txt is located and saves this location to file_url.

On one line, the specified URL is opened, read in, decoded, and saved to data. The output is then sent to the terminal.

Output

The output from this method is a single string with no blank lines separating the paragraphs.

The northern lights or the aurora borealis are beautiful dancing waves of light that have captivated people for millennia. But for all its beauty, this spectacular light show is a rather violent event.
Energized particles from the sun slam into Earth's upper atmosphere at speeds of up to 45 million mph (72 million kph), but our magnetic field protects us from the onslaught.
Earth's magnetic field redirects the particles toward the poles that transform into a cinematic atmospheric phenomenon that dazzles and fascinates scientists and skywatchers alike.
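
The one-liner above assumes the file is UTF-8 encoded. As a hedged variation, the sketch below uses the character set named in the response's Content-Type header when the server provides one, and falls back to UTF-8 otherwise. The helper name read_text, the fallback parameter, and the timeout value are assumptions made for illustration.

```python
from urllib.request import urlopen


def read_text(url, timeout=10, fallback='utf-8'):
    """Return the full contents of url as a string.

    read_text, fallback, and the 10-second timeout are illustrative choices.
    """
    with urlopen(url, timeout=timeout) as response:
        # get_content_charset() returns None when no charset is declared.
        charset = response.headers.get_content_charset() or fallback
        return response.read().decode(charset)
```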

Summary

These five (5) methods for reading a text file from a URL should give you enough information to select the best one for your coding requirements.

Good Luck & Happy Coding!