How to Extract a Zip File in Python

5/5 - (10 votes)

Problem Formulation and Solution Overview

This article will show you how to extract a zip file in Python.

A ZIP file is an archive file format that compresses one or more files into one primary location. This file type is known to quickly and easily transfer large amounts of data.

Let’s say you have been asked to download the latest U.S. Zip Codes and Canadian Postal Code CSV files. These files are located inside a zip file at a specified URL. To follow along, download this zip file and save it into the current working directory.


πŸ’¬ Question: How would we write code to extract the contents of a Zip file?

We can accomplish this task by one of the following options:


Method 1: Use extractall()

This example imports Python’s built-in zipfile library and then calls the extractall() function to extract the contents of a zip file.

from zipfile import ZipFile
zip_file = 'ZipCodeFiles.zip'
  
with ZipFile(zip_file, 'r') as zp:
    zp.extractall()
    zp.printdir()

The first line in the above code imports the ZipFile function from the zipfile library. This allows for file extraction.

The following line declares the name of the zip file downloaded earlier. This file should reside in the current working directory.

Next, this file is opened in read (r) mode, and a file object, zp, is created. This object provides access to and manipulation of files. If output to the terminal, an object similar to the one below will display.

<zipfile.ZipFile filename='ZipCodeFiles.zip' mode='r'>

πŸ’‘Note: To perform file manipulations, the object created above must first be referenced, then a function to apply to this object is appended (example, zp.extractall()).

The following line calls extractall() to extract the zip file’s contents. These files are placed inside the current working directory.

The printdir() function is called and outputs a list of the extracted files to the terminal, as shown below.

File NameModifiedSize
CanadianPostalCodes202208.csv2022-08-31 08:46:2251874003
USZIPCodes202208.csv2022-08-31 08:46:224521460

An interesting point is that if you output the contents of the file object zp, at this point, an object similar to the one below will display.

<zipfile.ZipFile [closed]>

Notice that the file object is now closed, indicating that no further file manipulation can occur.


Method 2: Use unpack_archive()

This example imports the shutil library and then calls the shutil.unpack_archive() function to extract the zip file.

import shutil

zip_file = 'ZipCodeFiles.zip'
shutil.unpack_archive(zip_file, 'csv'

The first line in the above code imports the shutil library. This library offers numerous file operations, including unpack.archive(). This function extracts a zip (archive) file.

πŸ‘‰ Recommended Tutorial: Python shutil: High-Level File Operations Demystified

The following line declares the name of the zip file downloaded earlier. This file should reside in the current working directory.

The next line calls the shutil.unpack_archive() function and passes it into two (2) arguments: the name of the zip file saved earlier and the path or folder to which to save the extracted files. In this case, the CSV folder.

Once this code has been run, the following two (2) CSV files should now reside in the CSV folder.

CSV
CanadianPostalCodes202208.csv
USZIPCodes202208.csv

Method 3: Use Shell Command

This example runs at the command prompt in the shell and imports the zipfile library to extract the zip file.

$ python -m zipfile -e 'ZipCodeFiles.zip' 'csv'

Navigate to the command prompt. In this example, our command prompt has a $ sign. Your command prompt may differ.

This line imports the zipfile library, and uses -e to indicate extract the zip file ZipCodeFiles.zip into the CSV directory.

Once this code has been run, the following two (2) CSV files should now reside in the CSV folder.

CSV
CanadianPostalCodes202208.csv
USZIPCodes202208.csv

Bonus: Extract Multiple Zip Files

What happens if you have multiple zip files? What method can you use to extract these file types?

import zipfile
import os

os.chdir('zipped')

for file in os.listdir('.'):
    with zipfile.ZipFile(file, 'r') as zp:
        zp.extractall('.')

The first line in the above code imports the ZipFile function from the zipfile library. This allows for file extraction on single or multiple zip files. The os library is also imported to allow the interaction of files and directories.

The following line changes the directory to zipped, where all the zip files for this example are located. The contents of this folder are shown below.

zipped
zip_file_1.zip
zip_file_2.zip
zip_file_3.zip

Next, a for loop is instantiated to iterate through each file inside this folder (os.listdir('.')) shown above. Upon each iteration, the appropriate zip file is opened in read (r) mode and extracted to the current directory. Which is now zipped.

zipped
file_1.csv
file_2.csv
file_3.csv

zip_file_1.zip
zip_file_2.zip
zip_file_3.zip

Summary

This article has provided four (4) ways to extract a zip file to select the best fit for your coding requirements.

Good Luck & Happy Coding!


Programming Humor – Python

“I wrote 20 short programs in Python yesterday. It was wonderful. Perl, I’m leaving you.”xkcd