5 Best Ways to Compress Files With zipfile Module in Python

πŸ’‘ Problem Formulation: Developers often need to programmatically compress files to save on storage space or to make file transfers more efficient. The Python zipfile module provides functionalities for reading, writing, and creating ZIP archives. This article explains how to use the zipfile module to compress various types of files. As an example, we will explore the compression of a collection of text files into a single ZIP file, which should result in a smaller combined file size.

Method 1: Using zipfile to Create a New ZIP File

This method involves initializing a new ZIP file and then adding files one by one to the archive. The ZipFile class in the zipfile module is used to create a writeable archive, and files are added using the write() method.

Here’s an example:

import zipfile

with zipfile.ZipFile('example.zip', 'w') as myzip:
    myzip.write('document1.txt')
    myzip.write('image1.png')

Output: An ‘example.zip’ file is created containing ‘document1.txt’ and ‘image1.png’

In this snippet, ‘example.zip’ is the name of the new ZIP file being created, and the ‘w’ parameter specifies that we are opening the file in write mode. We then write two files ‘document1.txt’ and ‘image1.png’ into this ZIP file. Note that the files must exist in the same directory as the script or you must specify their paths.

Method 2: Adding Files with Compression

To further reduce the file size, Python’s zipfile module allows specifying the compression algorithm. Using the ZIP_DEFLATED option, we can add files with deflated compression to make the archive even smaller.

Here’s an example:

import zipfile

with zipfile.ZipFile('example.zip', 'w', zipfile.ZIP_DEFLATED) as myzip:
    myzip.write('document1.txt')

Output: A more compressed ‘example.zip’ file is created, containing ‘document1.txt’

Here, we open ‘example.zip’ with the additional parameter zipfile.ZIP_DEFLATED to apply the standard compression algorithm, which effectively reduces the file size more than just storing the file uncompressed. This method is great for text and other deflate-friendly data.

Method 3: Adding Multiple Files from a Directory

When working with directories full of files, you can compress all of them into a ZIP file using a loop. This avoids manually adding each file.

Here’s an example:

import zipfile
import os

with zipfile.ZipFile('example.zip', 'w') as myzip:
    for foldername, subfolders, filenames in os.walk('my_folder'):
        for filename in filenames:
            myzip.write(os.path.join(foldername, filename))

Output: ‘example.zip’ containing all files from ‘my_folder’

This script compresses every file within ‘my_folder’ into ‘example.zip’. The os.walk() function walks through the directory tree, and for each file encountered, it adds the file to the ZIP archive. This is an efficient way to handle batch file compression.

Method 4: Using Context Manager for Cleaner Code

Python’s context managers make code cleaner and more readable. By using a context manager, we ensure that the ZIP file is properly closed after its creation, even if errors occur during the file operations.

Here’s an example:

import zipfile

with zipfile.ZipFile('example.zip', 'w') as myzip:
    myzip.write('document1.txt')
    # Add more file operations here

Output: ‘example.zip’ with the file ‘document1.txt’

Note that this is not a different method of compression but a coding practice that enhances resource management. By using the with statement, the ZIP file is automatically closed after the block of code is executed, which mitigates the risk of leaving files opened unintentionally.

Bonus One-Liner Method 5: Compress a File in a Single Statement

For quick tasks where you want to zip a single file without fuss, Python allows you to compress files using a one-liner command.

Here’s an example:

zipfile.ZipFile('example.zip', 'w').write('document1.txt')

Output: ‘example.zip’ with ‘document1.txt’

This code does everything in one line without explicitly opening or closing the ZIP file. This one-liner can be useful in scripts or command line operations where simplicity and brevity are desired. However, this should be used with caution, as it may not close the ZIP file properly if an error occurs.

Summary/Discussion

  • Method 1: Creating a New ZIP File. Straightforward and easy to use. Less efficient for multiple files.
  • Method 2: Adding Files with Compression. Efficient in reducing file size. Requires understanding of compression formats.
  • Method 3: Adding Multiple Files from a Directory. Streamlines batch processing. May include unwanted files if not handled carefully.
  • Method 4: Using Context Manager. Clean and safe. It’s more of a best practice than a compression method.
  • Bonus Method 5: One-Liner. Quick and easy. Does not ensure proper resource management.