5 Best Ways to Check if a CSV File is Empty in Python


💡 Problem Formulation: Many data processing tasks need to know whether a CSV (Comma-Separated Values) file is empty before doing any further work. An empty CSV file, meaning one with no content at all or one with a header but no data rows, can cause exceptions or errors downstream if it is not handled properly. The input is the path to a CSV file, and the desired output is a boolean that indicates whether the file is empty.

Method 1: Using os.stat()

The os.stat() function in Python provides an interface to retrieve the file system status for a given path. Specifically, it can be used to check the size of a file. An empty file has a size of 0 bytes, which can directly indicate if the file contains any data. This method is effective for quickly determining file emptiness without opening the file.

Here’s an example:

import os

def is_csv_empty(file_path):
    return os.stat(file_path).st_size == 0

empty = is_csv_empty('empty_file.csv')
print(empty)

Output:

True

This code defines a function is_csv_empty() that takes a file path as an argument and returns True if the file is empty, or False otherwise. It calls os.stat() and compares the st_size attribute of the result, the file size in bytes, with 0.
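One caveat is worth seeing in action: a size check counts every byte as content, so a file that holds only a header line is reported as non-empty. The sketch below illustrates this with temporary files. The helper write_temp_csv() is a hypothetical name introduced only for this demonstration.

```python
import os
import tempfile

def is_csv_empty(file_path):
    # Same size-based check as in Method 1
    return os.stat(file_path).st_size == 0

def write_temp_csv(content):
    # Hypothetical helper: writes content to a temporary .csv file and returns its path
    with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                     encoding='utf-8', newline='') as tmp:
        tmp.write(content)
    return tmp.name

zero_byte = write_temp_csv('')
header_only = write_temp_csv('name,age\n')

print(is_csv_empty(zero_byte))    # True
print(is_csv_empty(header_only))  # False: the header bytes count as content

# Clean up the temporary files
os.remove(zero_byte)
os.remove(header_only)
```

If a header-only file should count as empty in your pipeline, a row-aware check such as Method 3 is probably the better fit.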

Method 2: Checking with open() and read()

Opening a file and trying to read from it is a direct way to find out whether it is empty. The built-in open() function opens the file, and the read() method returns its content as a string. Reading an empty file returns an empty string.

Here’s an example:

def is_csv_empty(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read() == ''

empty = is_csv_empty('empty_file.csv')
print(empty)

Output:

True

This function opens the file in read-mode and checks if the content read from the file is an empty string, indicating that the file is empty.
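Since a single character is enough to prove that a file is not empty, the same idea can be sketched without loading the whole file. The variant below passes a size of 1 to read(). The function name is_csv_empty_fast() is an assumption made for this example, not part of the original method.

```python
def is_csv_empty_fast(file_path):
    # Hypothetical variant: request at most one character instead of the full content
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read(1) == ''
```

It should return the same result as the version above while keeping memory use small for large files.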

Method 3: Using CSV Reader

Python’s csv module provides a way to read and write CSV files. The csv.reader() object reads rows from the CSV file. If there are no rows to read except for possibly a header, the file is empty. This method is particularly useful for CSV files that have a header row.

Here’s an example:

import csv

def is_csv_empty(file_path):
    with open(file_path, 'r', encoding='utf-8', newline='') as csvfile:  # newline='' is recommended by the csv docs
        reader = csv.reader(csvfile)
        next(reader, None)  # Skip header
        return not any(row for row in reader)

empty = is_csv_empty('empty_with_header.csv')
print(empty)

Output:

True

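This code skips the header using next() and checks if there are any remaining rows. The default value None passed to next() prevents a StopIteration error when the file has no lines at all. The expression not any(row for row in reader) returns True when no data rows are present. Because csv.reader() returns blank lines as empty lists, which are falsy, trailing blank lines are not counted as data.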
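To see why any() works as the test here, it can help to look at what csv.reader() actually yields. The sketch below parses an in-memory sample (an assumption made for illustration, standing in for a real file) that contains a header followed by two blank lines.

```python
import csv
import io

# In-memory stand-in for a file with a header and two blank lines
sample = io.StringIO('name,age\n\n\n')
rows = list(csv.reader(sample))

print(rows)               # [['name', 'age'], [], []]
print(not any(rows[1:]))  # True: the blank lines are empty lists, so no data rows
```

Note that a line such as a lone comma would be parsed as a row of empty strings, which is a non-empty list and therefore counts as data.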

Method 4: Examining Line Count

Another method counts the number of lines in the file, which can be done by iterating over the file object or by reading the lines into a list. For CSV files, a line count of zero, or of one when a header is present, means the file can effectively be considered empty.

Here’s an example:

def is_csv_empty(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return len(file.readlines()) <= 1

empty = is_csv_empty('empty_with_one_line_header.csv')
print(empty)

Output:

True

The code opens the file and reads all lines into a list with file.readlines(). Then it checks if the length of the list is less than or equal to 1, indicating the file is empty or only contains a header.
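The check only needs to know whether a second line exists, so the full line count is not required. As a minimal sketch of that idea, the version below uses itertools.islice() to pull at most two lines from the file object.

```python
from itertools import islice

def is_csv_empty(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        # Read at most two lines; a second line means there is something after the header
        first_two = list(islice(file, 2))
    return len(first_two) <= 1
```

Like the readlines() version, this treats a trailing blank line after the header as a second line, so such a file is reported as non-empty.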

Bonus One-Liner Method 5: Using pathlib

The modern pathlib module in Python provides an object-oriented interface to the filesystem. Its Path class has no dedicated emptiness check, but Path.stat() returns the same status information as os.stat(), so the size comparison fits in a succinct one-liner.

Here’s an example:

from pathlib import Path

empty = Path('empty_file.csv').stat().st_size == 0
print(empty)

Output:

True

Similar to Method 1, this check uses the file status information. However, it does so using the more modern Path object, making the code concise and readable.

Summary/Discussion

  • Method 1: Using os.stat(). Strengths: Fast and efficient, doesn’t need to open the file. Weaknesses: Does not distinguish between files that contain only a header and truly empty files; a header-only file is reported as non-empty.
  • Method 2: Checking with open() and read(). Strengths: Simple and straightforward. Weaknesses: Inefficient for large files as it reads the entire file content to check if it’s empty.
  • Method 3: Using CSV Reader. Strengths: Accurately checks for data rows, ignoring the header. Weaknesses: Slightly more complex, may be overkill for simple checks.
  • Method 4: Examining Line Count. Strengths: Works well for files with headers. Weaknesses: Inefficient for large files, as it loads all lines into memory.
  • Bonus Method 5: Using pathlib. Strengths: Modern, clean syntax. Weaknesses: Like Method 1, does not account for headers.
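The trade-offs above suggest one possible combination: run the cheap size check first and fall back to the row-aware check only when the file has some bytes in it. The sketch below is one way to do that under stated assumptions; the has_header parameter is hypothetical and is not part of any of the methods above.

```python
import csv
from pathlib import Path

def is_csv_empty(file_path, has_header=True):
    path = Path(file_path)
    # Fast path: a zero-byte file is empty regardless of any header convention
    if path.stat().st_size == 0:
        return True
    with path.open('r', encoding='utf-8', newline='') as csvfile:
        reader = csv.reader(csvfile)
        if has_header:
            next(reader, None)  # Skip the header row (hypothetical option)
        # Blank lines are parsed as empty lists, which any() treats as falsy
        return not any(reader)
```

Whether a header-only file should count as empty depends on the pipeline, which is why the choice is left to the caller here.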