Python Get All TXT Files in a Folder

Imagine you have a project that requires you to process tons of text files, and these files are scattered throughout your folder hierarchy. By the time you finish reading this article, you’ll be equipped with the knowledge to efficiently fetch all the .txt files in any folder using Python.

Method 1: The os Module

The os module can be used to interact effectively with the file system. The method os.listdir() lists all files and directories in your target folder. You’ll use this method along with a for loop and the endswith() method to filter .txt files specifically.

Here’s the code snippet:

import os

directory = './your_folder/'
txt_files = []

for file in os.listdir(directory):
    if file.endswith('.txt'):
        txt_files.append(file)

print(txt_files)

This code imports the os module, sets the target directory, and initializes an empty list.

The for loop iterates through all the files and checks for the .txt extension using the endswith() method. Matching files are added to the list, which is printed at the end.
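
One caveat with the loop above: os.listdir() returns the names of directories as well as files, and endswith('.txt') is case-sensitive. Here is a minimal sketch that handles both, assuming you want to skip directories and also match names like NOTES.TXT. The helper name list_txt_files is hypothetical, not part of the os module.

```python
import os


def list_txt_files(directory):
    """Return the names of regular files in `directory` ending in .txt (any case)."""
    txt_files = []
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        # Skip subdirectories that happen to be named like "archive.txt".
        if os.path.isfile(full_path) and name.lower().endswith('.txt'):
            txt_files.append(name)
    # os.listdir() returns entries in arbitrary order, so sort for stable output.
    return sorted(txt_files)
```

You would call it as list_txt_files('./your_folder/') and get back a sorted list of file names.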

Method 2: The glob Module (My Fav 💫)

Another solution involves using the glob module, which allows you to find all the file paths in a directory that match a specific pattern. You can use the glob.glob() function to list all .txt files.

Here’s how you can do it:

import glob

directory = './your_folder/'
txt_files = glob.glob(f'{directory}*.txt')

print(txt_files)

This snippet imports the glob module, sets the target directory, and calls glob.glob(), which returns every path that matches the given pattern (*.txt). Unlike os.listdir(), the returned strings include the directory prefix rather than just the file name. The list of .txt files is then printed.
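
If you would rather not depend on a trailing slash in the directory string, you can build the pattern with os.path.join(). The following is a rough sketch under that assumption; the function names glob_txt_files and glob_txt_names are made up for illustration.

```python
import glob
import os


def glob_txt_files(directory):
    """Return sorted paths of entries in `directory` matching *.txt."""
    # glob.escape() protects characters such as "[" in the directory name
    # from being interpreted as part of the pattern.
    pattern = os.path.join(glob.escape(directory), '*.txt')
    return sorted(glob.glob(pattern))


def glob_txt_names(directory):
    """Same as glob_txt_files(), but with the directory prefix stripped."""
    return [os.path.basename(path) for path in glob_txt_files(directory)]
```

The first function gives you paths you can pass straight to open(); the second gives bare names, comparable to what os.listdir() returns.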

Method 3: os.listdir() and List Comprehension

The os.listdir() function is a simple way to list all entries in a directory. You can filter the names it returns with a list comprehension such as [file for file in os.listdir(dir_path) if file.endswith(".txt")].

See this example:

import os

dir_path = "your_directory_path"
all_files = os.listdir(dir_path)
txt_files = [file for file in all_files if file.endswith(".txt")]

print(txt_files)

This code lists all the text files in the specified directory using the os.listdir() function. 📃

Method 4: Using os.scandir()

The os.scandir() function yields an entry object for each item in the directory, which carries more information than a plain name (for example, whether the item is a file). Filtering these entries is a bit less concise but works just fine in a list comprehension: [entry.name for entry in os.scandir(dir_path) if entry.name.endswith(".txt") and entry.is_file()].

For instance, use the following code:

import os

dir_path = "your_directory_path"
txt_files = [entry.name for entry in os.scandir(dir_path) if entry.name.endswith(".txt") and entry.is_file()]

print(txt_files)
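
To illustrate the extra information that os.scandir() entries carry, here is a small sketch that returns each .txt file's name together with its size in bytes. The function name txt_files_with_size is a hypothetical helper, and collecting sizes is just one example of what entry.stat() offers.

```python
import os


def txt_files_with_size(dir_path):
    """Return sorted (name, size_in_bytes) pairs for .txt files in `dir_path`."""
    results = []
    # Using scandir() as a context manager closes the iterator promptly.
    with os.scandir(dir_path) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith('.txt'):
                # entry.stat() returns the same kind of result as os.stat().
                results.append((entry.name, entry.stat().st_size))
    return sorted(results)
```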

Method 5: Using glob.glob()

For a more concise solution, use the glob.glob() function from the glob module, as in Method 2. Here’s the code snippet to list text files:

import glob

dir_path = "your_directory_path"
txt_files = glob.glob(f"{dir_path}/*.txt")

print(txt_files)

The glob.glob() function returns a list of all paths that match the specified pattern (in this case, *.txt). ✨

Method 6: Using pathlib.Path.iterdir()

Finally, the pathlib.Path.iterdir() method offers another way to list text files in a directory. To use it, import Path from the pathlib module and write the following code:

from pathlib import Path

dir_path = Path("your_directory_path")
txt_files = [file.name for file in dir_path.iterdir() if file.is_file() and file.name.endswith(".txt")]

print(txt_files)

In this code, pathlib.Path.iterdir() returns an iterator over every entry in the directory (files and subdirectories alike). Combined with a list comprehension that keeps only files ending in .txt, it concisely lists all text files. 🎉
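
As a variation, pathlib can also do the pattern matching for you through Path.glob(). This is a minimal sketch, assuming you want Path objects back rather than plain names; txt_paths is a hypothetical helper name.

```python
from pathlib import Path


def txt_paths(dir_path):
    """Return sorted Path objects for .txt files directly inside `dir_path`."""
    directory = Path(dir_path)
    # Path.glob() yields Path objects; is_file() drops any matching directories.
    return sorted(p for p in directory.glob('*.txt') if p.is_file())
```

Each result is a Path, so attributes such as p.name and p.stem and methods such as p.read_text() are available without further conversion.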

Iterating Through Directories

In this section, you’ll learn how to iterate through directories using Python and get all the .txt files in a folder.

We’ll cover three methods: using the for loop method, working with the os.walk() function, and recursively traversing directories with a custom recursive function.

Using the For Loop Method

To get started, we’ll use the os.listdir() function with a for loop. This approach allows you to iterate over all files in a directory and filter by their extension.

This code lists all the .txt files in the specified directory using a simple for loop.

import os

directory = 'your_directory_path'
for filename in os.listdir(directory):
    if filename.endswith('.txt'):
        print(os.path.join(directory, filename))

Working with the os.walk() Function

The os.walk() function is another powerful tool for iterating over files in directories. It enables you to traverse a directory tree and retrieve all files with a specific extension:

import os

root_dir = 'your_directory_path'
for root, dirs, files in os.walk(root_dir):
    for file in files:
        if file.endswith('.txt'):
            print(os.path.join(root, file))

This code explores the entire directory tree, including subdirectories, and prints out the full paths of .txt files. 🌳
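
If you want the results in a list, or need to skip certain subdirectories, os.walk() lets you prune the traversal by modifying the dirs list in place. The sketch below assumes you want to ignore folders such as .git. The function name walk_txt_files and the default skip_dirs values are illustrative choices only.

```python
import os


def walk_txt_files(root_dir, skip_dirs=('.git', '__pycache__')):
    """Return sorted paths of .txt files under `root_dir`, skipping `skip_dirs`."""
    matches = []
    for root, dirs, files in os.walk(root_dir):
        # Editing `dirs` in place tells os.walk() not to descend into them.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for name in files:
            if name.endswith('.txt'):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```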

We have also written a detailed article, with a video, on this function. Feel free to check it out! 👇

🧑‍💻 Recommended: Python os.walk() – A Simple Illustrated Guide

Recursively Traversing Directories with a Recursive Function

Lastly, you could create a custom recursive function to traverse directories and collect .txt files. Because it is built on pathlib, the same code handles path separators on both Windows and Unix-like systems:

from pathlib import Path

def find_txt_files(path: Path):
    txt_files = []
    for item in path.iterdir():
        if item.is_dir():
            txt_files.extend(find_txt_files(item))
        elif item.name.endswith('.txt'):
            txt_files.append(item)
    return txt_files

directory = Path('your_directory_path')
txt_files = find_txt_files(directory)
print(txt_files)

This recursive function explores directories and subdirectories and returns a list of Path objects for the .txt files it finds. Because it uses Python 3’s pathlib module, you can work with the results directly, for example by reading a file’s name or contents from each Path.
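
For comparison, pathlib also ships a built-in recursive search, Path.rglob(), which can stand in for the hand-written recursion. The following is a short sketch under the assumption that you only want regular files; find_txt_files_rglob is a hypothetical name chosen to mirror the function above.

```python
from pathlib import Path


def find_txt_files_rglob(path):
    """Return sorted Path objects for .txt files under `path`, at any depth."""
    # rglob('*.txt') behaves like glob('**/*.txt'): it searches `path`
    # and all of its subdirectories.
    return sorted(p for p in Path(path).rglob('*.txt') if p.is_file())
```

The custom recursive function remains useful when you need more control, such as skipping particular folders while descending.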

Filtering Based on File Extension and Size

To get all the .txt files in a folder, you can use the glob module in Python, which provides an easy way to find files matching a specific pattern.

Here’s a simple code snippet to get started:

import glob

txt_files = glob.glob('path/to/your/folder/*.txt')
print(txt_files)

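This code returns the paths of all the .txt files within the specified folder. The paths are relative or absolute depending on the pattern you pass in: a relative pattern like the one above yields relative paths.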

Now that you have the .txt files, you might want to filter them based on their size. To achieve this, you can use the os module.

Here’s an example of how to filter .txt files by size:

import os
import glob

min_size = 1000  # Replace with your desired minimum file size in bytes

txt_files = glob.glob('path/to/your/folder/*.txt')
filtered_files = [file for file in txt_files if os.path.getsize(file) >= min_size]

print(filtered_files)

In this code, min_size represents the minimum file size in bytes that you wish to retrieve. By using a list comprehension with a condition, you can filter out the files that don’t meet your size requirements.

If you want to find .txt files not only in the target folder but also within its subdirectories, you can use the ** pattern along with the recursive parameter:

txt_files = glob.glob('path/to/your/folder/**/*.txt', recursive=True)

Using this approach, you can easily tailor your search to retrieve specific .txt files based on their size and location. With these tools at hand, you should be able to efficiently filter files in your Python projects.
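
The recursive pattern and the size filter from this section can be combined into one helper. This is only a sketch: the function name find_large_txt_files and the default threshold of 1000 bytes are assumptions for illustration.

```python
import glob
import os


def find_large_txt_files(folder, min_size=1000):
    """Return sorted paths of .txt files under `folder` of at least `min_size` bytes."""
    # With recursive=True, "**" matches the folder itself and any subdirectories.
    pattern = os.path.join(glob.escape(folder), '**', '*.txt')
    paths = glob.glob(pattern, recursive=True)
    return sorted(
        path for path in paths
        if os.path.isfile(path) and os.path.getsize(path) >= min_size
    )
```

Calling find_large_txt_files('path/to/your/folder', min_size=5000) would then return only the .txt files of 5000 bytes or more, at any depth.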

Operating System Compatibility

Python works well across different operating systems, including Unix and Windows, so you can use the same code on different platforms. For this task, both the os and glob modules are compatible with Unix and Windows systems, so you don’t have to worry about your text file retrieval code failing on either OS.

To get all the text files in a folder using Python, you can use the os and glob modules. This works on all major operating systems, including Linux, Windows, and macOS.

Here’s a code snippet to achieve this:

import os
import glob

os.chdir("your_directory_path")
txt_files = glob.glob('*.txt')
print(txt_files)

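Replace “your_directory_path” with the path of your folder containing the .txt files. Note that os.chdir() changes the working directory for the rest of the program, so any relative paths used later are resolved against that folder.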
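
If changing the working directory permanently is a concern, one option is to switch back afterwards. Below is a minimal sketch using a small context manager. The name working_directory is hypothetical and not part of the os or glob modules.

```python
import contextlib
import glob
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the original directory, even if an error occurs.
        os.chdir(previous)
```

Inside a block such as with working_directory("your_directory_path"): you can call glob.glob('*.txt') exactly as above, and the original working directory is restored once the block ends. Python 3.11 and later also provide contextlib.chdir() for the same purpose.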
