5 Best Ways to Find Line Number of a Given Word in a Text File Using Python

💡 Problem Formulation: You have a text file and need to find the line number where a specific word occurs. For example, if your text file contains a poem and you’re searching for the word “dawn,” your Python code should return the line numbers where “dawn” appears. Ideally, it would handle multiple occurrences of the word and potentially ignore case differences.

Method 1: Using a Simple Loop with the enumerate Function

The first method pairs a plain for loop with the enumerate() function, which keeps a running count while iterating over the lines of the file. This approach is straightforward and easy to understand, making it a good fit for beginners. The logic is simple: read the file line by line, let enumerate() supply the line number, and check whether the target word is among the words on each line.

Here’s an example:

word_to_find = "dawn"
line_numbers = []
with open('poem.txt', 'r') as file:
    for line_number, line in enumerate(file, 1):
        if word_to_find in line.strip().split():
            line_numbers.append(line_number)

print(line_numbers)

Output:

[2, 4, 20]

In this snippet, the file 'poem.txt' is opened in read mode, and enumerate(file, 1) pairs each line with its line number, starting at 1. Each line is stripped of surrounding whitespace and then split into words on whitespace. If the target word equals any word on a line, the line number is appended to the list line_numbers. Because the split is on whitespace only, a token with attached punctuation such as “dawn,” does not count as a match.
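The problem statement mentions ignoring case, and poems tend to put punctuation right next to words. The following is a minimal sketch of one way to handle both, not a definitive implementation. The function name find_line_numbers, its ignore_case parameter, and the sample lines are hypothetical and chosen only for illustration.

```python
import string

def find_line_numbers(lines, word, ignore_case=False):
    """Return 1-based numbers of lines that contain `word` as a whole token."""
    target = word.casefold() if ignore_case else word
    matches = []
    for line_number, line in enumerate(lines, 1):
        # Strip surrounding punctuation so "dawn," still counts as "dawn".
        tokens = [token.strip(string.punctuation) for token in line.split()]
        if ignore_case:
            tokens = [token.casefold() for token in tokens]
        if target in tokens:
            matches.append(line_number)
    return matches

# Hypothetical sample lines standing in for the contents of poem.txt.
sample = [
    "Night lingers on the hill",
    "until the dawn, pale and slow,",
    "Dawn again, and birdsong",
]
print(find_line_numbers(sample, "dawn"))                    # [2]
print(find_line_numbers(sample, "dawn", ignore_case=True))  # [2, 3]
```

Since an open file object is itself an iterable of lines, the same function can be called as find_line_numbers(file, "dawn") inside a with open(...) block.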

Method 2: Using the readlines() Method

This method uses the readlines() method to obtain all lines in the text file and store them in a list. The list is then iterated with enumeration to find the line numbers that contain the target word. It is suitable for files that are not overwhelmingly large, as all lines are loaded into memory at once.

Here’s an example:

word_to_find = "dawn"
line_numbers = []
with open('poem.txt', 'r') as file:
    lines = file.readlines()
    for line_number, line in enumerate(lines, 1):
        if word_to_find in line.strip().split():
            line_numbers.append(line_number)

print(line_numbers)

Output:

[2, 4, 20]

After reading all lines into a list with readlines(), the code loops over the list and checks each line for the target word, similar to Method 1. Matching line numbers are collected in the line_numbers list, which is finally printed out.
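To see what readlines() actually returns, here is a small sketch that uses io.StringIO as a stand-in for an open text file. The sample text is made up for illustration. It also shows one thing the in-memory list makes easy: looking up the text of each matching line afterwards.

```python
import io

# io.StringIO stands in for an open text file; the text is hypothetical sample data.
fake_file = io.StringIO("first light\nthe dawn breaks\nnoon\n")

lines = fake_file.readlines()
print(lines)  # ['first light\n', 'the dawn breaks\n', 'noon\n']

word_to_find = "dawn"
line_numbers = [n for n, line in enumerate(lines, 1) if word_to_find in line.split()]

# Every line is still in memory, so the matching text can be fetched by index.
for n in line_numbers:
    print(n, lines[n - 1].rstrip("\n"))  # 2 the dawn breaks
```

Note that each list element keeps its trailing newline, which is why the snippet strips it before printing.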

Method 3: Using a Regular Expression Search

The third method uses Python’s re module to search each line for the word with a regular expression. This is powerful because it allows complex search patterns and word-boundary matching, which reduces false positives such as matching “dawn” inside “dawning”.

Here’s an example:

import re

word_to_find = "dawn"
# \b marks a word boundary; re.escape guards against regex metacharacters in the word
pattern = re.compile(r'\b' + re.escape(word_to_find) + r'\b')
line_numbers = []
with open('poem.txt', 'r') as file:
    for line_number, line in enumerate(file, 1):
        if pattern.search(line):
            line_numbers.append(line_number)

print(line_numbers)

Output:

[2, 4, 20]

The code compiles a regular expression that matches the word as a whole word within a line, reducing the chance of matching substrings within other words. It then iterates over each line of the file, adding the line number to the line_numbers list when a match is found.
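As a rough illustration of what word-boundary matching buys you, the sketch below builds the pattern in a small helper and tries it on a few strings. The helper name compile_word_pattern, its ignore_case parameter, and the sample strings are assumptions made for this example. re.IGNORECASE is the standard flag for case-insensitive matching.

```python
import re

def compile_word_pattern(word, ignore_case=False):
    """Build a whole-word pattern; re.escape neutralizes regex metacharacters."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile(r"\b" + re.escape(word) + r"\b", flags)

pattern = compile_word_pattern("dawn", ignore_case=True)

# Hypothetical sample strings: two real matches, two near misses.
samples = ["At dawn, we left", "Dawn came late", "a dawning hope", "predawn chill"]
results = [bool(pattern.search(s)) for s in samples]
print(results)  # [True, True, False, False]
```

Unlike the split-based methods, the pattern matches “dawn,” with trailing punctuation, because a comma is not a word character and so a boundary exists between “n” and “,”.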

Method 4: Using a List Comprehension

A more Pythonic approach, a list comprehension provides a succinct way to achieve the same result as Method 1 in a single expression. This is a slightly more advanced technique and demonstrates the power of Python’s expressive syntax.

Here’s an example:

word_to_find = "dawn"
with open('poem.txt', 'r') as file:
    line_numbers = [line_number for line_number, line in enumerate(file, 1) if word_to_find in line.strip().split()]

print(line_numbers)

Output:

[2, 4, 20]

In this example, the list comprehension iterates over the file, strips and splits each line, and evaluates whether the target word is in the line, all in a single, readable line of code. The line numbers are collected in the line_numbers list.
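If you want to try the comprehension without a poem.txt on disk, the following sketch writes a short, made-up poem to a temporary file first and removes it afterwards. The file contents are hypothetical, and the comprehension is split across lines purely for readability.

```python
import os
import tempfile

# Write a small hypothetical poem to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("The night is long\nbut dawn will come\nand after dawn the day\n")
    path = tmp.name

word_to_find = "dawn"
try:
    with open(path, "r", encoding="utf-8") as file:
        line_numbers = [
            line_number
            for line_number, line in enumerate(file, 1)
            if word_to_find in line.split()
        ]
finally:
    os.remove(path)  # clean up the temporary file

print(line_numbers)  # [2, 3]
```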

Bonus One-Liner Method 5: Using the filter Function with a lambda

The final method also results in a one-liner, but this time using the filter() function and a lambda expression to filter the lines containing the desired word. This is an elegant, functional programming approach to the problem.

Here’s an example:

word_to_find = "dawn"
with open('poem.txt', 'r') as file:
    line_numbers = list(filter(lambda x: word_to_find in x[1].strip().split(), enumerate(file, 1)))

print([number for number, _ in line_numbers])

Output:

[2, 4, 20]

In this compact snippet, filter() keeps only the (line_number, line) pairs from enumerate() whose line contains the word, and a list comprehension then extracts the line numbers from those pairs.
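It can help to look at the intermediate result before the line numbers are pulled out. This sketch runs the same filter() call on an in-memory list of made-up lines instead of a file, so the tuples are easy to inspect. The variable names are illustrative only.

```python
word_to_find = "dawn"

# Hypothetical lines, with newlines kept as a file iterator would yield them.
sample_lines = ["dusk falls\n", "dawn rises\n", "noon\n", "before dawn\n"]

matches = list(
    filter(lambda pair: word_to_find in pair[1].split(), enumerate(sample_lines, 1))
)
print(matches)  # [(2, 'dawn rises\n'), (4, 'before dawn\n')]

line_numbers = [number for number, _ in matches]
print(line_numbers)  # [2, 4]
```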

Summary/Discussion

Method 1: Using a Simple Loop with the enumerate Function. This method is easily understood by beginners and is memory-efficient, because the file is read one line at a time. However, it’s more verbose than the one-liner variants.

Method 2: Using the readlines() Method. Suitable for smaller files and very straightforward. Inefficient for large files because it loads all lines into memory at once.

Method 3: Using a Regular Expression Search. Powerful and robust, it allows for accurate word-boundary matching. More complex and slightly less readable than simpler methods.

Method 4: Using a List Comprehension. It provides an elegant, Pythonic solution in a single line of code. However, it can be a bit less readable for those not well-versed in Python syntax.

Bonus Method 5: Using the filter Function with a lambda. A functional approach that’s clean and concise. May be difficult for beginners to parse and understand.