Problem Formulation and Solution Overview
To follow along, save the contents below to a flat-text file called mona_lisa.txt and move this file to the current working directory. The file has four lines, one of which is blank.
The Mona Lisa: A painting by Leonardo da Vinci
Leonardo da Vinci began painting the Mona Lisa about 1503, which was in his studio when he died in 1519. He worked on it intermittently over several years, adding multiple layers of thin oil glazes at different times.
Reference: https://www.britannica.com/topic/Mona-Lisa-painting
Method 1: Use open() and len()
This method uses three (3) functions, open(), len() and readlines(), to retrieve the file’s line count. Ideal for reasonably sized files, as it reads all lines into memory at once.
with open('mona_lisa.txt', 'r') as fp:
    line_count = len(fp.readlines())
print(line_count)
The code above opens the file mona_lisa.txt in read (r) mode, creating a File Object (similar to below). This object is assigned to fp, allowing access to and manipulation of the stated file.
<_io.TextIOWrapper name='mona_lisa.txt' mode='r' encoding='cp1252'>
The next line does the following:
- Reads the contents of the stated flat-text file into a list with one entry per line (readlines()).
- Passes that list as an argument to the len() function, which returns the file’s line count (including blank lines).
- Saves the result to line_count.
Then, line_count is output to the terminal.
4
Method 2: Use sum()
This method uses the sum() function. This function takes two (2) arguments: an iterable (required) and a start value (optional, 0 by default).
line_count = sum(1 for x in open('mona_lisa.txt', 'r'))
print(line_count)
The above code snippet calls the sum() function and passes it a generator expression that opens the mona_lisa.txt file in read (r) mode.
The generator yields one (1) for each line in the file (including blank lines), and sum() adds these up. The result is saved to line_count.
Then, line_count is output to the terminal.
4
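The one-liner in this method never explicitly closes the file it opens. As a minimal sketch, the same generator can be placed inside a with block so the file is closed as soon as counting finishes. The helper name count_lines and the throwaway sample file are assumptions made for illustration and are not part of the original example.

```python
import os
import tempfile


def count_lines(path):
    """Count every line in a text file, blank lines included."""
    # The with block closes the file once the count is complete.
    with open(path, 'r') as fp:
        # The generator yields 1 per line; sum() adds them up, starting from 0.
        return sum(1 for _ in fp)


# Demo on a temporary file so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'sample.txt')
    with open(sample, 'w') as fp:
        fp.write('first\n\nthird\n')
    print(count_lines(sample))  # 3
```

Calling count_lines('mona_lisa.txt') should give the same result as the one-liner, assuming the file is in the current working directory.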
Method 3: Use read() and split()
This method uses open(), read(), split() and len() to determine a file’s line count. Not as efficient as other solutions but gets the job done.
with open('mona_lisa.txt', 'r') as fp:
    all_lines = fp.read()
line_count = len(all_lines.split('\n'))
print(line_count)
The code above opens the mona_lisa.txt file in read (r) mode. Then, read() is called with no argument. The result is saved to all_lines.
💡Note: Passing no argument into read() means to read in the entire file (including blank lines).
Next, the contents of all_lines are split on the newline character (\n), and the number of resulting pieces (the total number of lines) is saved to line_count.
Then, line_count is output to the terminal.
4
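One thing to watch with split('\n'): if the file ends with a newline character, the split produces an extra empty string at the end, and the count comes out one too high. The short sketch below illustrates this with two hypothetical in-memory strings (not the contents of mona_lisa.txt) and compares the result with str.splitlines(), which does not produce that trailing empty entry.

```python
# Hypothetical sample text: three lines, the middle one blank.
text_without_final_newline = 'one\n\nthree'
text_with_final_newline = 'one\n\nthree\n'

# split('\n') counts the empty string after a final newline as an extra line.
print(len(text_without_final_newline.split('\n')))  # 3
print(len(text_with_final_newline.split('\n')))     # 4

# splitlines() gives the same count for both versions.
print(len(text_without_final_newline.splitlines()))  # 3
print(len(text_with_final_newline.splitlines()))     # 3
```

Note that splitlines() also breaks on other line boundaries, such as \r\n, so the two calls are not interchangeable for every file.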
Method 4: Use List Comprehension
This method uses List Comprehension and len() to retrieve the file’s line count while ignoring blank lines.
lines = [x for x in open('mona_lisa.txt') if len(x) > 1]
print(len(lines))
The code above opens the file mona_lisa.txt in read (r) mode, the default when no mode is given. Then each line is examined, and if the line length exceeds one (1), it is appended to lines.
💡Note: The code (if len(x) > 1) checks to see if the line in question contains data. A blank line consists of only a newline character (\n), which has a length of one (1), so it is not appended.
The contents of lines display below.
['The Mona Lisa: A painting by Leonardo da Vinci\n', 'Leonardo da Vinci began painting the Mona Lisa about 1503, which was in his studio when he died in 1519. He worked on it intermittently over several years, adding multiple layers of thin oil glazes at different times. \n', 'Reference: https://www.britannica.com/topic/Mona-Lisa-painting']
Then, the length of lines (len(lines)) is output to the terminal.
3
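The len(x) > 1 test treats a line that holds only spaces or tabs as data, because such a line is longer than one character. The sketch below uses a hypothetical in-memory sample (io.StringIO) in place of a real file and compares that test with a strip()-based check. The function names are illustrative only.

```python
import io

# Hypothetical content: the second line holds only spaces, the third is empty.
SAMPLE = 'title\n   \n\nbody\n'


def count_by_length(lines):
    """Count lines longer than one character (the test used in Method 4)."""
    return sum(1 for line in lines if len(line) > 1)


def count_by_strip(lines):
    """Count lines that still contain text once surrounding whitespace is removed."""
    return sum(1 for line in lines if line.strip())


print(count_by_length(io.StringIO(SAMPLE)))  # 3, the spaces-only line is counted
print(count_by_strip(io.StringIO(SAMPLE)))   # 2
```

Which behavior is wanted depends on whether whitespace-only lines should count as blank for the task at hand.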
Method 5: Use List Comprehension and a Generator
This method uses List Comprehension and a Generator to retrieve the file’s line count.
with open('mona_lisa.txt') as fp:
    line_count = [ln for ln in (line.strip() for line in fp) if ln]
print(len(line_count))
The code above opens the file mona_lisa.txt in read (r) mode, creating a File Object (similar to below). This object is assigned to fp, allowing access to and manipulation of the stated file.
<_io.TextIOWrapper name='mona_lisa.txt' mode='r' encoding='cp1252'>
The Generator strips any leading or trailing whitespace (including the newline) from each line, and List Comprehension loops through the stripped lines. If a line still contains data, it is appended to line_count.
Next, the length of line_count is determined (len(line_count)) and output to the terminal.
3
Bonus: Use NumPy loadtxt()
What if you needed to determine the line count from a file containing floating-point numbers? You could use NumPy’s loadtxt() function.
The contents of the flat-text file nums.txt.
110.90 146.03 |
import numpy as np
data = np.loadtxt('nums.txt')
print(len(data))
The first line imports the NumPy library. If this library is not installed, install it first (for example, with pip install numpy).
Then, nums.txt is read using NumPy’s loadtxt() function. The contents are saved to data as follows.
[[110.9 146.03] |
Then, len(data) is called to determine the file’s line count, and the result is output to the terminal.
5
Summary