A newline is used to mark the end of a line and the beginning of a new one, and in this article we are going to look at how to read a file in Python without these newline breaks.
To begin with, we are going to create a simple .txt file. After each line of text, the enter/return key on the keyboard is pressed, creating a newline in the file itself. For illustration purposes only, this is shown explicitly with the word (return) below. We are saving the text in a file called newline.txt:
    Hello(return)
    my(return)
    name(return)
    is(return)
    Rikesh.(return)
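If you would rather create the sample file from code than type it by hand, the following is a minimal sketch. It assumes a temporary directory as the location (our choice, not part of the original setup), and the names `workdir`, `sample_path` and `sample_text` are purely illustrative.

```python
import os
import tempfile

# Each "\n" stands for one press of the enter/return key.
sample_text = "Hello\nmy\nname\nis\nRikesh.\n"

# Assumption: a temporary directory is used, so the file is removed afterwards.
with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, "newline.txt")

    # newline="" disables newline translation, so "\n" is stored as-is.
    with open(sample_path, "w", newline="") as f:
        f.write(sample_text)

    # Read the file back, again without translation, to confirm what was stored.
    with open(sample_path, "r", newline="") as f:
        content = f.read()

# repr() shows the newline characters explicitly instead of rendering them.
print(repr(content))
```

To keep the file for the rest of the article, write to "newline.txt" in your working directory instead of the temporary path.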
Background: Opening and Reading a File
Now that we have our basic .txt file, let’s start by seeing what happens when we open and read the file. When we open a file in Python, we can read it by passing the 'r' parameter in our open statement. The simplest method of opening and reading a file is as follows:
    file = open("newline.txt", "r")
    file.read()
    # 'Hello\nmy\nname\nis\nRikesh.\n'
Using this method, we can see that the newline is being read by Python and represented by the '\n' character. This \n is the Python special character for a newline.
A much cleaner way of opening files in Python is the 'with open' statement, as this will automatically close the file once finished. We are going to keep reading the file using the 'r' parameter and will run a print statement on the result:
    with open("newline.txt", "r") as file:
        line = file.read()
        print(line)
    # Hello
    # my
    # name
    # is
    # Rikesh.
Whilst this may appear different from the previous example, by using our print statement we have simply asked Python to render each newline as an actual line break. The newline characters are still there, so effectively our output looks like this:
    Hello\nmy\nname\nis\nRikesh.\n
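The difference between the two views can be checked directly. This is a small sketch that uses a string with the same content as newline.txt rather than the file itself (an assumption, to keep it self-contained); the name `shown` is illustrative.

```python
# Same content as newline.txt, held in a string for illustration.
text = "Hello\nmy\nname\nis\nRikesh.\n"

# print() renders each newline as a line break.
print(text)

# repr() shows each newline as the two-character escape sequence \n.
shown = repr(text)
print(shown)
```

The first print spreads the words over separate lines, whereas the second keeps everything on one line with the \n markers visible.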
Method 1: Splitting with splitlines() and split(‘\n’)
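The splitlines() method splits a string at its line boundaries and returns the lines as a list, with the line-break characters removed:
    with open("newline.txt", "r") as file:
        line = file.read().splitlines()
        print(line)
    # ['Hello', 'my', 'name', 'is', 'Rikesh.']
The Python split() method effectively does the same thing, but we can specify the separator, i.e., the point at which we wish the split to take place. In our example it would be at the \n character, which as we saw is the Python representation of a newline. Note that, because our file ends with a newline, split("\n") also returns an empty string as the final item:
    with open("newline.txt", "r") as file:
        line = file.read().split("\n")
        print(line)
    # ['Hello', 'my', 'name', 'is', 'Rikesh.', '']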
Whilst both of these methods remove the newlines, by default each of our original lines of text has been returned as a separate item in a list. This, obviously, has limited functionality unless our initial file contained individual string items we wanted to keep separate in the first place — for example, a list of numbers. In our example, with a pure text-only file the output is less useful.
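If a list is not what we want, the items can be joined back into a single string, or converted when the file holds numbers. The sketch below works on strings with assumed content (the first matches newline.txt, the second is a made-up list of numbers), and all variable names are illustrative.

```python
# Same content as newline.txt, held in a string for illustration.
text = "Hello\nmy\nname\nis\nRikesh.\n"
words = text.splitlines()

# Join the items back together, with or without a space between them.
sentence = " ".join(words)
run_together = "".join(words)
print(sentence)
print(run_together)

# Hypothetical file content: one number per line.
number_text = "1\n2\n3\n"
numbers = [int(item) for item in number_text.splitlines()]
print(numbers)
```

Joining with a space gives a readable sentence, while the list form suits the numeric case, where each item is processed separately.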
Method 2: Stripping with strip() and rstrip()
In Python, the strip() method is used to remove whitespace at the beginning (leading) and the end (trailing) of a string. By default, this includes not only spaces but newline characters as well. This is better illustrated with some small changes to our original file:
newline_space.txt: Hello (return) my (return) name (return) is(return) Rikesh. (return)
Although the actual text is the same, we have added some whitespace before and after our text entries. The final thing to note with this method is that, as it works through our file on a string-by-string basis, we need to iterate over our file to ensure strip() is applied to each string:
    with open("newline_space.txt", "r") as file:
        newline_breaks = ""
        for line in file:
            stripped_line = line.strip()
            newline_breaks += stripped_line
        print(newline_breaks)
    # HellomynameisRikesh.
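To see why the iteration matters, we can compare stripping the whole text once with stripping each line. The sketch below uses io.StringIO as a stand-in for an opened file, with made-up content (both are assumptions to keep it self-contained); the variable names are illustrative.

```python
import io

# io.StringIO stands in for the opened file; the content is made up.
fake_file = io.StringIO(" Hello \n my \n name \n")

# Iterating over a file yields one string per line, newline included.
raw_lines = list(fake_file)
print(raw_lines)

# strip() on the whole text only touches the two ends of the string.
whole = "".join(raw_lines).strip()
print(repr(whole))

# strip() on each line removes the whitespace and newline around every entry.
per_line = "".join(line.strip() for line in raw_lines)
print(repr(per_line))
```

Only the per-line version removes the newlines in the middle of the text.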
As we can see, the strip() method has not only got rid of the newline but all the leading and trailing whitespace as well. Whilst this can be a useful feature, what if we wanted to keep the whitespace and just get rid of the newline character? Well, we can do this by passing '\n' as the parameter in our strip() method:
    with open("newline_space.txt", "r") as file:
        newline_breaks = ""
        for line in file:
            stripped_line = line.strip('\n')
            newline_breaks += stripped_line
        print(newline_breaks)
    # Hello my name is Rikesh.
As the strip() method affects both leading and trailing characters, we can instead use rstrip() to remove only the trailing characters, i.e., those at the end of the string. As newline breaks tend to be at the end of a string, this method is preferred to lstrip(), which only affects characters at the beginning of the string. Once again, we can pass '\n' as the parameter to ensure we only remove the newline characters:
    with open("newline_space.txt", "r") as file:
        newline_breaks = ""
        for line in file:
            stripped_line = line.rstrip('\n')
            newline_breaks += stripped_line
        print(newline_breaks)
    # Hello my name is Rikesh.
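The differences between these variants are easiest to see side by side on a single line. This is a minimal sketch using a made-up line with spaces on both sides; the variable names are illustrative.

```python
# A made-up line with leading and trailing spaces and a newline at the end.
line = "  Hello  \n"

stripped_all = line.strip()        # removes spaces and the newline
stripped_nl = line.strip("\n")     # removes only newlines, at either end
rstripped_nl = line.rstrip("\n")   # removes only newlines at the end
lstripped_nl = line.lstrip("\n")   # nothing to remove at the start

for result in (stripped_all, stripped_nl, rstripped_nl, lstripped_nl):
    print(repr(result))

# rstrip("\n") removes every trailing newline, not just one.
several = "Hello\n\n\n".rstrip("\n")
print(repr(several))
```

Note that rstrip('\n') removes all trailing newlines, which is worth knowing if blank lines at the end of a string are meaningful in your file.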
Method 3: Slicing
Another way to remove the newline is by slicing, but it should be noted this should be used with extreme caution as it is less targeted than our other methods. With slicing we can ask Python to remove the last character of each string, through the negative slice [:-1]. As with strip(), we have to iterate over our file:
    with open("newline.txt", "r") as file:
        newline_breaks = ""
        for line in file:
            stripped_line = line[:-1]
            newline_breaks += stripped_line
        print(newline_breaks)
    # HellomynameisRikesh.
However, please bear in mind that slicing is indiscriminate: it does not care what the last character is, and we cannot specify this. So, although it works when our original file is consistent and has all the newline breaks in the right places, what happens if that’s not the case? Let’s change our original file to make it less consistent, and more like the kind of real-world file we are likely to be dealing with:
newline_slice.txt:
    Hello(return)
    my(return)
    name(return)
    is(return)
    Rikesh
In this file, the full stop and return at the end of the last line have been removed, so the last character of the file is 'h'. It is important to note that there are no whitespaces or returns after this character. Now, if we try slicing this file:
    with open("newline_slice.txt", "r") as file:
        newline_breaks = ""
        for line in file:
            stripped_line = line[:-1]
            newline_breaks += stripped_line
        print(newline_breaks)
    # HellomynameisRikes
The output has sliced (chopped-off) the last character of my name. We therefore need to be sure of the integrity and formatting of our original file before we can use this method, otherwise we risk losing data.
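One way to reduce this risk is to slice only when the last character really is a newline. The sketch below is one possible guard, not the only one; `drop_trailing_newline` is a hypothetical helper name, and the list of lines mimics what iterating over newline_slice.txt would give us.

```python
def drop_trailing_newline(line):
    """Remove one trailing newline if present (hypothetical helper)."""
    return line[:-1] if line.endswith("\n") else line


# The lines as they would be read from newline_slice.txt.
lines = ["Hello\n", "my\n", "name\n", "is\n", "Rikesh"]

# The blind slice loses the final 'h'; the guarded slice does not.
blind = "".join(line[:-1] for line in lines)
guarded = "".join(drop_trailing_newline(line) for line in lines)

print(blind)
print(guarded)
```

With the check in place, the last line is left untouched because it does not end in a newline.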
Method 4: Replace
The final method we are going to look at is replace(). As the name suggests, we can use this to replace a specific phrase in our string with another specific phrase. As we would expect the newline break to be used when there is some kind of logical break in our text, an obvious choice would be replacing it with a whitespace, which we can do with " ". In this example we again iterate over our file:
    with open("newline.txt", "r") as file:
        newline_breaks = ""
        for line in file:
            stripped_line = line.replace('\n', " ")
            newline_breaks += stripped_line
        print(newline_breaks)
    # Hello my name is Rikesh.
Whilst this has given us the most cleanly formatted of all our examples, this is only because of the formatting of our original file. However, replace() does have the flexibility to allow the newline characters to be replaced with whatever is most appropriate for our particular file.
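Because replace() acts on every occurrence in a string, it can also be applied to the whole text in one go. The following is a small sketch on a string with the same content as newline.txt (an assumption, to keep it self-contained); the variable names and the choice of ", " as a separator are illustrative.

```python
# Same content as newline.txt, held in a string for illustration.
text = "Hello\nmy\nname\nis\nRikesh.\n"

# Every newline becomes a space, including the final one.
replaced = text.replace("\n", " ")
print(repr(replaced))

# rstrip() removes the space left behind by the final newline.
trimmed = replaced.rstrip()
print(repr(trimmed))

# Dropping the final newline first lets us use any separator we like.
comma_separated = text.rstrip("\n").replace("\n", ", ")
print(comma_separated)
```

Note that a plain replace leaves a trailing space where the final newline used to be, so a closing rstrip() may be needed depending on the use case.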
We have seen a number of different methods for reading a file without newlines. Each one is effective in its own way and does the job of removing our newlines, so there is no single right answer. The correct method will depend on the original file we are working from, both in terms of content (plain text, integers) and formatting (whitespace, consistency).
If you need to keep items separated, splitting might be the best option. If you need to concatenate the output, replacing or stripping could be the answer. Opening and reading the file first, without any formatting, to assess its content and structure is the most important step in deciding which method best suits your needs.