Easiest way to ignore blank lines when reading a file in Python
I have some code that reads a file of names and creates a list:
names_list = open("names", "r").read().splitlines()
Each name is separated by a newline, like so:
Allman
Atkinson
Behlendorf
I want to ignore any lines that contain only whitespace. I know I can do this by creating a loop, checking each line I read, and adding it to a list if it's not blank.
I was just wondering if there was a more Pythonic way of doing it?
Solution 1:
I would stack generator expressions:
with open(filename) as f_in:
    lines = (line.rstrip() for line in f_in)  # All lines including the blank ones
    lines = (line for line in lines if line)  # Non-blank lines
Now, lines is all of the non-blank lines. This will save you from having to call strip on the line twice. If you want a list of lines, then you can just do:
with open(filename) as f_in:
    lines = (line.rstrip() for line in f_in)
    lines = list(line for line in lines if line)  # Non-blank lines in a list
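One thing to watch with the generator version: it is lazy, so it has to be consumed while the file is still open. Here is a minimal sketch of that, assuming a throwaway temp file with made-up sample names (the variable names inside, lazy, and error are just for illustration):

```python
import os
import tempfile

# Write a small throwaway file (hypothetical sample data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Allman\n\nAtkinson\n   \nBehlendorf\n")
    path = tmp.name

with open(path) as f_in:
    lines = (line.rstrip() for line in f_in)
    lines = (line for line in lines if line)
    inside = list(lines)  # consumed while the file is still open

with open(path) as f_in:
    lazy = (line.rstrip() for line in f_in)  # nothing has been read yet

try:
    list(lazy)  # the file was closed when the with block ended
    error = None
except ValueError as exc:
    error = exc

os.remove(path)
print(inside)
print(type(error).__name__)
```

So if you need the lines after the with block ends, build the list inside it.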
You can also do it in a one-liner (excluding the with statement), but it's no more efficient and harder to read:
with open(filename) as f_in:
    lines = list(line for line in (l.strip() for l in f_in) if line)
Update:
I agree that this is ugly because of the repetition of tokens. You could just write a generator if you prefer:
def nonblank_lines(f):
    for l in f:
        line = l.rstrip()
        if line:
            yield line
Then call it like:
with open(filename) as f_in:
    for line in nonblank_lines(f_in):
        pass  # Stuff
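For anyone who wants to try the generator without a file on disk, here is a small self-contained sketch. It assumes io.StringIO as a stand-in for the open file, and the sample names are only illustrative:

```python
import io


def nonblank_lines(f):
    """Yield each line of f with trailing whitespace removed, skipping blanks."""
    for raw in f:
        line = raw.rstrip()
        if line:
            yield line


# io.StringIO behaves like an open text file for iteration purposes.
sample = io.StringIO("Allman\n\nAtkinson\n \t \nBehlendorf\n")
names = list(nonblank_lines(sample))
print(names)  # ['Allman', 'Atkinson', 'Behlendorf']
```

Because it only iterates over f, the same function works on anything that yields strings, such as a list of lines.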
Update 2:
with open(filename) as f_in:
    lines = filter(None, (line.rstrip() for line in f_in))
And on CPython (with deterministic reference counting, so the file is closed once nothing refers to it any more) you can drop the with block:
lines = filter(None, (line.rstrip() for line in open(filename)))
In Python 2, use itertools.ifilter if you want a generator; in Python 3, just pass the whole thing to list if you want a list.
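A quick sketch of the Python 3 behaviour, assuming io.StringIO in place of a real file and made-up sample names:

```python
import io

sample = io.StringIO("Allman\n\nAtkinson\n\nBehlendorf\n")

# On Python 3, filter() returns a lazy iterator, not a list.
lazy = filter(None, (line.rstrip() for line in sample))
is_list = isinstance(lazy, list)

# filter(None, ...) drops falsy items, which here means the empty strings.
names = list(lazy)
print(is_list, names)
```

Passing None as the first argument is what makes filter keep only the truthy items.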
Solution 2:
You could use a list comprehension:
with open("names", "r") as f:
    names_list = [line.strip() for line in f if line.strip()]
Updated: Removed unnecessary readlines().
To avoid calling line.strip() twice, you can use a generator expression:
with open("names", "r") as f:
    names_list = [l for l in (line.strip() for line in f) if l]
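Another option, which is a different technique from the nested generator and only works on Python 3.8 or later, is an assignment expression. This is just a sketch: the helper name read_names and the sample data are made up, and io.StringIO stands in for the open file.

```python
import io


def read_names(f):
    # Hypothetical helper: strip each line once and keep it only if non-empty.
    return [name for line in f if (name := line.strip())]


sample = io.StringIO("Allman\n  \nAtkinson\n\nBehlendorf\n")
names_list = read_names(sample)
print(names_list)  # ['Allman', 'Atkinson', 'Behlendorf']
```

Whether this reads better than the nested generator is a matter of taste.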
Solution 3:
If you want, you can just put what you had in a list comprehension:
names_list = [line for line in open("names.txt", "r").read().splitlines() if line]
or
all_lines = open("names.txt", "r").read().splitlines()
names_list = [name for name in all_lines if name]
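splitlines() has already removed the line endings, so testing the line itself is enough to drop empty lines. Note that a line containing only spaces or tabs is still kept; test line.strip() instead if you need to drop those too.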
I don't think those are as clear as just looping explicitly though:
names_list = []
with open('names.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if line:
            names_list.append(line)
Edit:
Although filter looks quite readable and concise (on Python 3, wrap it in list() if you need a list rather than an iterator):
names_list = filter(None, open("names.txt", "r").read().splitlines())
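One thing worth checking with the splitlines()-based variants in this answer is what happens to lines that contain only spaces or tabs, since the question asks to skip those as well. A small sketch using an in-memory string (the sample text is made up):

```python
# Sample text with one empty line, one spaces-only line, and one tab-only line.
text = "Allman\n\n   \nAtkinson\n\t\nBehlendorf\n"
all_lines = text.splitlines()

# Truthiness test: drops "" but keeps "   " and "\t".
keeps_whitespace = [name for name in all_lines if name]

# strip() test: drops whitespace-only lines too.
drops_whitespace = [name for name in all_lines if name.strip()]

print(keeps_whitespace)
print(drops_whitespace)
```

The explicit loop above does not have this problem because it strips each line before testing it.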