Reading from a frequently updated file
I'm currently writing a program in python on a Linux system. The objective is to read a log file and execute a bash command upon finding a particular string. The log file is being constantly written to by another program.
My question: If I open the file using the open()
method will my Python file object be updated as the actual file gets written to by the other program or will I have to reopen the file at timed intervals?
UPDATE: Thanks for answers so far. I perhaps should have mentioned that the file is being written to by a Java EE app so I have no control over when data gets written to it. I've currently got a program that reopens the file every 10 seconds and tries to read from the byte position in the file that it last read up to. For the moment it just prints out the string that's returned. I was hoping that the file did not need to be reopened but the read command would somehow have access to the data written to the file by the Java app.
#!/usr/bin/python
import time
fileBytePos = 0
while True:
inFile = open('./server.log','r')
inFile.seek(fileBytePos)
data = inFile.read()
print data
fileBytePos = inFile.tell()
print fileBytePos
inFile.close()
time.sleep(10)
Thanks for the tips on pyinotify and generators. I'm going to have a look at these for a nicer solution.
I would recommend looking at David Beazley's Generator Tricks for Python, especially Part 5: Processing Infinite Data. It will handle the Python equivalent of a tail -f logfile
command in real-time.
# follow.py
#
# Follow a file like tail -f.
import time
def follow(thefile):
thefile.seek(0,2)
while True:
line = thefile.readline()
if not line:
time.sleep(0.1)
continue
yield line
if __name__ == '__main__':
logfile = open("run/foo/access-log","r")
loglines = follow(logfile)
for line in loglines:
print line,
"An interactive session is worth 1000 words"
>>> f1 = open("bla.txt", "wt")
>>> f2 = open("bla.txt", "rt")
>>> f1.write("bleh")
>>> f2.read()
''
>>> f1.flush()
>>> f2.read()
'bleh'
>>> f1.write("blargh")
>>> f1.flush()
>>> f2.read()
'blargh'
In other words - yes, a single "open" will do.
Here is a slightly modified version of Jeff Bauer answer which is resistant to file truncation. Very useful if your file is being processed by logrotate
.
import os
import time
def follow(name):
current = open(name, "r")
curino = os.fstat(current.fileno()).st_ino
while True:
while True:
line = current.readline()
if not line:
break
yield line
try:
if os.stat(name).st_ino != curino:
new = open(name, "r")
current.close()
current = new
curino = os.fstat(current.fileno()).st_ino
continue
except IOError:
pass
time.sleep(1)
if __name__ == '__main__':
fname = "test.log"
for l in follow(fname):
print "LINE: {}".format(l)