Remove first N lines from an active log file
Is there a way to remove the first N lines from a log that is being actively appended by an application?
No. Operating systems like Linux, and their filesystems, make no general provision for removing data from the start of a file. In other words, the start point of storage for a file is fixed.
Removing lines from the start of a file is usually accomplished by writing the remaining data to a new file and deleting the old. If a program has the old file open for writing, that file's deletion is postponed until the application closes the file.
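To see the postponed deletion for yourself, here is a minimal sketch. It assumes a POSIX shell with mktemp and tail available; the file name app.log and file descriptor 3 (standing in for the application's open log) are made up for illustration.

```shell
# Hypothetical stand-in for a log file, in a scratch directory.
tmpdir=$(mktemp -d)
log="$tmpdir/app.log"
printf 'line 1\nline 2\nline 3\n' > "$log"

# File descriptor 3 plays the role of the application holding the log open.
exec 3>>"$log"

# "Remove" the first two lines: write the remainder to a new file, then
# replace the old name. The old file loses its name but stays on disk
# for as long as descriptor 3 remains open.
tail -n +3 "$log" > "$log.new"
mv "$log.new" "$log"

# The "application" keeps writing to the old, now nameless, file.
echo 'line 4' >&3

# The file under the original name never receives the new record.
cat "$log"    # prints only: line 3

# Closing the descriptor is what finally lets the old file be freed.
exec 3>&-
```

The record 'line 4' is written successfully, but it lands in a file that no longer has a name, which is exactly why pruning has to be coordinated with the writer.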
As commenters noted, for the reasons given in my previous sentence, you usually need to coordinate logfile pruning with the programs that are writing the logs. Exactly how you do this depends on the programs. Some programs will close and reopen their logfiles when you send them a signal (e.g. HUP) and this can be used to prevent log records being written to a 'deleted' logfile, without disrupting service.
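As a rough sketch of the signal approach: rename the log first, then ask the program to reopen it. The helper name rotate_with_hup, the example path, and the PID file are all hypothetical, and this only works if the program really does reopen its log on HUP, so check its documentation first.

```shell
# Hypothetical helper: rotate_with_hup LOGFILE PID
# Assumes the process PID closes and reopens LOGFILE when it receives
# SIGHUP. Many daemons do, but this is not universal.
rotate_with_hup() {
    log=$1
    pid=$2
    # Rename first: the application keeps writing to the renamed file,
    # so no records are lost in the meantime.
    mv "$log" "$log.1" || return 1
    # Ask the application to reopen its log; it should then create a
    # fresh file under the original name.
    kill -HUP "$pid"
}

# Example call (path and PID file are made up):
# rotate_with_hup /var/log/myapp/app.log "$(cat /var/run/myapp.pid)"
```

Renaming before signalling matters: the old file stays reachable under its new name, so nothing is written to a 'deleted' file.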
There are many utilities available for managing the size of log files, for example, logrotate.
Some programs have their own utilities. For example, the Apache webserver includes a rotatelogs utility.
I think this task can be achieved with sed:
sed -i '1,10d' myfile
This removes lines 1 through 10 from the file.
I think everybody should at least have a look at these sed one-liners.
Note that this does not work for logfiles that are being actively appended to by an application (as stated in the question). sed -i will create a new file and 'delete' the file that is being written to. Most applications will continue to write log records to the deleted log file and will continue to fill disk space. The new, truncated, log file will not be appended to. This will only cease when the application is restarted or is otherwise signalled to close and reopen its log files, at which point there will be a gap (missing log records) in the new log file if there has been any loggable activity between the use of sed and the application restart.
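You can check the "new file" part by comparing inode numbers before and after. This is only a sketch and it assumes GNU sed (the -i form without a backup suffix) plus seq and awk; the scratch file name is made up.

```shell
# Scratch file with 15 numbered lines, standing in for a log.
tmpdir=$(mktemp -d)
log="$tmpdir/app.log"
seq 1 15 > "$log"

# Record the inode number, edit "in place", then look again.
inode_before=$(ls -i "$log" | awk '{print $1}')
sed -i '1,10d' "$log"
inode_after=$(ls -i "$log" | awk '{print $1}')

# The two numbers differ: the name now refers to a different file.
echo "before: $inode_before  after: $inode_after"
```

Any process that opened the log before the edit still holds the old inode, which is why its records never show up under the original name.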
A safe way to do this would be to halt the application, use sed to truncate the log, then restart the application. This approach can be unacceptable for some services (e.g. a web server with high throughput and high service-continuity requirements).
No. A solution to this generic problem of log file growth is log rotation. This involves the regular (nightly or weekly, typically) moving of an existing log file to some other file name and starting fresh with an empty log file. After a period the old log files get thrown away.
See: http://www-uxsup.csx.cam.ac.uk/~jw35/courses/apache/html/x1670.htm
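For illustration only, the renaming scheme can be sketched as a small shell function. The name rotate_logs and the numbered-suffix naming (app.log.1, app.log.2, ...) are my assumptions, not something a particular tool prescribes, and the application still has to be restarted or told to reopen its log afterwards.

```shell
# Hypothetical helper: rotate_logs LOGFILE KEEP
# Keeps at most KEEP old generations named LOGFILE.1 .. LOGFILE.KEEP.
rotate_logs() {
    log=$1
    keep=$2
    # Throw away the oldest generation.
    rm -f "$log.$keep"
    # Shift the remaining generations up by one, oldest first.
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -e "$log.$prev" ]; then
            mv "$log.$prev" "$log.$i"
        fi
        i=$prev
    done
    # Move the current log aside and start fresh with an empty file.
    if [ -e "$log" ]; then
        mv "$log" "$log.1"
    fi
    : > "$log"
}

# Example call (path is made up), e.g. from a nightly cron job:
# rotate_logs /var/log/myapp/app.log 7
```

In practice you would let logrotate or the program's own utility do this rather than maintain a script like this yourself.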
This is an answer, not a solution. There is NO solution to the question. The asker clearly states: "from a log that is being actively appended by an application". You can read on to understand more, and skip to the end for a suggestion I make based on my presumption about why this code isn't following logging best practices.
To be clear: other "answers" here offer a false promise. No amount of renaming will trick the application into using the new file. The most useful information is buried in the comments made to these incorrect answers.
ACTIVE files are not some kind of container you simply put data into. A filename points to ONE inode, which holds the file's metadata and the pointers to the blocks where its data is stored (with further levels of pointers if there is more data). That means a continually written-to file has a constant stream of blocks being added to it, and what you think of as a "file" is actually a long sequence of blocks tracked by its inode.
Imagine you were tracking someone on Google Maps, and that person could teleport anywhere in the world, at any time, and you were trying to connect these dots.
The Linux tool "truncate" can discard data at the end of the file: at the size you designate, the filesystem simply drops the pointers to all subsequent blocks. To do the reverse, discarding data at the start of the file, would mean shifting the offset of every remaining byte and rewriting the block pointers in real time while the application is still writing. That would be such a horribly complex and risky process that nobody will write such tools for the public, because they would often fail and lead to data loss. The Inodes wiki is short but explains some of these concepts.
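A quick way to see what "truncate" does, as a sketch that assumes the GNU coreutils truncate command and a made-up scratch file:

```shell
# Scratch file containing six bytes.
tmpdir=$(mktemp -d)
f="$tmpdir/app.log"
printf 'abcdef' > "$f"

# Keep the first 3 bytes and discard everything after them.
truncate -s 3 "$f"

cat "$f"    # prints: abc
```

The -s option takes the size to keep, counted from the start of the file, so there is no way to use it to drop the beginning instead.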
Back to your problem: this is probably an internal application (else someone would already have contributed a patch to fix it). Flag this behavior for code review, because it isn't following logging best practices. Explore the possible impacts: are you desperately trying to prevent an outage due to a full disk? That scenario should be documented somewhere in review as a risk.