How to force Logstash to reparse a file?

Solution 1:

By default, Logstash records the position it last read up to in a sincedb file, which usually resides in $HOME/.sincedb. You can fool Logstash into believing it has never parsed the log file by setting sincedb_path to /dev/null.

Here is the relevant part of the file input documentation:

Where to write the since database (keeps track of the current position of monitored log files). Defaults to the value of environment variable "$SINCEDB_PATH" or "$HOME/.sincedb".

Config Example

input {
    file {
        path => "/tmp/logfile_to_analyse"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}

Solution 2:

The file plugin stores its "tailing" history in a sincedb file, by default under $HOME/.sincedb*; see http://logstash.net/docs/1.3.3/inputs/file#sincedb_path

Each line of the sincedb file looks like this:

[inode] [major device number] [minor device number] [byte offset]

So, if you want to parse a complete file again, you need to:

  • delete the sincedb files
  • OR delete only the corresponding line in the sincedb file; look up the inode number of your file first (ls -i yourFile | awk '{print $1}' )
  • And restart Logstash
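The "delete only the corresponding line" option can be sketched as a small shell function. This is only a sketch: forget_file is a made-up helper name, and it assumes the four-column line format shown above. It is probably safest to run it while Logstash is stopped, since a running instance may write its own state back to the file.

```shell
#!/bin/sh
# forget_file LOGFILE SINCEDB  (hypothetical helper, not part of Logstash)
# Drops the sincedb line whose first column is LOGFILE's inode, so the
# recorded byte offset for that file is forgotten.
forget_file() {
    target=$1     # log file that should be reparsed
    sincedb=$2    # sincedb file that tracks it
    inode=$(ls -i "$target" | awk '{print $1}')
    # Keep every line whose first column is NOT the target's inode.
    awk -v inode="$inode" '$1 != inode' "$sincedb" > "$sincedb.tmp" &&
        mv "$sincedb.tmp" "$sincedb"
}
```

Usage would look like forget_file /tmp/logfile_to_analyse "$HOME/.sincedb_<hash>", followed by a Logstash restart.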

With the setting start_position => "beginning", Logstash will then read the whole file from the start.

Example of a sincedb file name:

  • name: .sincedb_7a7413a84171aa550d5318c17fd756e9: the name consists of .sincedb_ followed by an MD5 hash (Digest::MD5.hexdigest) of all the paths given under the path key (http://logstash.net/docs/1.3.3/inputs/file#path). See the code of the file plugin: https://github.com/logstash/logstash/blob/master/lib/logstash/inputs/file.rb#L105
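To find out which sincedb file belongs to a given input, the name can be recomputed by hand. The sketch below rests on an assumption: that the plugin version linked above hashes the configured paths joined with commas. Check this against the plugin source for your version. sincedb_name is a made-up helper, and md5sum is the GNU coreutils tool (other systems may ship md5 instead).

```shell
#!/bin/sh
# sincedb_name PATH...  (hypothetical helper, not part of Logstash)
# Prints the default sincedb file name for the given path settings,
# ASSUMING the plugin hashes the paths joined with commas.
sincedb_name() {
    # "$*" joins the arguments with the first character of IFS.
    joined=$(IFS=,; printf '%s' "$*")
    # printf '%s' avoids a trailing newline, which would change the hash.
    hash=$(printf '%s' "$joined" | md5sum | awk '{print $1}')
    printf '.sincedb_%s\n' "$hash"
}

sincedb_name /tmp/logfile_to_analyse
```

If the printed name matches a file in $HOME, that is most likely the one tracking your input.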

Solution 3:

Logstash keeps its records in $HOME/.sincedb_*. If you delete all the .sincedb files and restart Logstash, it will reparse the file.
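As a minimal sketch of that cleanup step: clear_sincedb is a made-up helper name, and the directory argument exists only so that it can be tried on a scratch directory first. Without an argument it falls back to $HOME.

```shell
#!/bin/sh
# clear_sincedb [DIR]  (hypothetical helper, not part of Logstash)
# Removes every .sincedb* file in DIR (default: $HOME), so that all
# recorded read positions are forgotten.
clear_sincedb() {
    dir=${1:-$HOME}
    # -f keeps rm quiet when no sincedb file exists.
    rm -f -- "$dir"/.sincedb*
}
```

Stop Logstash before calling it and start it again afterwards; how to do that depends on how Logstash is installed on your machine.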

Solution 4:

Combining all the answers, I think this is the best way to reparse files. I used the same configuration for my testing.

input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    ignore_older => 0
  }
}

For a quick test, instead of setting ignore_older, you can also touch /tmp/access_log to update the file's modification time.