filebucket configuration in Puppet

Solution 1:

Well, in a class that's included by all my nodes, I've got:

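filebucket { 'puppet': path => false, server => 'puppet.example.edu' }

The default in the File type is to back up to a local filebucket named "puppet". By redefining the "puppet" filebucket as a server filebucket, you get server-based backups by default. The path => false is there because the filebucket type documentation says a remote bucket requires it; otherwise path defaults to the local clientbucketdir.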

Alternatively, if you want to keep the local "puppet" filebucket around so that one specific file can still be pointed at it, you could do:

filebucket { 'main': path => false, server => 'puppet.example.edu' }
File { backup => 'main' }

See https://puppet.com/docs/puppet/latest/types/filebucket.html for more details on options.

This accomplishes item #1 because it tells all the nodes to use the same single server for the filebucket. Item #3 comes along for free because it all still goes over an SSL connection with certificate verification.

Filebucket is mostly useful for recovery, which is likely to happen the same day. In that case, look at the report and use the "filebucket" command (older versions) or "puppet filebucket" (newer versions) to retrieve the original content based on the md5sum in the report.
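
As a rough sketch of what that retrieval could look like: the wrappers below assume the "get" and "restore" subcommands and the --remote flag of "puppet filebucket" behave as documented for your Puppet version, and that the agent's configured server is the one holding the bucket. The function names, the path and the checksum are placeholders for illustration, not values from a real report.

```shell
# Sketch only: thin wrappers around `puppet filebucket`.
# The function names are hypothetical.

# Print the bucketed content for a checksum to stdout so it can be
# inspected before anything is overwritten.
show_from_bucket() {
    sum=$1
    puppet filebucket --remote get "$sum"
}

# Write the bucketed content for a checksum back to the given path.
restore_from_bucket() {
    path=$1
    sum=$2
    puppet filebucket --remote restore "$path" "$sum"
}

# Example (placeholder values; take the real md5sum from the report):
#   show_from_bucket d41d8cd98f00b204e9800998ecf8427e
#   restore_from_bucket /etc/motd d41d8cd98f00b204e9800998ecf8427e
```

Looking at the content with "get" first is the cautious route, since "restore" writes straight to the path you give it.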


Item #2 is where things get tricky...

I prune it with a script like this:

find /var/lib/puppet/clientbucket/ -type f -mtime +45 -atime +45 -delete

That removes anything that's older than 45 days and hasn't been accessed at all in that time. The 45 days comes from our backup retention policy: it's long enough for a backup with a long retention period to have captured the file, which gives us a theoretical 18-month recovery window.
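
Before letting a prune like that loose, it may be worth seeing what it would match. A minimal sketch, assuming a find that supports the same -mtime and -atime tests as the command above; bucket_prune_candidates is a made-up helper name, and the directory and day count are passed in rather than hard-coded.

```shell
# Sketch only: list what the prune would delete, without deleting anything.
# bucket_prune_candidates is a hypothetical helper name.
bucket_prune_candidates() {
    bucket_dir=$1
    days=$2
    # Same tests as the real prune, but -print instead of -delete.
    find "$bucket_dir" -type f -mtime +"$days" -atime +"$days" -print
}

# Example:
#   bucket_prune_candidates /var/lib/puppet/clientbucket/ 45
```

Two things to keep in mind: find counts in whole 24-hour periods, so +45 means more than 45 full days, and on a filesystem mounted noatime the -atime test won't reflect reads.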


What kind of parsing are you looking for? The bucket on the server is a directory hierarchy organized by md5sum. Inside the directory whose name matches the md5sum there are two files: "paths", which tells you which file the content came from, and "contents", which is the actual file. You need to look at the reports to see which system it came from.
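
If a flat index is all you're after, something along these lines might do. It only assumes the layout described above, a directory named for the md5sum with a "paths" file inside it; bucket_index is a hypothetical name and the bucket directory is whatever your server uses.

```shell
# Sketch only: print "md5sum<TAB>original path" for every entry in a bucket.
# bucket_index is a hypothetical helper name.
bucket_index() {
    bucket_dir=$1
    find "$bucket_dir" -type f -name paths | while IFS= read -r paths_file; do
        # The directory holding "paths" is named after the md5sum.
        sum=$(basename "$(dirname "$paths_file")")
        # "paths" may list more than one original path, one per line.
        while IFS= read -r original_path || [ -n "$original_path" ]; do
            printf '%s\t%s\n' "$sum" "$original_path"
        done < "$paths_file"
    done
}

# Example (pass your server's bucket directory):
#   bucket_index /path/to/bucket | sort -k2
```

That still won't tell you which node a file came from; for that you're back to the reports.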


I don't do any auditing. What kind of auditing are you looking for? That could mean many things.