Collecting and centralizing information from many servers, sort of like Puppet's facter

We happily manage hundreds of Red Hat Enterprise Linux servers with Puppet. One of the cool side effects is that we can go to /var/lib/puppet/yaml/facts and look at the output of the "facter" utility (part of Puppet) for every server.

Now I would like the same kind of convenience for more information, such as which services are running or deactivated, or the list of installed packages. I'm not quite talking about monitoring: I'm not so much interested in generating alerts or graphs from this data as in having it centralized for analysis.

I see two parts to doing this:

  • First, a mechanism for connecting the central repository to the clients. I remember that net-snmp already exposes the RPM database if allowed to do so; I guess it either exposes chkconfig data as well or could be made to.

  • Second, a tool to store said information.

Which tool could help with this? I'm looking for something that stores data in a convenient way, either SQL, YAML, XML or consistently formatted text files, and can be easily told who to talk to.
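
To make the two parts concrete, here is a minimal sketch of the client side, assuming each server runs a small Python script and ships the resulting JSON to the central host by whatever transport is convenient. The function names parse_chkconfig and collect are made up for illustration, and the parser assumes the usual "name 0:off 1:off 2:on ..." layout of `chkconfig --list`.

```python
import subprocess


def parse_chkconfig(text):
    """Parse `chkconfig --list` output into {service: {runlevel: bool}}."""
    services = {}
    for line in text.splitlines():
        fields = line.split()
        # SysV service lines look like "sshd 0:off 1:off 2:on 3:on ...".
        # Anything else (blank lines, the xinetd section) is skipped.
        if len(fields) < 2:
            continue
        pairs = [f.partition(":") for f in fields[1:]]
        if not all(lvl.isdigit() and state in ("on", "off")
                   for lvl, _, state in pairs):
            continue
        services[fields[0]] = dict((lvl, state == "on")
                                   for lvl, _, state in pairs)
    return services


def collect():
    """Gather service states and installed packages on the local host."""
    chk = subprocess.check_output(["chkconfig", "--list"],
                                  universal_newlines=True)
    rpm = subprocess.check_output(["rpm", "-qa"], universal_newlines=True)
    return {
        "services": parse_chkconfig(chk),
        "packages": sorted(rpm.split()),
    }
```

Dumping the result with json.dumps(collect(), indent=2, sort_keys=True) into one file per host would give the kind of consistently formatted text files mentioned above.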


You may also want to check out mcollective, which has been taken on by Puppet Labs as an official project. It does orchestration and allows real-time querying of your systems.

There are various screencasts and a simple plugin mechanism. It makes ad hoc questions easy and uses Puppet's RAL and facter; other plugins are available, and you can write your own. The screencasts show it in action.


For installed packages, net-snmp is probably the best way to go.

If you want a good interface for Puppet facts, you can try Foreman. The git version also has a REST API that you can use in scripts.

You could possibly write custom Puppet facts and access them via Foreman.
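
Custom facts proper are written in Ruby. As a rough sketch of the same idea in Python, the script below assumes a Facter version that supports external facts, that is, executables placed under /etc/facter/facts.d that print key=value lines. The fact names services_enabled and services_enabled_count, and the choice of runlevel 3, are only illustrative.

```python
#!/usr/bin/env python
import subprocess


def enabled_services(chkconfig_output, runlevel="3"):
    """Return the sorted names of services switched on at the given runlevel."""
    wanted = runlevel + ":on"
    names = []
    for line in chkconfig_output.splitlines():
        fields = line.split()
        if len(fields) > 1 and wanted in fields[1:]:
            names.append(fields[0])
    return sorted(names)


def format_facts(facts):
    """Render a dict as the key=value lines expected from an external fact."""
    return "\n".join("%s=%s" % (key, facts[key]) for key in sorted(facts))


def main():
    try:
        out = subprocess.check_output(["chkconfig", "--list"],
                                      universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        # chkconfig missing or failing: emit no facts rather than crash.
        return
    names = enabled_services(out)
    print(format_facts({
        "services_enabled": ",".join(names),
        "services_enabled_count": len(names),
    }))


if __name__ == "__main__":
    main()
```

Once the facts are reported they end up alongside the built-in ones, so they can be browsed the same way as any other fact.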