Munin aggregate graphs are not working
I know this has been asked several times on many forums before, but I am still stuck with a similar problem.
Individual graphs are working fine; aggregate graphs, however, are not. I don't even get an empty graph (a graph without data).
All the machines are Ubuntu 12.04 m1.medium EC2 instances. The Munin version is 1.4.6.
My munin.conf looks like this:
[localhost.localdomain]
    address 127.0.0.1
    use_node_name yes

[.us-west-1.compute.internal]
    address
    use_node_name yes

[.us-west-1.compute.internal]
    address
    use_node_name yes

[.us-west-1.compute.internal]
    address
    use_node_name yes

[us-west-1.compute.internal;totalcheckpoints]
    update no
    contacts no

    postgres_checkpoints_checkpoints_req.update no
    postgres_checkpoints_checkpoints_req.graph yes
    postgres_checkpoints_checkpoints_req.graph_args --base 1000 -l 0
    postgres_checkpoints_checkpoints_req.cdef 0
    postgres_checkpoints_checkpoints_req.graph_category PG Total Checkpoints
    postgres_checkpoints_checkpoints_req.graph_title Aggregated checkpoints
    postgres_checkpoints_checkpoints_req.graph_vlabel Total Checkpoints
    postgres_checkpoints_checkpoints_req.checkpoints_req_total.label Total checkpoints
    postgres_checkpoints_checkpoints_req.graph_order checkpoints_req_total
    postgres_checkpoints_checkpoints_req.checkpoints_req_total.sum \
        <internal_ip>.us-west-1.compute.internal:postgres_checkpoints_<internal_ip>.us-west-1.compute.internal_checkpoints_req.checkpoints_req \
        <internal_ip>.us-west-1.compute.internal:postgres_checkpoints_<internal_ip>.us-west-1.compute.internal_checkpoints_req.checkpoints_req \
        <internal_ip>.us-west-1.compute.internal:postgres_checkpoints_<internal_ip>.us-west-1.compute.internal_checkpoints_req.checkpoints_req
I have tried the following symlinks in /etc/munin/plugins:
postgres_checkpoints -> /usr/share/munin/plugins/postgres_checkpoints
postgres_checkpoints_ -> /usr/share/munin/plugins/postgres_checkpoints
postgres_checkpoints__ -> /usr/share/munin/plugins/postgres_checkpoints
Run as the munin user, the following Munin commands work fine, and I don't see anything obviously wrong in their output:
sudo su - munin -s /bin/bash
/usr/share/munin/munin-update --debug --nofork
/usr/share/munin/munin-graph --debug --nofork --nolazy
/usr/share/munin/munin-html --debug
telnet returns correct info for plugin postgres_checkpoints:
munin@hostname:~$ telnet 4949
Trying ...
Connected to .
Escape character is '^]'.
# munin node at internal-ip-of-munin-node.us-west-1.compute.internal
config postgres_checkpoints
graph_title PostgreSQL checkpoints
graph_vlabel Checkpoints / minute
graph_category PostgreSQL
graph_info Number of checkpoints per minute
graph_args --base 1000
graph_period minute
checkpoints_timed.label Timed checkpoints
checkpoints_timed.info Checkpoints started by timeout
checkpoints_timed.type DERIVE
checkpoints_timed.draw LINE1
checkpoints_req.label Requested checkpoints
checkpoints_req.info Checkpoints started by request
checkpoints_req.type DERIVE
checkpoints_req.draw STACK
.
fetch postgres_checkpoints
checkpoints_timed.value 2860
checkpoints_req.value 37
.
quit
Logs on the Munin master and the Munin nodes do not show any obvious errors. I have also verified that all hostnames are correct FQDNs everywhere.
Any ideas what I am missing?
I have checked many forums and links, but Server Fault does not allow me to paste more than two of the links I referred to:
1. http://munin-monitoring.org/wiki/aggregate_examples
2. http://blog.loftninjas.org/2010/04/08/an-evening-with-munin-graph-aggregation/
Thanks for your attention.
Solution 1:
I finally got it working. Munin is not that bad; all you need is to spend a couple of nights with it.
I misunderstood the documentation: you do not need to include the hostname in the plugin name. The plugin name should be exactly the same as on the Munin nodes. Also, the same plugin should exist on the Munin master, with a trailing "__" in the symlink name.
So the symlink in /etc/munin/plugins now looks like this:
postgres_checkpoints__ -> /usr/share/munin/plugins/postgres_checkpoints
And here is the new configuration. Note that the plugin name after the ":" doesn't have the hostname in it:
postgres_checkpoints_total.update no
pg_checkpoints.label Graph label
postgres_checkpoints_total.graph yes
postgres_checkpoints_total.graph_args --base 1000 -l 0
postgres_checkpoints_total.cdef 0
postgres_checkpoints_total.graph_category PG Total Checkpoints
postgres_checkpoints_total.graph_title Aggregated checkpoints
postgres_checkpoints_total.graph_vlabel Total Checkpoints
postgres_checkpoints_total.checkpoints_req_total.label Total Req checkpoints
postgres_checkpoints_total.checkpoints_timed_total.label Total Timed checkpoints
postgres_checkpoints_total.graph_order checkpoints_req_total checkpoints_timed
postgres_checkpoints_total.checkpoints_req_total.sum \
    <internal_ip>.us-est-1.compute.internal:postgres_checkpoints.checkpoints_req \
    <internal_ip>.us-west-1.compute.internal:postgres_checkpoints.checkpoints_req \
    <internal_ip>.us-west-1.compute.internal:postgres_checkpoints.checkpoints_req
postgres_checkpoints_total.checkpoints_timed_total.sum \
    <internal_ip>.us-west-.compute.internal:postgres_checkpoints.checkpoints_timed \
    <internal_ip>.us-west-1.compute.internal:postgres_checkpoints.checkpoints_timed \
    <internal_ip>.us-west-1.compute.internal:postgres_checkpoints.checkpoints_timed
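The hostname mistake from the question (the node name repeated inside the plugin part of a host:plugin.field reference) is easy to miss by eye. The following is only a rough sketch of a check for it, not something from the Munin distribution: check_refs is a hypothetical helper name, and the node name in the demo input is a placeholder.

```shell
#!/bin/sh
# Sketch only: check_refs is a hypothetical helper, not part of Munin.
# It reads munin.conf-style text on stdin and prints every
# host:plugin.field reference whose plugin part contains the host name
# again, which is the mistake made in the question's configuration.
check_refs() {
    awk '{
        for (i = 1; i <= NF; i++) {
            n = index($i, ":")
            if (n == 0) continue            # not a host:plugin.field token
            host = substr($i, 1, n - 1)
            rest = substr($i, n + 1)
            if (host != "" && index(rest, host) > 0) print $i
        }
    }'
}

# Demo with a placeholder node name (an assumption, not from the post).
check_refs <<'EOF'
graph.total.sum \
    node1.example.internal:postgres_checkpoints_node1.example.internal_checkpoints_req.checkpoints_req \
    node1.example.internal:postgres_checkpoints.checkpoints_req
EOF
```

With the demo input, only the first reference is printed, because its plugin part embeds the node name. Against a real setup the function could be fed the master's configuration, for example with check_refs < /etc/munin/munin.conf.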
Also, please note that the above configuration now aggregates two fields: checkpoints_req and checkpoints_timed.
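Since each aggregated field needs its own .sum directive listing every node, the stanzas can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: sum_lines is a hypothetical helper, the node names are placeholders, and the "_total" suffix simply mirrors the field naming used in the configuration above.

```shell
#!/bin/sh
# Sketch only: sum_lines is a hypothetical helper, not part of Munin.
# usage: sum_lines GRAPH FIELD PLUGIN HOST...
# Prints "GRAPH.FIELD_total.sum" followed by one host:plugin.field
# reference per host, joined with backslash line continuations.
sum_lines() {
    graph=$1
    field=$2
    plugin=$3
    shift 3
    printf '%s.%s_total.sum' "$graph" "$field"
    for host in "$@"; do
        # The plugin name is used as-is, with no hostname in it.
        printf ' \\\n    %s:%s.%s' "$host" "$plugin" "$field"
    done
    printf '\n'
}

# Placeholder node names (assumptions, not the hosts from the post).
for f in checkpoints_req checkpoints_timed; do
    sum_lines postgres_checkpoints_total "$f" postgres_checkpoints \
        node1.example.internal node2.example.internal
done
```

The output is meant to be pasted under the aggregate section of munin.conf on the master; the remaining graph_* and label directives still have to be written separately.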