Monitoring MongoDB 3.2 using Stackdriver in Google Compute Engine fails silently

I'm having a problem monitoring MongoDB 3.2 with Stackdriver as of 28 Aug 2016.

There is no mention of mongo whatsoever in /var/log/syslog. However, if I introduce a configuration error in the .conf file, the agent complains, so I know it is loading the file correctly.

So there are no errors, but also no mention of mongo in /var/log/syslog, and https://app.google.stackdriver.com/services/mongodb claims I haven't installed the agent.

gke-fatih-standard-fb894cbb-d7ue:/opt/stackdriver/collectd/etc$ sudo service stackdriver-agent restart
[....] Restarting Stackdriver metrics collection agent: stackdriver-agentoption = Interval; value = 60.000000;
Created new plugin context.
option = Interval; value = 60.000000;
Created new plugin context.
option = PIDFile; value = /var/run/stackdriver-agent.pid;
option = Interval; value = 60.000000;
Created new plugin context.
. ok

$ tail -F /var/log/syslog
Aug 28 06:53:01 gke-fatih-standard-fb894cbb-d7ue /USR/SBIN/CRON[21824]: (root) CMD (/etc/supervisor/supervisor_watcher.sh 2>&1 | logger)
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21844]: type = syslog, key = LogLevel, value = info
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21844]: write_gcm: inside module_register for stackdriver_agent/5.5.0-340.wheezy
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21845]: type = syslog, key = LogLevel, value = info
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21845]: write_gcm: inside module_register for stackdriver_agent/5.5.0-340.wheezy
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: Initialization complete, entering read-loop.
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: match_throttle_metadata_keys: 1 history entries, 1 distinct keys, 78 bytes server memory.
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: tcpconns plugin: Reading from netlink succeeded. Will use the netlink method from now on.
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: write_gcm: Asking metadata server for auth token
Aug 28 06:53:04 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: match_throttle_metadata_keys: 2 history entries, 1025 distinct keys, 102801 bytes server memory.

Note that the instance/node itself is monitored correctly; only MongoDB is problematic.
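
The syslog check described above can be scripted. This is only a rough sketch: count_plugin_mentions is a hypothetical helper name (not part of the Stackdriver agent), and it assumes collectd lines are tagged "collectd[PID]" as in the output shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count collectd log lines that mention a plugin name
# (case-insensitive). Prints 0 when there are none.
count_plugin_mentions() {
    local plugin="$1" logfile="$2"
    # Keep only lines logged by collectd, then count those naming the plugin.
    # grep -c exits non-zero on zero matches, so "|| true" keeps the status clean.
    grep 'collectd\[' "$logfile" | grep -ci -- "$plugin" || true
}

# Example usage (reading /var/log/syslog may need sudo on some systems):
# count_plugin_mentions mongo /var/log/syslog
```

A result of 0 for mongo, alongside a non-zero count for something like write_gcm, matches what I am seeing: the agent is logging, but nothing from the MongoDB plugin shows up.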

/opt/stackdriver/collectd/etc/collect.d/mongo0.conf :

# scheduled to node: gke-fatih-standard-fb894cbb-d7ue
# This is the monitoring configuration for MongoDB.
# Look for STATS_USER, STATS_PASS, MONGODB_HOST and MONGODB_PORT to adjust your configuration file.
LoadPlugin mongodb
<Plugin "mongodb">
    # When using non-standard MongoDB configurations, replace the below with
    #Host "MONGODB_HOST"
    #Port "MONGODB_PORT"
    # Must use the load balancer because we don't know the fixed nodePort
    Host "xxx"
    Port "27017"

    # If you restricted access to the database, you can set the username and
    # password here:
    User "stats"
    Password "xxx"
</Plugin>
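
To rule out a plain connectivity problem between the node and the Host/Port configured above, something like the following could be run on the node. It is a minimal sketch under stated assumptions: conf_value and tcp_reachable are hypothetical helper names, the check relies on bash's /dev/tcp redirection and coreutils timeout, and it only tests that the TCP port accepts connections, not that the stats credentials work.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of an uncommented option (e.g. Host,
# Port) from a collectd plugin config file, with double quotes stripped.
conf_value() {
    local key="$1" file="$2"
    awk -v key="$key" '$1 == key { gsub(/"/, "", $2); print $2; exit }' "$file"
}

# Hypothetical helper: return 0 if a TCP connection to host:port succeeds
# within 5 seconds. Uses bash's built-in /dev/tcp pseudo-device.
tcp_reachable() {
    local host="$1" port="$2"
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage with the config path from this question:
# conf=/opt/stackdriver/collectd/etc/collect.d/mongo0.conf
# if tcp_reachable "$(conf_value Host "$conf")" "$(conf_value Port "$conf")"; then
#     echo "MongoDB port reachable"
# else
#     echo "MongoDB port NOT reachable"
# fi
```

Commented-out lines such as #Host "MONGODB_HOST" are skipped because their first field is "#Host", not "Host".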

Related to Monitoring MongoDB 3 using StackDriver in GCE


Solution 1:

Google is deprecating their non-GCP focused Stackdriver integrations (like Mongo) and moving to the BindPlane MIaaS platform as their supported monitoring integrations platform for non-GCP datasources.

More details can be found here:

https://cloud.google.com/monitoring/agent/plugins/bindplane-transition

and here:

https://bluemedora.com/how-to-monitor-mongodb-bindplane-for-stackdriver-blue-medora/

Solution 2:

After running sudo service stackdriver-agent restart again (which I had already done before), and roughly 30 minutes after the original incident, the metrics are now detected by Stackdriver.

So if you're sure you've done everything right and there are no errors, try restarting stackdriver-agent a few times and waiting about 30 minutes.

The lack of anything mongo-related in /var/log/syslog is still an issue, which I hope @Corey-Kosak can give more information about.