Graphite shows "None" for all data points even though I send it data

I have installed Graphite via Puppet (https://forge.puppetlabs.com/dwerder/graphite) with nginx and PostgreSQL. When I send it data manually, it creates the metric but all its data points are "None" (a.k.a. null). This also happens if I run the example-client.py shipped with Graphite.

echo "jakub.test 42 $(date +%s)" | nc 0.0.0.0 2003 # Carbon listens at 2003
# A minute or so later:
$ whisper-fetch.py --pretty /opt/graphite/storage/whisper/jakub/test.wsp | head -n1
Sun May  4 12:19:00 2014    None
$ whisper-fetch.py --pretty /opt/graphite/storage/whisper/jakub/test.wsp | tail -n1
Mon May  5 12:09:00 2014    None
$ whisper-fetch.py --pretty /opt/graphite/storage/whisper/jakub/test.wsp | grep -v None | wc -l
0

And:

$ python /opt/graphite/examples/example-client.py 
# Wait until it sends two batches of data ...
$ whisper-fetch.py /opt/graphite/storage/whisper/system/loadavg_15min.wsp | grep -v None | wc -l
0

According to ngrep, this is the data that arrives at the port [from a later attempt] (line 3):

####
T 127.0.0.1:34696 -> 127.0.0.1:2003 [AP]
  jakub.test  45 1399362193. 
####^Cexit
23 received, 0 dropped

This is the relevant part of /opt/graphite/conf/storage-schemas.conf:

[default]
pattern = .*
retentions = 1s:30m,1m:1d,5m:2y

Any idea what is wrong? Carbon's own metrics and data are displayed in the UI. Thank you!

Environment: Ubuntu 13.10 Saucy, graphite 0.9.12 (via pip).

PS: I have written about my troubleshooting attempts here - Graphite Shows Metrics But No Data – Troubleshooting

UPDATE:

  1. Data points in whisper files are only recorded every 1 min, even if the retention policy specifies a higher precision such as "1s" or "10s".
  2. Workaround for data being ignored: either use an aggregation schema with xFilesFactor = 0.1 (instead of 0.5) or set the lowest precision to 1m instead of <number between 1-49>s. See the comments below the accepted answer or the Graphite Answers question. According to the docs: "xFilesFactor should be a floating point number between 0 and 1, and specifies what fraction of the previous retention level’s slots must have non-null values in order to aggregate to a non-null value. The default is 0.5." So it seems that, regardless of the specified precision of 1s, the data gets aggregated to 1 minute and ends up being None because less than 50% of the values in the minute period are non-None.
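To make the xFilesFactor rule quoted above concrete, here is a minimal Python sketch. It is not Whisper's actual code: the function name `aggregate` and the averaging method are assumptions chosen for illustration, and only the "fraction of non-null slots" check follows the quoted documentation.

```python
from typing import Optional, Sequence


def aggregate(points: Sequence[Optional[float]],
              x_files_factor: float = 0.5) -> Optional[float]:
    """Average the known points, or return None if too few slots are filled.

    Hypothetical helper: mimics the documented xFilesFactor rule only.
    """
    known = [p for p in points if p is not None]
    if not known:
        return None
    # Fraction of slots in the finer archive that hold a value.
    if len(known) / len(points) < x_files_factor:
        return None
    return sum(known) / len(known)


# One minute of 1s slots with a value sent only every 10s: 6 of 60 slots filled.
minute = [42.0 if second % 10 == 0 else None for second in range(60)]

print(aggregate(minute, x_files_factor=0.5))  # None (6/60 is below 0.5)
print(aggregate(minute, x_files_factor=0.1))  # 42.0 (6/60 meets 0.1)
```

With one value every 10 seconds, only 10% of the 1s slots are filled, so the default factor of 0.5 turns the whole minute into None while 0.1 keeps it.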

SOLUTION

So @jlawrie led me to the solution. It turns out the data are actually there but are aggregated to nothing, for two reasons:

  1. Both the UI and whisper-fetch show data from the highest-precision archive that spans the whole query period, which defaults to 24h. I.e. anything with retention < 1d will never show in the UI or fetch unless you select a shorter period. Since my retention period for 1s was 30 min, I'd need to select a period of <= the last 30 min to actually see the raw data at the highest precision being collected.
  2. When aggregating data (from 1s to 1 min in my case), Graphite requires by default that 50% (xFilesFactor = 0.5) of the data points in the period have a value. If not, it will ignore the existing values and aggregate them to None. So in my case I'd need to send data at least 30 times within a minute (30 is 50% of 60s = 1 min) for them to show up in the aggregated 1-min value. But my app only sends data every 10s, so I only have 6 out of the possible 60 values.

=> The solution is to change the first precision from 1s to 10s and to remember to select a shorter period when I want to see the raw data (or extend its retention to 24h to show it by default).
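The archive selection described in point 1 can be sketched as follows. This is a simplified model, not Whisper's implementation: `to_seconds`, `parse_retentions` and `pick_archive` are hypothetical helper names, and the parser only handles the simple `<number><unit>` form used in the schema above.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(text):
    """Convert a string such as '30m' or '2y' to seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def parse_retentions(spec):
    """Turn '1s:30m,1m:1d' into a list of (precision, retention) in seconds."""
    archives = []
    for part in spec.split(","):
        precision, retention = part.strip().split(":")
        archives.append((to_seconds(precision), to_seconds(retention)))
    return archives


def pick_archive(archives, query_seconds):
    """Return the first (finest) archive whose retention covers the query period."""
    for precision, retention in archives:
        if retention >= query_seconds:
            return precision, retention
    return archives[-1]  # Query is longer than everything: use the coarsest.


archives = parse_retentions("1s:30m,1m:1d,5m:2y")
print(pick_archive(archives, to_seconds("1d")))   # (60, 86400): the 1m archive
print(pick_archive(archives, to_seconds("30m")))  # (1, 1800): the raw 1s archive
```

Under this model the default 24h query never touches the 1s archive, which matches what I saw: the raw points only appear once the query period is 30 minutes or less.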


I encountered the same issue using that same Puppet module. I'm not exactly sure why, but changing the default retention policy appears to fix it, e.g.

class { 'graphite':
  gr_storage_schemas => [
    {
      name       => 'carbon',
      pattern    => '^carbon\.',
      retentions => '1m:90d'
    },
    {
      name       => 'default',
      pattern    => '.*',
      retentions => '1m:14d'
    }
  ],
}

There are many ways that Graphite will lose data, which is why I really try to avoid using it. Let me start with a simple one: try having your application connect, wait a second (literally one second) and then output the timestamped data. I've found that in many circumstances this will fix that exact problem. Another thing you should try is submitting data at a frequency that is much higher than the frequency at which Graphite logs data. I'll go into that a bit more below. Another frequent mistake is using the whisper-resize.py utility, which really didn't work for me. If your data isn't important yet, just delete the whisper files and let them get created with the new retention settings.
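A minimal Python sketch of the "connect, wait a second, then send" suggestion, assuming Carbon's plaintext listener on port 2003 as in the question. The function names, the default host and the `pause` parameter are hypothetical, and whether the pause actually helps is only my experience, not something I can prove.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext line: '<path> <value> <unix timestamp>' plus newline."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="127.0.0.1", port=2003, pause=1.0):
    """Connect, wait `pause` seconds, then send a single data point."""
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        time.sleep(pause)  # The suggested one-second wait before writing.
        sock.sendall(line.encode("ascii"))
```

Calling `send_metric("jakub.test", 42)` on the Graphite host would send the same kind of line as the `echo ... | nc` command in the question.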

Graphite's storage files, the whisper files, do not store each data point as a value with a time (the way you sent it to the program). Instead, each file holds a series of slots that values get stored in, and the program works out which slot corresponds to a given time using the retention settings. If it gets a data point that doesn't fit exactly in a slot, I think it uses an average, min, or max, depending on another file in the same directory as the retention file. I found that the best way to keep that from messing everything up was to submit data at a frequency that was much higher than the frequency at which Graphite was storing data. It honestly gets super complicated: not only are there retention periods for Graphite, and averaging algorithms that fill points (I think), but these values are ALSO applied to the whisper files when they are created. Very odd things will happen when the two don't match, so until your config is working I would suggest deleting your whisper files repeatedly and letting Graphite recreate them.
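Here is a rough Python sketch of the slot idea, under the assumption that a timestamp is rounded down to the start of its slot and that a later write to the same slot replaces the earlier one. The `Archive` class and `slot_start` function are hypothetical, and how real whisper files combine values may differ from this simplification.

```python
def slot_start(timestamp, seconds_per_point):
    """Round a timestamp down to the beginning of its slot."""
    return int(timestamp - (timestamp % seconds_per_point))


class Archive:
    """Toy stand-in for one whisper archive with a fixed slot width."""

    def __init__(self, seconds_per_point):
        self.seconds_per_point = seconds_per_point
        self.slots = {}  # slot start time -> value

    def write(self, timestamp, value):
        # Assumption: the newest value for a slot overwrites the previous one.
        self.slots[slot_start(timestamp, self.seconds_per_point)] = value


archive = Archive(seconds_per_point=60)
archive.write(1399362193, 45)
archive.write(1399362199, 46)  # Same minute as the first write.
archive.write(1399362245, 47)  # Next minute.
print(archive.slots)  # {1399362180: 46, 1399362240: 47}
```

In this model three points sent within about a minute leave only two stored values, which is why the sending frequency and the slot width need to be considered together.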

This program really struck me as acting fairly buggy, so if you encounter something like this don't assume that it's your fault.