
TL;DR: Graphite graphs of counters look wrong when you view a time period longer than a few hours. hitcount is the cure.

It turns out counters are bucketed as hits per time slice, not as ever-incrementing totals.
http://code.hootsuite.com/accurate-counting-with-graphite-and-statsd/ helps a lot to explain how this works, including why the numbers look so different from what you expect when you view a counter over many hours or days. The short version: when a graph covers more datapoints than it can draw, Graphite by default averages neighbouring slices together instead of adding them up.
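
To make that concrete, here is a minimal Python sketch of the effect. It is a toy model, not Graphite's code: the 10-second slice length and the hit counts are made up, and consolidate is a hypothetical helper standing in for what happens when many datapoints get squeezed into one plotted point.

```python
# Simplified model: one datapoint per 10-second slice, each holding the
# number of hits counted in that slice (not a running total).
SLICE_SECONDS = 10
hits_per_slice = [2] * 360  # one hour of made-up data, 2 hits per slice


def consolidate(values, points_per_group, func):
    """Collapse consecutive groups of datapoints into one value each."""
    return [
        func(values[i:i + points_per_group])
        for i in range(0, len(values), points_per_group)
    ]


def average(group):
    return sum(group) / len(group)


# Squeeze the whole hour into a single plotted point, both ways.
averaged = consolidate(hits_per_slice, 360, average)
summed = consolidate(hits_per_slice, 360, sum)

print(averaged)  # [2.0]  what an averaged graph shows
print(summed)    # [720]  what you probably expected to see
```

The averaged point still says "2", which is true per slice but looks absurdly low if you read it as hits for the hour.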

To work around this, wrap the stat in a hitcount call.

This example tells me how many posts failed to publish, per hour, across all publishing platforms:

hitcount(sum(stats.posts.by_platform.stuck.*), "1hours")
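
Roughly what hitcount does with those numbers, as I understand it: it treats each value as hits per second, multiplies by how long the datapoint covers, and adds the results up per bucket. The Python below is only a sketch under that assumption. hitcount_sketch is a hypothetical helper, the rates are invented, and it ignores details such as buckets that do not line up with the data.

```python
def hitcount_sketch(rates_per_second, step_seconds, bucket_seconds):
    """Turn per-second rates into estimated hit counts per bucket.

    Simplified: assumes bucket_seconds is a multiple of step_seconds and
    treats missing datapoints (None) as zero hits.
    """
    points_per_bucket = bucket_seconds // step_seconds
    counts = []
    for start in range(0, len(rates_per_second), points_per_bucket):
        bucket = rates_per_second[start:start + points_per_bucket]
        counts.append(sum((rate or 0) * step_seconds for rate in bucket))
    return counts


# Two hours of made-up data at a 60-second step:
# 0.05 failures/second in the first hour, 0.1 in the second.
rates = [0.05] * 60 + [0.1] * 60
per_hour = hitcount_sketch(rates, step_seconds=60, bucket_seconds=3600)
print([round(c) for c in per_hour])  # [180, 360]
```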

Get a Ratio of Two Stats!

For example, we increment a "published" stat every time we publish a post, and a "stuck" stat every time we fail to publish (after giving up on retries). divideSeries does the job. This is currently my most useful graph: the fraction of all posts that got stuck, per hour. The code is formatted for readability:

  divideSeries(
    hitcount(
      sum(stats.posts.by_platform.stuck.*),
      "1hours"),
    hitcount(
      sum(stats.posts.by_platform.*.*),
      "1hours")
  )

Or, to convert that to a percentage:

scale(
  divideSeries(
    hitcount(
      sum(stats.posts.by_platform.stuck.*),
      "1hours"),
    hitcount(
      sum(stats.posts.by_platform.*.*),
      "1hours")
  ),
  100.0
)
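
If it helps to see the arithmetic, here is a small Python sketch of the ratio and percentage steps applied to hourly counts. The numbers are invented, divide_sketch and scale_sketch are hypothetical stand-ins rather than Graphite's implementation, and I am assuming an hour with a zero or missing total should come out as a gap instead of an error.

```python
def divide_sketch(dividend, divisor):
    """Point-by-point ratio; None where the divisor is missing or zero."""
    return [
        None if a is None or not b else a / b
        for a, b in zip(dividend, divisor)
    ]


def scale_sketch(series, factor):
    """Multiply every datapoint by factor, leaving gaps (None) alone."""
    return [None if v is None else v * factor for v in series]


# Made-up hourly hit counts: stuck posts and all posts.
stuck_per_hour = [3, 0, 5, 2]
total_per_hour = [120, 80, 0, 50]

ratio = divide_sketch(stuck_per_hour, total_per_hour)
percent = scale_sketch(ratio, 100.0)
print(percent)
```

The third hour has a total of zero, so it shows up as a gap rather than blowing up the graph.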
