
TL;DR: Graphite graphs of counters look wrong when you view a time period longer than a few hours. hitcount is the cure.

It turns out counters are bucketed as hits per time slice, not stored as ever-incrementing totals.
http://code.hootsuite.com/accurate-counting-with-graphite-and-statsd/ does a good job of explaining how this works, including why the numbers start to look very different from what you expect when you view a counter over many hours or days.

To work around this, wrap the stat in a hitcount call.

This example tells me how many posts failed to publish, per hour, across all publishing platforms:

hitcount(sum(stats.posts.by_platform.stuck.*), "1hours")
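To see why the raw counter misleads over long ranges and what hitcount changes, here is a rough Python model of the two behaviours. It is a sketch, not Graphite's actual code: it assumes the stored values are hits per second at a 10-second step, and that a long time range is drawn by averaging neighbouring points. The function names, step size, and sample data are all made up for illustration.

```python
def average_consolidate(values, factor):
    """Average each run of `factor` consecutive points into one point.

    Roughly what happens when many stored points share one pixel.
    """
    chunks = [values[i:i + factor] for i in range(0, len(values), factor)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def hitcount(rates, step_seconds, interval_seconds):
    """Turn per-second rates into total hits per interval.

    Each point covers `step_seconds`, so it stands for rate * step hits.
    """
    per_interval = interval_seconds // step_seconds
    chunks = [rates[i:i + per_interval] for i in range(0, len(rates), per_interval)]
    return [sum(rate * step_seconds for rate in chunk) for chunk in chunks]


# Hypothetical data: two hours of 10-second points,
# 0.5 hits/sec during the first hour and 0.25 hits/sec during the second.
STEP = 10
rates = [0.5] * 360 + [0.25] * 360

averaged = average_consolidate(rates, 360)   # [0.5, 0.25]
hourly_hits = hitcount(rates, STEP, 3600)    # [1800.0, 900.0]

print(averaged)
print(hourly_hits)
```

The averaged view stays at a fraction of a hit no matter how wide the window gets, while the hitcount-style sum gives the hourly totals you were probably expecting.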

Get a Ratio of Two Stats!

For example, we increment a "published" stat every time we publish a post, and a "stuck" stat every time we fail to publish (after giving up on retries). divideSeries does the job. This is currently my most useful graph: the fraction of all posts that got stuck, per hour (formatted here for readability):

divideSeries(
  hitcount(
    sum(stats.posts.by_platform.stuck.*),
    "1hours"),
  hitcount(
    sum(stats.posts.by_platform.*.*),
    "1hours")
)

Or, to convert that to a percentage:

scale(
  divideSeries(
    hitcount(
      sum(stats.posts.by_platform.stuck.*),
      "1hours"),
    hitcount(
      sum(stats.posts.by_platform.*.*),
      "1hours")
  ),
  100.0
)
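The arithmetic behind that expression can be sketched in a few lines of Python. This is only an illustration with invented hourly counts, not Graphite's implementation. In particular, returning None for an hour with no posts at all is an assumption about how a zero denominator should be treated; check how your Graphite version renders those points.

```python
def divide_series(dividend, divisor):
    """Point-by-point ratio; None where the divisor is missing or zero."""
    return [
        None if a is None or not b else a / b
        for a, b in zip(dividend, divisor)
    ]


def scale(series, factor):
    """Multiply every known point by `factor`, leaving gaps (None) alone."""
    return [None if v is None else v * factor for v in series]


# Hypothetical hourly hit counts, as hitcount(..., "1hours") might yield.
stuck_per_hour = [3, 0, 5, 16]
all_per_hour = [96, 80, 0, 64]

ratio = divide_series(stuck_per_hour, all_per_hour)  # [0.03125, 0.0, None, 0.25]
percent = scale(ratio, 100.0)                        # [3.125, 0.0, None, 25.0]

print(ratio)
print(percent)
```

Note that the ratio is computed per hour from the hourly totals. That is why both sides are wrapped in hitcount with the same interval before dividing.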
