The Curious Case of the StatsD Timer

August 04 | By Mark | Posted in HowTo


Instrumenting our application with StatsD is easy, especially when we stick to Counters and Gauges. Each of these metrics reports just a single value. When you get to Timers, however, StatsD steps up its game and returns eight metrics.

So let's explore the curious case of the timing metric. What do all these metrics mean? How can we use this for instrumenting our application?

Understanding the Timer

To start, let's assume our data set is:

   [1, 2, 3]

Of the eight metrics generated, the first five are self-explanatory but worth a review:

  • *.count: The number of items processed (e.g. 3)
  • *.max: The largest value (e.g. 3)
  • *.min: The smallest value (e.g. 1)
  • *.sum: The total of all items (e.g. 6)
  • *.mean: The average of the items (e.g. 2)
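To make those five numbers concrete, here is a minimal Ruby sketch that computes them for our sample data set. The `basic_timer_summary` helper is made up for illustration (it is not part of StatsD or any client library), and it assumes a non-empty list of numbers.

```ruby
# Hypothetical helper: summarizes a non-empty list of values the way
# the five basic Timer metrics are described above.
def basic_timer_summary(values)
  {
    count: values.size,                   # number of items processed
    max:   values.max,                    # largest value
    min:   values.min,                    # smallest value
    sum:   values.sum,                    # total of all items
    mean:  values.sum.to_f / values.size  # average of the items
  }
end

summary = basic_timer_summary([1, 2, 3])
# count is 3, max is 3, min is 1, sum is 6, mean is 2.0
p summary
```

StatsD does this aggregation for you on the server side; the sketch only shows what each metric name stands for.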

The last three metrics are:

  • *.sum_95: The sum of values up to the 95th percentile
  • *.upper_95: The upper value of the 95th percentile group
  • *.mean_95: The average of values up to the 95th percentile

The 95th percentile group is the lowest 95% of the values, so the largest 5% are left out. For example, with a sample size of 20 items, it would be the lowest 19 values.
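Here is a rough Ruby sketch of how the three percentile metrics could be derived. It assumes the behavior of the reference StatsD server, which sorts the values in ascending order, keeps a rounded 95% of them starting from the smallest, and drops the rest. The `percentile_summary` helper is a made-up name for illustration, and other aggregators may round differently.

```ruby
# Hypothetical helper: summarizes the lowest `percentile` percent of values.
def percentile_summary(values, percentile = 95)
  sorted = values.sort
  # How many values fall inside the percentile group (rounded).
  keep = (sorted.size * percentile / 100.0).round
  group = sorted.first(keep)
  return nil if group.empty?

  {
    "sum_#{percentile}"   => group.sum,                   # sum of the kept values
    "upper_#{percentile}" => group.last,                  # largest kept value
    "mean_#{percentile}"  => group.sum.to_f / group.size  # average of the kept values
  }
end

# Twenty samples: 1, 2, ..., 20. The largest value (20) is left out.
result = percentile_summary((1..20).to_a)
# sum_95 is 190, upper_95 is 19, mean_95 is 10.0
p result
```

With only three samples, as in our [1, 2, 3] data set, rounding keeps all three values, so the percentile metrics match the plain sum, max, and mean.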

Using the Timer metric

It's been said before: one of the hardest things in programming is naming, and unfortunately, the Timing metric falls victim to this. As an example, consider this code:

 User.all.each do |user|
   statsd.timing 'user_logins', user.login_count
 end

We're looping through each of our users and reporting how many times each one has logged in. Run this in a cron job once a day and on your dashboard you'll have these metrics available:

  • A count of the users reported (e.g. user_logins.count)
  • The most logins (e.g. user_logins.max)
  • The least logins (e.g. user_logins.min)
  • Total # of logins (e.g. user_logins.sum)
  • Average # of logins (e.g. user_logins.mean)

As you can see, it doesn't really have anything to do with "timing" - but it does give us a powerful way to summarize our data.
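One way to see why is to look at what goes over the wire. In the StatsD plain-text protocol a timer is sent as `name:value|ms`, and the server treats the value as a plain number. The `timing_datagram` helper below is a made-up name for illustration; a real client library builds and sends this string for you.

```ruby
# Hypothetical helper: builds the plain-text datagram a StatsD client
# sends for a timer. The "ms" suffix only marks the metric type.
def timing_datagram(name, value)
  "#{name}:#{value}|ms"
end

puts timing_datagram("user_logins", 12)
# prints "user_logins:12|ms"
```

Nothing in that line says the 12 has to be a duration, which is why a login count works just as well.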

We can use the same pattern across all business metrics:

Revenue? Add a timing metric to an Order object that has a price method, and let the timing metric do all the calculations for you.
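As a rough sketch of the revenue idea, the snippet below reports each order's price as a timing value. Everything named here is an assumption for illustration: the `Order` struct, the `order_revenue` metric name, and the `RecordingStatsd` class, which stands in for a real StatsD client so the example runs without a server.

```ruby
# Stand-in for a real StatsD client: records calls instead of sending them.
class RecordingStatsd
  attr_reader :sent

  def initialize
    @sent = []
  end

  # Same shape as the timing call used earlier: a metric name and a number.
  def timing(name, value)
    @sent << [name, value]
  end
end

# Hypothetical Order object with a price method.
Order = Struct.new(:price)

statsd = RecordingStatsd.new
orders = [Order.new(20), Order.new(35), Order.new(45)]

orders.each do |order|
  statsd.timing 'order_revenue', order.price
end

p statsd.sent
```

With a real client in place of the stand-in, the dashboard would show the number of orders (count), the total revenue (sum), and the average order value (mean) for the period.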

Manufacturing? Use a Timer to easily summarize the number of widgets produced per day, or per month.

No matter what your domain is, the StatsD Timer offers a powerful way to instrument your application.

Quick Commercial Break:

You can take StatsD Timing even further with the StatsD aggregator built into our Scout agent. Get unified alerting and dashboards within minutes of adding your instrumentation code.

Not a Scout customer? Check out our free 14 day trial here - and get started with StatsD today!

Questions? Comments? Contact us! For more information about StatsD and server monitoring, follow us on Twitter.
