Last week, one of our application servers died. We have four app servers, so in theory, the death of one app server shouldn't bring the entire platoon down. However, real life had other plans: 95% of requests were handled fine, but around 5% were being dropped. Here's the story of how we diagnosed and fixed the issue with our realtime charts.
Gavin Stark and the Real Digital Media team are Campfire power users. Back in April, Gavin created a Hubot script to send Scout alerts to their chatroom. Now, they needed visuals:
We wanted to grab some of the graphics that Scout’s in-app page offers to users to provide us with visual feedback on historical norms for each metric without leaving the chat room. A call to Scout support got graphics ported into the API in less than a day, and we were off and running.
His sparkline update hasn’t been merged into the Hubot repo yet, but you can view his fork.
Josh Nichols of Rails Machine on their monitoring philosophy:
Measure all the metrics and alert on metrics that are actionable.
If you’ve got a bunch of servers, you’re going to want to read Josh Nichols’ How we roll with Scout article on the Rails Machine blog.
Rails Machine is a Ruby on Rails-focused managed hosting provider, which means they’re on the front lines when performance goes bad. From plugins to alerting, Josh details how they stay proactive on performance.
Want to get Scout alerts piped into your Campfire room? You can, thanks to Scout user Gavin Stark’s Hubot script for Scout. Gavin describes the advantages for his team at Real Digital Media:
Our support staff can now see the alerts from Scout as a team. We combine this with other monitoring services that monitor ping-ability and web response speeds.
Getting Scout alerts in Campfire means we can discuss them inline and respond quickly. We’ve found the immediacy of Campfire to be an improvement over email.
Want to get your Scout alerts in Campfire? You’ll need a Hubot.
Need to set up a Hubot?
Hubot is a program that listens in on your chat room. He responds to commands and can provide notifications. Your Hubot needs to run someplace. Most people set him up on Heroku, since it’s A) really simple and B) free!
- follow these instructions to deploy Hubot to Heroku.
- once your Hubot has joined your Campfire room and responds to commands (try: hubot help), continue below to configure Scout to talk to your Hubot.
Already Got a Friendly Hubot?
Two easy steps:
- add the Hubot Script for Scout into your Hubot’s
- in your Scout account, click on “Notifications,” then set the webhooks URL to
That’s it! Try creating a Scout trigger that fires immediately to test it out.
Thanks again to Scout user Gavin Stark for writing the Hubot script, and to Hubot maintainer Tom Bell for the quick merge.
Sometimes we have an immediate need to watch for a term in a log file. For example, if we’re doing a major deploy, we might watch for the term “error” in a log file. We want to make sure the rate of errors doesn’t increase.
To do this, we’ll use Yaroslav Lazor’s Log Watcher Scout Plugin. We just had a great use case for this.
Yesterday we released a preview of Redwood, a MacRuby app we’re building. Redwood works just like Spotlight on your OS X desktop but searches the web apps we commonly use at Scout (Gmail, Google Docs, and Basecamp).
To track the number of downloads, we configured the plugin to watch for the term “Redwood.zip” in the Apache access log, and we’re tracking Redwood downloads:
This continues to be one of my favorite Scout plugins: the biggest reason we don’t monitor important metrics is that setting up monitoring is a pain. This plugin eliminates that excuse.
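The idea of watching a log for a term can be sketched in a few lines of Python. This is not the Log Watcher plugin's code, just a minimal illustration of the technique under some assumptions: the function name `count_term` and the byte-offset bookkeeping are hypothetical, and a shrinking file is treated as a rotated log.

```python
def count_term(path, term, offset=0):
    """Count lines containing `term` in the part of the file after `offset`.

    Returns (matches, new_offset) so the next run can resume where this
    one stopped instead of re-reading the whole log.
    """
    matches = 0
    needle = term.encode("utf-8")
    with open(path, "rb") as log:
        log.seek(0, 2)            # jump to the end to learn the current size
        if log.tell() < offset:   # file shrank: assume it was rotated
            offset = 0
        log.seek(offset)
        for line in log:
            if needle in line:
                matches += 1
        new_offset = log.tell()
    return matches, new_offset
```

Calling it on a schedule and storing the returned offset gives a count per interval, for example `count_term("/var/log/apache2/access.log", "Redwood.zip", last_offset)`, where the log path is only an example and will differ per server.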
Last week we added a third web server to one of our reporting applications. We’ve been growing at a steady rate and we wanted to reduce the load across our web tier (losing one of the web servers could put too much traffic on the remaining server).
Before Will Farrington (one of the fine folks at Rails Machine) added the third web server to the load balancer rotation, we set up a couple of charts to watch the magic.
Scout’s charts now refresh as metrics are reported so we could quickly see the impact.
Did the third web server help? Here’s what we saw:
Our third web server helped decrease the load across our web tier:
Scout’s Server Load plugin is installed by default on your server.
We confirmed the change in request distribution across the 3 web servers:
Install either the Apache or Ruby on Rails Monitoring plugin to view request metrics.
We love seeing visual confirmation of a job well done!
You might be familiar with Linux load averages already. Load averages are the three numbers shown by the uptime and top commands. They look like this:
load average: 0.09, 0.05, 0.01
Most people have an inkling of what the load averages mean: the three numbers represent averages over progressively longer periods of time (one, five, and fifteen minute averages), and lower numbers are better. Higher numbers represent a problem or an overloaded machine. But what's the threshold? What constitutes "good" and "bad" load average values? When should you be concerned about a load average value, and when should you scramble to fix it ASAP?
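To look at these numbers from code, Python exposes the same three values through `os.getloadavg()` (Unix only). The sketch below also divides each value by the core count, which is a common rule of thumb for comparing machines of different sizes and an assumption of this example, not something stated above. The helper name `per_core_load` is hypothetical.

```python
import os


def per_core_load(load_averages, cores):
    """Divide each load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(value / cores for value in load_averages)


if __name__ == "__main__":
    # One, five, and fifteen minute averages, as printed by uptime and top.
    one, five, fifteen = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    print("load average:", one, five, fifteen)
    print("per core:", per_core_load((one, five, fifteen), cores))
```

Under that rule of thumb, a per-core value that stays near or above 1.0 suggests the machine has more runnable work than CPUs to run it, but where to draw the line depends on the workload.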
Determining a web application’s hardware resources isn’t easy (or cheap). Frankly, it’s often just guesswork. Even when you build benchmarking scripts, they can miss key behaviors and ignore important metrics.
Scaling becomes a lot less stressful when you can quickly compare a history of your application data with server performance.
For example, we did this to get a better understanding of how our Scout server performed during our invitation process. The graph below was generated through Scout and shows the relationship between user accounts and the server load. As we expected, the overall load on the server increased as the number of accounts increased. Scout shows us how this data is correlated – it gives us an idea of how many accounts our current hardware can support.
Scout Accounts vs. Server Load
It’s a trivial process to regularly feed Scout your application data (user signups, orders, revenue, etc.):
- Start with this Rails App Plugin Sample (this assumes a Ruby on Rails application, but you can do this with any framework/language)
- Grab your application data – just use ActiveRecord!
- Put the plugin on your server (you can protect it behind basic auth)
- Add the plugin
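The "grab your application data" step can be sketched as a small query function. The steps above assume Ruby on Rails and ActiveRecord; the version below is a Python stand-in using the standard `sqlite3` module. The table and column names (`accounts`, `orders`, `total`) are hypothetical, and it does not show Scout's plugin API, only the collection of numbers that a plugin would report.

```python
import sqlite3


def collect_metrics(conn):
    """Return the application numbers to report as a flat dict.

    Assumes hypothetical `accounts` and `orders` tables; adjust the
    queries to match the real schema.
    """
    accounts = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
    orders, revenue = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(total), 0) FROM orders"
    ).fetchone()
    return {"accounts": accounts, "orders": orders, "revenue": revenue}
```

Keeping the result a flat mapping of metric name to number makes it easy to graph each value over time next to server metrics such as load.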
We’re using Scout, our monitoring and reporting application, to graph the performance of our Rails applications and servers.
I’ve uploaded a video that looks at how one of our applications, PlaceShout, impacts the server load and Mongrel memory usage. I also compare PlaceShout’s footprint to another server.
Watch the video:
Graphing in Scout (1 min 47 sec)
Past Videos on Scout:
Installing the Scout Client (1 min 39 sec)
Installing the Rails Requests Plugin (1 min 55 sec)
Sign up for our launch email list
We’ve started emailing invites to Scout. Sign up on our homepage, and we’ll give you access to Scout before the public launch.