"Updates" Posts


Real Time Infographic

By Derek • Posted in Updates

It’s been a little over a week since we rolled out real-time monitoring. Some people think it’s pretty cool.

To commemorate one week of real time, our art department put together a basic infographic on its usage. You’re spending more time watching lines move on the screen than YouTube Nyan Cat videos. We’re damn proud of that.

Learn more about real time server monitoring with Scout.

 

The Year At Scout: 2011 Edition

By Derek • Posted in Updates

I’ll remember 2011 as the year of fine-tuning at Scout. From your experience with the end product to our experience delivering it, we invested a lot of our time sanding Scout.

Feature Highlights

  • Dashboards – Combine plugin displays and charts across servers for a complete view of your infrastructure’s health.
  • Easier Plugin Development – Since we started Scout in 2008, we’ve believed a big reason services go unmonitored is that writing and testing monitoring scripts is painful. We’ll continue working to make plugin development easier.
  • Scout API – With the scout_api Ruby gem, you can slice and dice your metrics as you see fit.
  • Fullscreen Charts – Because you look smarter with pretty lines moving on a wall-mounted plasma display.
  • Spotlight-like Server Navigation – You’re monitoring more servers with Scout, so we made it easier to navigate between them.

A Banner Year for Plugins: 200+ commits from 20 authors

With nearly 60 monitoring plugins in our directory, Scout’s breadth of coverage grew substantially in 2011. Some of the highlights:

But it’s not just new plugins that I was excited about: Scout plugins grew considerably more robust. Our plugin repository on GitHub had more than 200 commits from 20 authors in 2011. Thanks for all of your pull requests and bug reports: there are few things I love more than solidifying Scout’s core.

Blog Post Highlights

We think it’s important for you to know how our brains function at Scout. Some of the highlights (or lowlights, depending on your interpretation):

Implementing our succession plan

We plan on Scout being around for a long time. 2011 saw the start of our grand succession plan: both Andre and I brought home baby girls this year. While my two-month-old daughter is lacking focus at this point, I’m confident she’ll be ready to take over my share of Scout by 2035.

2012

I’ve never been more excited about Scout. We eliminated a lot of cruft from Scout in 2011, which is freeing us up to make monitoring even easier in 2012. Thanks for your continued feedback, bug reports, and plugin commits: you’re making Scout better every year.

 

Our poor choice of words: "Server Down" alerts are now "No Data" alerts

By Derek • Posted in Updates

What we’ve been referring to as “Server Down Notifications” suffer from two problems:

  • False Positives – The last thing you need is an alert from Scout at 3am telling you that your server is down when it isn’t.
  • An overaggressive email subject – The subject line of these emails looks like “Server is DOWN”. However, this isn’t accurate. We just know that Scout hasn’t received data from this server. That doesn’t necessarily mean the server is down.

We’re making two changes to these alerts:

  • The email subject now states “Server isn’t reporting”. This is really all we know.
  • These alerts used to fire when the agent didn’t report for five minutes. This was overly aggressive. While it doesn’t happen often, things can go wrong between your server and ours that can cause a reporting pothole. Alerts are now sent when the agent hasn’t reported for 30 minutes. It’s important to tell you when monitoring isn’t working, but not important enough to risk sending a false positive because of a short reporting outage.
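The threshold change above can be sketched as a simple staleness check. This is a minimal illustration, not Scout's actual implementation: the function name `is_not_reporting` and the timestamps are hypothetical, and only the 5-minute and 30-minute values come from this post.

```python
from datetime import datetime, timedelta

# Threshold from the post: alerts now fire after 30 minutes of silence
# (the old behavior was 5 minutes).
NO_DATA_THRESHOLD = timedelta(minutes=30)


def is_not_reporting(last_report_at, now, threshold=NO_DATA_THRESHOLD):
    """Return True when the agent has been silent for at least `threshold`.

    Hypothetical helper for illustration; it says nothing about whether
    the server itself is up, only that no data has arrived.
    """
    return now - last_report_at >= threshold


# Made-up timestamps: a 7-minute reporting gap and a 45-minute one.
now = datetime(2012, 1, 1, 3, 0)
short_gap = now - timedelta(minutes=7)
long_gap = now - timedelta(minutes=45)

print(is_not_reporting(short_gap, now))  # False: a brief gap no longer alerts
print(is_not_reporting(long_gap, now))   # True: 45 minutes of silence alerts
# Under the old 5-minute threshold, the same brief gap would have alerted:
print(is_not_reporting(short_gap, now, threshold=timedelta(minutes=5)))  # True
```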

So, what do we suggest to verify that a server is down? An external monitoring service like Pingdom is a good option. Services like Pingdom can alert you if a server can’t be pinged and/or isn’t accepting SSH/TCP connections. Failed external checks and a lack of internal metrics from Scout often indicate that a server really is dead.
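As a rough sketch of that reasoning, the snippet below combines an external reachability check with the absence of Scout data. It is not how Pingdom or Scout work internally: `tcp_port_open` and `likely_down` are hypothetical helpers, and the TCP connect is only a stand-in for whatever external check you use.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Hypothetical stand-in for an external check such as an SSH port probe.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def likely_down(external_check_ok, scout_reporting):
    """Treat a server as probably dead only when BOTH signals agree:
    the external check fails and Scout has received no data."""
    return (not external_check_ok) and (not scout_reporting)


# Either signal alone is ambiguous; only the combination is a strong hint.
print(likely_down(external_check_ok=True, scout_reporting=False))   # False
print(likely_down(external_check_ok=False, scout_reporting=True))   # False
print(likely_down(external_check_ok=False, scout_reporting=False))  # True
```

In practice you would feed the first argument from something like `tcp_port_open("your-server.example.com", 22)`, where the hostname is a placeholder.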

We’re confident that (1) keeping our finger off the alarm button a bit longer and (2) calling these alerts what they really are will give you a more comfortable experience. We’re developers too, and we know the heartburn a 3am SMS causes.

 

2010 Scout Highlights

By Derek • Posted in Updates

2010 was a busy year at Scout. The highlights:

Top Features

Top Plugins

View all Scout Plugins

Top Blog Posts

You gave us great feedback on your Scout experience in 2010. We’re taking that feedback and making it even easier to monitor your server cluster in 2011. Subscribe to our RSS feed or follow us on Twitter for the latest updates.

 

Simplify. Get an order of magnitude speedup.

By Andre • Posted in Updates

Have you noticed Scout feels snappier lately? We made some major simplifications that sped things up a lot. Here’s the CPU load on one of our DB servers:

(and yes, we use Scout to monitor itself!)

Even better, the response time for users improved dramatically.

Scout’s longest actions before and after the speedup:

The simplification

What yielded such a dramatic speedup? Earlier this year, we implemented a very generic datastore and reporting system. It could handle all sorts of data, relationships within the data, etc.

Unfortunately, we never got to demonstrate all the benefits of this cool system. It wasn’t viable from either a maintenance or a performance standpoint.

So we rolled it back. And we got back a ton of performance, as you can see.

The lessons …

I will be writing up some business lessons we learned from this experience—stay tuned!

 

EC2 CloudWatch graphs, trends, and alerts

By Andre • Posted in Updates, Plugins

If you're using Amazon EC2, you may be familiar with CloudWatch, Amazon's analytics system that provides metrics on the CPU usage, network I/O, and disk I/O of your instances. While CloudWatch collects metrics, it doesn't provide a web interface for viewing them, and it doesn't offer graphs, trending, or alerting.

Enter our Scout EC2 CloudWatch plugin. As with any other Scout plugin, you can graph the resulting metrics, set triggers, track trends, and get email alerts when the numbers go out of bounds.


What does it monitor?

The CloudWatch plugin captures the following metrics ("measures", as EC2 calls them): NetworkIn, NetworkOut, DiskReadOps, DiskWriteOps, DiskReadBytes, DiskWriteBytes, CPUUtilization.

Note: this plugin does not fetch EC2 Load Balancer metrics, only EC2 instance metrics.


Single Instance, Autoscaling Group, etc.

The EC2 CloudWatch plugin can capture metrics from a single EC2 instance, or it can aggregate metrics along one of several dimensions: across a given instance type, across all instances launched from a specific image (AMI), or across a specified autoscaling group. That means you can, for example, graph the performance of your application server autoscaling group as a whole, or graph just your memcached instance.
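To make the idea of aggregating along a dimension concrete, here is a small sketch. It does not call CloudWatch or reproduce the plugin's code: the sample records and the `average_by` helper are made up for illustration, and only the dimension names (InstanceId, InstanceType, ImageId, AutoScalingGroupName) and the CPUUtilization metric name are CloudWatch's own.

```python
from collections import defaultdict
from statistics import mean

# Made-up CPUUtilization samples, one per instance. All IDs and values
# are illustrative only.
samples = [
    {"InstanceId": "i-aaa", "InstanceType": "m1.small", "ImageId": "ami-111",
     "AutoScalingGroupName": "app-servers", "CPUUtilization": 40.0},
    {"InstanceId": "i-bbb", "InstanceType": "m1.small", "ImageId": "ami-111",
     "AutoScalingGroupName": "app-servers", "CPUUtilization": 60.0},
    {"InstanceId": "i-ccc", "InstanceType": "m1.large", "ImageId": "ami-222",
     "AutoScalingGroupName": None, "CPUUtilization": 10.0},
]


def average_by(samples, dimension, metric="CPUUtilization"):
    """Average `metric` over all samples that share a value for `dimension`.

    Hypothetical helper; samples with no value for the dimension are skipped.
    """
    buckets = defaultdict(list)
    for sample in samples:
        key = sample.get(dimension)
        if key is not None:
            buckets[key].append(sample[metric])
    return {key: mean(values) for key, values in buckets.items()}


# The autoscaling group as a whole:
print(average_by(samples, "AutoScalingGroupName"))  # {'app-servers': 50.0}
# Per instance type:
print(average_by(samples, "InstanceType"))  # {'m1.small': 50.0, 'm1.large': 10.0}
# A single instance (say, the memcached box):
print(average_by(samples, "InstanceId")["i-ccc"])  # 10.0
```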

Enabling CloudWatch

To use this plugin, you have to enable CloudWatch for the instance(s) you want to collect metrics from. Basically, it's just ec2-monitor-instances ##### from the command line, or passing a monitoring parameter to ec2-run-instances. See Amazon's CloudWatch docs for details.

New to Scout?

If you're learning about Scout through this plugin, sign up for a trial Scout account to give this plugin a try. You can graph all kinds of metrics and measurements from all your servers. It works with cloud instances, VPSes, and dedicated hardware.

 
