EC2 CloudWatch graphs, trends, and alerts

By Andre • Posted in Updates, Plugins

If you're using Amazon EC2, you may be familiar with CloudWatch, Amazon's monitoring service that provides metrics on the CPU usage, Network I/O, and Disk I/O of your instances. While CloudWatch collects metrics, it doesn't provide a web interface for viewing them, nor graphs, trending, or alerting.

Enter our Scout EC2 CloudWatch plugin. Like any other Scout plugin, you can graph the resulting metrics, set triggers, track trends, and get email alerts when the numbers go out of bounds.


What does it monitor?

The CloudWatch plugin captures the following metrics ("measures," in EC2's terminology): NetworkIn, NetworkOut, DiskReadOps, DiskWriteOps, DiskReadBytes, DiskWriteBytes, and CPUUtilization.

Note: this plugin fetches only EC2 instance metrics, not EC2 Load Balancer metrics.


Single Instance, Autoscaling Group, etc.

The EC2 CloudWatch plugin can capture metrics from a single EC2 instance, or it can aggregate metrics across one of several dimensions: a given instance type, all instances launched from a specific image (AMI), or a specified autoscaling group. That means you can, for example, graph the performance of your application server autoscaling group as a whole, or graph just your memcached instance.

Enabling CloudWatch

To use this plugin, you have to enable CloudWatch for the instance(s) you want to collect metrics from. Basically, it's just ec2-monitor-instances <instance-id> from the command line for a running instance, or passing a monitoring parameter to ec2-run-instances when launching one. Both are covered nicely in Amazon's CloudWatch docs.

New to Scout?

If you're learning about Scout through this plugin, sign up for a trial Scout account to give it a try. You can graph all kinds of metrics and measurements from all your servers. It works with cloud instances, VPSs, and dedicated hardware.

 

RubyKaigi 2009 wrap up

By Andre

James Gray's July 19th talk at RubyKaigi 2009 focused on best practices for long-running Ruby daemon processes.


What types of questions did the audience ask? What did they seem most interested in?

In general, users always want to know about our RRD usage, extracting the daemon functionality from Scout's agent, and the agent's memory usage. It was the same at RubyKaigi. The questions reminded me of how much current Ruby RRD solutions suck and that it's time we did something about that. It also reminded me that I need to get around to extracting our daemon code, which I've always intended to do.

Read More →

 

Moving from FiveRuns to Scout

By Derek

As FiveRuns announced on their blog, they have declared End-of-Life for FiveRuns Manage. We have made arrangements with FiveRuns to ease the transition for customers who still need a robust, easy-to-use monitoring solution.

For current FiveRuns customers, we are offering 50% off your first paid month with Scout. Note that this offer is only for current FiveRuns Manage customers, and that it expires in one week (August 19th). Of course, like any other Scout signup, it’s risk-free: your first month is free (and your second month is half-off), and you can cancel, upgrade, or downgrade at any time.

FiveRuns Manage customers: use your discount code on our signup page, and welcome to Scout!

Getting Started

Getting started with Scout is very straightforward, and the signup process guides you through all the steps. The main difference from FiveRuns Manage is that you choose the components you want to monitor by selecting plugins. You can add or remove plugins at any time, and we offer some suggestions for getting started below.

Your basic process is this:

  1. Install the gem (sudo gem install scout_agent) and start it with the server key you’re given on signup.
  2. Select one or more plugins from the directory. The Server Load, Disk Usage, and Memory Profiler are easy plugins to get started with.
  3. Customize or add Triggers. Scout uses triggers to alert you of spikes or trends in the data being gathered—for example, “alert me when the five-minute load average exceeds 4.0.” Plugins come with default triggers, and you can customize them as needed.
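
The trigger idea in step 3 can be illustrated with a toy Ruby sketch (hypothetical names, not Scout's actual trigger code):

```ruby
# A toy trigger: fire when a named metric in a reading exceeds a maximum.
# This mirrors a rule like "alert me when the five-minute load average
# exceeds 4.0".
Trigger = Struct.new(:metric, :max) do
  def fired?(reading)
    reading[metric] > max
  end
end

load_trigger = Trigger.new(:load_five, 4.0)
puts load_trigger.fired?(load_five: 2.1)  # within bounds, no alert
puts load_trigger.fired?(load_five: 5.3)  # out of bounds, time to alert
```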

Let us know if you have questions!

 

Understanding Linux CPU Load - when should you be worried?

By Andre • Posted in Examples

You might be familiar with Linux load averages already. Load averages are the three numbers shown with the uptime and top commands - they look like this:

load average: 0.09, 0.05, 0.01

Most people have an inkling of what the load averages mean: the three numbers represent averages over progressively longer periods of time (one-, five-, and fifteen-minute averages), and lower numbers are better. Higher numbers indicate a problem or an overloaded machine. But what's the threshold? What constitutes "good" and "bad" load average values? When should you be concerned about a load average, and when should you scramble to fix it ASAP?
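
As a rough rule of thumb (a simplified sketch, not Scout's plugin code), a load average is best judged relative to the number of CPU cores, which a short Ruby helper can illustrate:

```ruby
# Classify a load average relative to core count: a machine with n cores
# can comfortably run a load of about n. The 0.7 "headroom" threshold
# here is an illustrative heuristic, not a hard rule.
def load_status(one_min_load, cores)
  ratio = one_min_load / cores.to_f
  if ratio <= 0.7
    :ok          # plenty of headroom
  elsif ratio <= 1.0
    :busy        # fully utilized; keep an eye on it
  else
    :overloaded  # work is queuing up faster than the CPUs can drain it
  end
end

puts load_status(0.09, 1)  # the low load from the example above
puts load_status(4.0, 2)   # two cores facing a load of 4.0
```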

Read More →

 

Presenting the Scout Agent at Ruby Kaigi - Tokyo, Japan July 19

By Derek • Posted in Updates

Scout takes a trek to Ruby’s birthplace – Japan – as James Gray presents How Lazy Americans Monitor Servers at the sold out Ruby Kaigi.

James’ July 19th talk focuses on the architecture of the Scout agent, the Ruby gem that is installed on a server you wish to monitor using Scout.

James will dig into the technical details of the agent’s division of labor approach for preventing memory leaks and crashes.

 

Q&A with the Scout Agent - An overview

By Derek • Posted in Updates

Our recent update to Scout featured a revised UI, more functionality, and a new Scout Agent. While it’s easy to see the changes in the UI, a lot of the work conducted by the agent happens beneath the surface.

The Scout Agent, which is installed on a server you wish to monitor, was kind enough to sit down and walk me through its DNA (note that the ability to answer human questions is currently not available in the most recent release).

First, tell me a bit about what you’re made of.

I’m just a plain-old Ruby gem that you can install on any Linux-based server (sudo gem install scout_agent).

So, you’re a daemon right? Aren’t long-running Ruby tasks known to leak memory?

Yes, I’m a daemon. And yes, Ruby, like many programming languages, can leak memory when run for a long period of time.

My strategy for preventing memory leaks is simple: I do real work, like running plugins, in a separate short-lived process. I fork(), do whatever, and exit() so the OS can clean up any mess.
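
That fork-and-exit pattern can be sketched in a few lines of Ruby (a simplified illustration, not the agent's actual code):

```ruby
# Run a block of risky work in a short-lived child process and return
# its result to the parent over a pipe. When the child exits, the OS
# reclaims all of its memory, so any leaks die with it.
def run_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    result = yield                       # do the work in the child
    writer.write(Marshal.dump(result))   # ship the result to the parent
    writer.close
    exit!(0)                             # exit; let the OS clean up
  end
  writer.close
  data = reader.read                     # read until the child closes the pipe
  Process.wait(pid)                      # reap the child
  Marshal.load(data)
end

puts run_in_child { 21 * 2 }  # the work happens in a short-lived process
```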

What’s your strategy to prevent the agent from crashing? Obviously, it’s important that monitoring software keeps running.

My work is divided into two main processes and several short-lived processes:
  • Lifeline – A single process that watches over all other agent processes. If a process fails to check in with the lifeline regularly, I force it to stop and replace it with a healthy process.
  • Master – This is the event loop of the agent and is the main process monitored by the lifeline. It just sleeps and runs plugins in a never-ending cycle.
  • Missions – These processes execute the plugin code. These are small processes that exist only when plugins are running.

The reason for this division of labor? The real work is executed by the short-lived mission processes. By offloading the work to them, the risk that a misbehaving plugin degrades my performance, or raises an exception that kills me off, is greatly reduced.
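
The lifeline's check-in mechanism can be sketched like this (hypothetical names, not the agent's actual code):

```ruby
# A lifeline-style watchdog in miniature: the monitored process calls
# checkin periodically; the watchdog treats a stale heartbeat as a sign
# the process is hung and should be replaced.
class Lifeline
  STALE_AFTER = 60  # seconds without a check-in before we intervene

  def initialize
    @last_checkin = Time.now
  end

  def checkin
    @last_checkin = Time.now
  end

  def stale?(now = Time.now)
    (now - @last_checkin) > STALE_AFTER
  end
end

watchdog = Lifeline.new
watchdog.checkin
puts watchdog.stale?                  # false right after a check-in
puts watchdog.stale?(Time.now + 120)  # true once the heartbeat goes stale
```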

It’s easier to write 200 lines of bug-free code than 3000. The 200 LOC (my lifeline) keeps the rest alive.

Read More →

 
