New Plugin: Simple Port Check

By Andre

Simple Port Check is, well, simple: give it a list of ports, and it checks that each port will accept a TCP connection.
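
To make the idea concrete, here is a minimal sketch of that kind of check in Python. It is not the plugin's own code, and the function name `check_ports` and its parameters are made up for illustration. It tries to open a TCP connection to each port and records whether the connection succeeded.

```python
import socket

def check_ports(host, ports, timeout=3.0):
    """Return a dict mapping each port to True if a TCP connection was accepted.

    Hypothetical helper for illustration; not part of the Scout plugin.
    """
    results = {}
    for port in ports:
        try:
            # create_connection performs the TCP handshake; the socket is
            # closed again as soon as the with-block exits.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[port] = False
    return results

def failed_ports(results):
    """List the ports that did not accept a connection, in sorted order."""
    return sorted(port for port, ok in results.items() if not ok)
```

Calling `failed_ports(check_ports("localhost", [22, 80, 443]))` would give the list of ports worth alerting on. An empty list means every port accepted a connection.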

If the plugin detects that one or more ports have stopped accepting TCP connections, you’ll get an email notification. When the ports are available again, you’ll get another email.

The link again: Simple Port Check Plugin. Any questions or feedback, feel free to drop us an email.

 

Monitoring HAProxy

By Derek, posted in Plugins

HAProxy is a rock-solid, high performance TCP/HTTP load balancer. It’s what we use at Scout to proxy traffic to our app servers.

We’ve added a plugin to monitor HAProxy: just specify the URL to your HAProxy stats page and the name of the proxy. You’ll get the request rate, error rate, and proxy status for the specified proxy.

Why monitoring the load balancer is important

The load balancer is one of the best places to narrow down the cause of an outage. If none of the servers in a proxy are processing requests, it’s often an indication of an outage further down the stack (e.g., the database). If the problem is isolated to a specific app server, that downed app server is the first place to look.

The Scout HAProxy plugin gathers its metrics by parsing the output of the HAProxy Status page. It’s one of my favorite open source status pages: the page loads quickly and the color-coded statuses make it easy to focus on problem areas during an incident. The Scout plugin and the HAProxy status page play well together: if the plugin records a dramatic change in the throughput, error rate, or the proxy status, we’ll pull up the HAProxy status page and dig further.
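
As a rough sketch of what parsing the stats page involves (this is not the plugin's code): HAProxy can serve its stats as CSV, with a header line that starts with `# pxname,svname,...`. The sample text below is made up and only carries a subset of the real columns, and the function names are hypothetical.

```python
import csv
import io

# Made-up sample with a subset of HAProxy's CSV columns, for illustration only.
SAMPLE_CSV = """# pxname,svname,scur,stot,ereq,econ,eresp,status,rate
app,FRONTEND,3,1200,2,,,OPEN,15
app,web1,1,600,,0,1,UP,7
app,web2,0,0,,4,0,DOWN,0
app,BACKEND,1,600,,4,1,UP,7
"""

def _rows(csv_text):
    # Strip the leading "# " so the first line can serve as the CSV header.
    return csv.DictReader(io.StringIO(csv_text.lstrip("# ")))

def proxy_summary(csv_text, proxy_name):
    """Summarize the BACKEND row of one proxy, or return None if it is absent."""
    for row in _rows(csv_text):
        if row["pxname"] == proxy_name and row["svname"] == "BACKEND":
            return {
                "status": row["status"],
                # Empty fields are treated as zero.
                "request_rate": int(row["rate"] or 0),
                "errors": int(row["econ"] or 0) + int(row["eresp"] or 0),
            }
    return None

def down_servers(csv_text, proxy_name):
    """Names of individual servers in the proxy whose status is not UP."""
    return [
        row["svname"]
        for row in _rows(csv_text)
        if row["pxname"] == proxy_name
        and row["svname"] not in ("FRONTEND", "BACKEND")
        and not row["status"].startswith("UP")
    ]
```

With the sample above, the proxy as a whole still reports UP while `down_servers` singles out `web2`. That is the "isolated to a specific app server" case described earlier.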

Installation Notes

Like all Scout plugins, installing the HAProxy plugin is just a couple of mouse clicks away. Just click the button in the Scout UI and select the HAProxy Monitoring plugin. Note that the plugin has one dependency: the fastercsv gem must be installed. See the plugin directory entry for more details on the plugin.

Credit

A big thanks to Jesse Newland for the initial version of this plugin.

 

Our poor choice of words: "Server Down" alerts are now "No Data" alerts

By Derek, posted in Updates

What we’ve been referring to as “Server Down Notifications” suffer from two problems:

  • False Positives – The last thing you need is an alert from Scout at 3am telling you that your server is down when it isn’t.
  • An overaggressive email subject – The subject line of these emails looks like “Server is DOWN”. However, this isn’t accurate. We just know that Scout hasn’t received data from this server. That doesn’t necessarily mean the server is down.

We’re making two changes to these alerts:

  • The email subject now states “Server isn’t reporting”. This is really all we know.
  • These alerts used to fire when the agent didn’t report for five minutes. This was overly aggressive. While it doesn’t happen often, things can go wrong between your server and ours that can cause a reporting pothole. Alerts are now sent when the agent hasn’t reported for 30 minutes. It’s important to tell you when monitoring isn’t working, but not important enough to risk sending a false positive because of a short reporting outage.
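
The new rule boils down to a single time comparison. A minimal sketch, assuming the last report time is tracked per server (the names here are illustrative, not Scout's internals):

```python
from datetime import datetime, timedelta

# Threshold taken from the change described above: 30 minutes, up from 5.
NO_DATA_THRESHOLD = timedelta(minutes=30)

def should_send_no_data_alert(last_report_at, now, threshold=NO_DATA_THRESHOLD):
    """True once the agent has been silent for at least the threshold."""
    return now - last_report_at >= threshold
```

A ten-minute reporting gap would have fired the old five-minute alert; under the 30-minute threshold it passes silently.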

So, what do we suggest to verify that a server is down? An external monitoring service like Pingdom is a good option. Services like Pingdom can alert you if a server can’t be pinged and/or isn’t accepting SSH/TCP connections. Failed external checks and a lack of internal metrics from Scout often indicate that a server really is dead.

We’re confident that (1) keeping our finger off the alarm button a bit longer and (2) calling these alerts what they really are will give you a more comfortable experience. We’re developers too, and we know the heartburn a 3am SMS causes.

 

Making Scout feel young again: our 4-part tonic

By Derek, posted in Development

Scout is no longer a puppy: in dog years, he’s old enough to drink, get drafted, and rent a car. During that time, cruft gathered around the edges of our server infrastructure.

We’ve been using a hodgepodge of server hardware, some performing multiple roles, some not, all individually configured and tuned. Small changes to our stack seemed to involve a lengthy checklist. Our staging environment didn’t mirror production: what happened on staging didn’t always happen on production. Finally, database changes were painful.

We wanted to get lean in the right places: could we make the young adult Scout as easy to manipulate as the baby Scout? 7.weeks.ago we followed a four-part process to get there.

Read More →

 

Monitoring Apache ZooKeeper

By Derek, posted in Plugins

With over 50 million plays, OMGPOP – the free multiplayer game site – is logging a lot of data. Tracking stats like app downloads and launches paints a picture of how their games are performing.

This logging data is collected via Flume, a system for collecting streaming data, and delivered to a Hadoop Distributed File System (HDFS). So, how do you keep your Flume nodes configured in a consistent manner?

Enter Apache ZooKeeper, “a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services” (from the ZooKeeper homepage). Michael Fielder, an NYC-based freelance systems operations engineering consultant, recently created a Scout Plugin for monitoring ZooKeeper at OMGPOP. The plugin parses the output of the srvr command on the installed server, reporting key ZooKeeper metrics. Additionally, an error is generated if ZooKeeper is not running.
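
For a feel of what that parsing looks like, here is a hedged sketch in Python rather than the plugin's actual code. `srvr` is one of ZooKeeper's four-letter commands: you send the four bytes to the client port (2181 by default) and read back `Key: value` lines. The function names and the sample output are invented for illustration.

```python
import socket

# Invented sample in the general shape of srvr output, for illustration only.
SAMPLE_SRVR = """Zookeeper version: 3.3.3-1073969, built on 02/23/2011 22:27 GMT
Latency min/avg/max: 0/2/45
Received: 1043
Sent: 1042
Outstanding: 0
Zxid: 0x1000004a2
Mode: follower
Node count: 27
"""

def fetch_srvr(host="localhost", port=2181, timeout=3.0):
    """Send 'srvr' to ZooKeeper and return the raw text reply.

    Raises OSError if nothing is listening, which a monitor could
    report as "ZooKeeper is not running".
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def parse_srvr(output):
    """Turn 'Key: value' lines into a dict, splitting latency into three ints."""
    metrics = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip anything that is not a key/value line
        metrics[key.strip()] = value.strip()
    if "Latency min/avg/max" in metrics:
        lat_min, lat_avg, lat_max = metrics.pop("Latency min/avg/max").split("/")
        metrics["latency_min"] = int(lat_min)
        metrics["latency_avg"] = int(lat_avg)
        metrics["latency_max"] = int(lat_max)
    return metrics
```

Keeping the network call separate from the parsing makes the parser easy to exercise against saved output, with no running ZooKeeper needed.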

Read More →

 

Dashboards (for Ultimate Accounts)

By Andre, posted in Features

One of the most common feature requests we get is the ability to place multiple charts on a dashboard page. We’re launching this feature today. It’s currently available for Ultimate accounts only.

What are Dashboards?

  • Multiple Charts: display any number of charts on one page. Arrange and resize the charts by dragging.
  • Plugins, too: want to display detailed information from plugins on one or more servers? Add any number of plugins to your dashboard alongside your charts.
  • External monitor ready: full-page mode gives you maximum real estate, in case you want to put a dashboard on a dedicated external monitor.
  • Auto-refresh: everything on dashboards auto-refreshes every five minutes.

Get started

Ultimate accounts can access Dashboards from the top menu.

Feedback?

As always, we welcome feedback on new features—feel free to drop us an email with your thoughts.

 
