StatsD. Build vs. Buy?

By Mark · Posted in Business


"Remember that time is money."

A phrase used by Benjamin Franklin in Advice to a Young Tradesman, Written by an Old One. Sage advice - and even after all these years, still relevant!

It's no surprise that we all look for tools that make our jobs easier, deliver value for our customers -- and ultimately save us time.

StatsD is one of those tools.

In a recent conversation with Martin Kelly, we talked about StatsD and how it saved him time.

Here's what he had to say:

"We don’t use StatsD for in-app instrumentation, but for user experience tracking. The client will ask us to measure a discrete element of the user experience, and with a couple lines of Python we can put together a test and start tracking it and graphing it within minutes."

Furthermore, he talked about the importance of dashboards and how visualizing the data helped him:

"In combination with the really quick-to-assemble dashboards in Scout, it allows us to see what system components are potentially affecting user experience, which does two things: lets us narrow down our diagnostic root through trivially easy correlation really quickly, and gives us much more user centered data for our capacity planning."

I asked him about the tools he was using before StatsD:

"We had a lot of Python tools already doing that job, but reworking the entire suite to work with StatsD was trivial. An hour from start to finish."

Summing up:

"Yes, I could probably cook something up similarly with Graphite, Fluentd, Sensu, Statsd and other parts, but the value for us is all that work is done. The cost for the offering over the whole estate is less than the engineering cost to maintain a pieced together solution."

A big THANKS to Martin for sharing his feedback. Internally, we've had the same experience with StatsD: Minimal setup and quick payback.

If you haven't tried StatsD yet, here are a few resources to get you started:

StatsD Intro Video

Docker Event Monitoring from scratch with StatsD

Scout StatsD Rack Gem

Give StatsD a try today! If you have any questions, contact us here, or follow us on Twitter for more StatsD news.

 

Monitoring InfluxDB with Scout

By Derek · Posted in App Monitoring

We're using InfluxDB in our new app monitoring service.

While InfluxDB hasn't reached 1.0 yet, it has loads of potential and has been holding up well during our BETA period. Don't worry, we'll talk more about InfluxDB in coming posts.

So, how are we monitoring InfluxDB performance? Here's how we get an overview of our app performance. InfluxDB is one of the categories we track:

[Screenshot: app performance overview, broken down by category]

When there's a slow request, we can dig into details, including viewing the actual InfluxDB query:

[Screenshot: slow request detail, including the InfluxDB query]

Need some InfluxDB monitoring action?

Sign up for early access and ping us at apm.support@scoutapp.com to let us know you are using InfluxDB.

 

Introducing easy StatsD with Scout

By Mark

The easiest way to unleash StatsD.

One agent. Minimal overhead. Robust language support. A unified monitoring solution for your servers and metrics.

We've added StatsD support to our monitoring agent. With Scout, you are just minutes away from StatsD-backed charts and alerts. Use StatsD to report code execution times, user signup rates, and more.
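
For example, here's roughly what that reporting looks like with the statsd-ruby gem (a generic Ruby StatsD client; the metric names and process_checkout are illustrative):

require 'statsd'

statsd = Statsd.new 'localhost', 8125

# Count an event each time it happens:
statsd.increment 'signups'

# Time a block of code; the duration is sent in milliseconds:
statsd.time('checkout.duration') { process_checkout }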

StatsD-generated metrics are first-class citizens in Scout, coexisting with every other metric Scout collects.

Don't just take our word for it, Scout customer Martin Kelly had this to say:

"It took about an hour to get all of our prod and preprod custom metrics into Scout and displayed on our dashboard. Both the client and project manager were very happy!"

Why StatsD + Scout?

Prior to today, all metrics in Scout were generated by our agent (monitoring system resource usage) or plugins (monitoring services). This works great for sampled metrics, but it's not a great fit for event-based metrics (ex: tracking user signups, response times, etc.).

StatsD is a great fit for event-based metrics, but rolling your own production-grade StatsD setup is involved. We also want to see all of our metrics (and configure alerting) from a single app.

We tested StatsD+Scout internally first, loved it, and rolled it out to customers during a preview stage. Today, StatsD support is battle-tested and ready for your metrics.

Quick tip: replacing metric logging with StatsD

First time with StatsD? Here's a tip: if you are logging a metric, it probably makes more sense to send it to StatsD.

Example:

logger.warn "Error Occurred!"

...becomes something like this (shown here with a generic Ruby StatsD client; the metric name is illustrative):

statsd.increment "errors"

...which gives you ready-to-go charts and alerting (ex: alert when the error rate exceeds 50 errors/min).

Get started with StatsD today!

Additional resources to get you started:

Questions? Comments? Contact us at support@scoutapp.com.

More updates on StatsD? Follow us on Twitter.

 

The making of app monitoring: the health dashboard

By Derek · Posted in App Monitoring

We're battle-scarred devs building the focused app monitoring service we've always wanted. We're blogging about the adventure below.

Customers telling me our app is slow? I'm looking at a response time graph.

On the front page of Hacker News? I'm looking at requests per second and response time on a graph.

Lots of things going wrong? Show me ALL the metrics.

The challenges with building a one-page dashboard of app health?

  • What's important to me today might not be tomorrow
  • I need to see all key metrics at once to ensure I'm not missing a correlation (ex: spike in response time and error rates)
  • I need to be able to magnify a metric on a chart for more detail

The first step is admitting I have a problem

We track eight key health metrics for our applications:

  • Response Time by category (time spent in Ruby, Postgres, Elasticsearch, etc.)
  • Throughput
  • Error Rate
  • Apdex (a quick sketch of the formula follows this list)
  • Capacity % (the utilization of our application worker processes)
  • App Instances (how many processes are serving our app across all of our nodes)
  • CPU Usage % (average CPU usage of the app on each node)
  • Memory Usage (average memory usage of the app on each node)
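
As promised above, here's a quick sketch of the standard Apdex formula. With a chosen response time threshold t, requests faster than t count as "satisfied" and requests between t and 4t as "tolerating":

def apdex(satisfied, tolerating, total)
  (satisfied + tolerating / 2.0) / total
end

apdex(880, 80, 1000)  # => 0.92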

So, what are some approaches to help me get an at-a-glance view of app health?

Read More →

 

Reversing the GoDaddy-ification of application monitoring

By Derek · Posted in App Monitoring

Scout is an "oops" company. We didn't build our product with the intention of turning it into a company, but we certainly can't imagine life without it today.

Scout was started out of frustration. The prospect of setting up and using a Nagios-like server monitoring solution was so terrifying that we decided to build our own. The result was a simple monitoring agent and an accompanying Rails-backed web interface, which we used to monitor our own apps. When debugging performance issues, we started sharing access to the app with our hosting provider, Rails Machine. They loved it and started using it.

We really enjoyed building the product, put a price tag on it, and over time it became our full-time thing.

That frustration is back: app monitoring

Application performance monitoring (APM) products have a tendency to evolve into a GoDaddy-like experience.

The tools for monitoring apps are continually becoming more complex and difficult to use. It's the second law of thermodynamics applied to software: there's an ever-increasing tendency toward disorder.

It's time to reset application monitoring.

From our own frustrations and those our customers have shared with us, it's clear app monitoring needs a craft-brew alternative: a polished, focused take on application monitoring. A product dedicated to solving performance issues as fast as possible, without overwhelming you with clutter.

We're building the craft brew of app monitoring

A while ago, we decided to build an app monitoring product. We've got an awesome team dedicated to it, and it's coming along fast. There are some core beliefs we're starting with:

  • Support multiple languages and frameworks. We know from experience that we're mixing together more languages and frameworks than ever before. It's key to view their performance from a single interface.
  • Easy time range diffs. The UI must be built to make it easy to compare deploys, config changes, or general trends as an application ages.
  • Context. How is performance for our highest-paying customers? Is a performance issue impacting everyone or a subset of customers? Is slowness primarily associated with one database node? Make it easy to apply the context that matters to you.
  • Aggregate what's slow. We learn a lot from investigating slow requests. Rather than paging through metrics on individual slow requests, aggregate the call stacks of slow requests together. Apply the context from above. Know with certainty that an endpoint is slow because of a specific query for X% of your customers.

Sign up for our BETA

Get yourself on our early access list. We'll be inviting folks into APM ahead of our October launch. It's a great time to help shape the direction of Scout APM.

More to come

We'll be blogging about the product dev process right here, starting with the design decisions behind our application health dashboard.


Follow us on Twitter for the highlights and sign up for early access.

 

The Curious Case of the StatsD Timer

By Mark · Posted in HowTo


Instrumenting our application with StatsD is easy, especially when we just stick to Counters and Gauges. Each of these metrics reports just a single value. When you get to Timers, however, StatsD steps up its game and returns eight metrics.

So let's explore the curious case of the timing metric. What do all these metrics mean? How can we use this for instrumenting our application?
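
To set the stage, here's a minimal sketch of sending timings with the statsd-ruby gem (the metric name and process_job are illustrative):

require 'statsd'

statsd = Statsd.new 'localhost', 8125

# Time a block; the measured duration is sent in milliseconds:
statsd.time('worker.process_job') { process_job }

# Or report a duration you measured yourself:
statsd.timing 'worker.process_job', 320

From that single stream of timings, the StatsD server derives a whole family of values on each flush interval, including the count, mean, upper, lower, and sum, plus percentile variants.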

Read More →

 
