Scout's top-secret 4-point observability plan

By Derek

Observability: the degree to which you can ask new questions of your system without having to ship new code or gather new data.

Above is my slightly modified definition of observability, mostly stolen from Charity Majors in Observability: A Manifesto.

Observability is increasingly important. Modern apps and services are more resilient and fail in soft, unpredictable ways. These failures are too far on the edges to appear in charts. For example, an app may perform dramatically worse for one specific user who happens to have a lot of associated database records. This would be hard to identify on a response time chart for an app doing reasonable throughput.
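To make the outlier example concrete, here is a minimal sketch in Python. The traffic data, the user names, and the `slowest_users` helper are all hypothetical and are not part of any Scout API. It only illustrates why an aggregate hides a single slow user and why grouping by user brings that user back into view.

```python
from collections import defaultdict
from statistics import mean


def slowest_users(requests, top=1):
    """Rank users by mean response time.

    `requests` is an iterable of (user_id, response_ms) pairs.
    Returns the `top` slowest users as (user_id, mean_ms) tuples.
    """
    by_user = defaultdict(list)
    for user_id, ms in requests:
        by_user[user_id].append(ms)
    ranked = sorted(by_user.items(), key=lambda kv: mean(kv[1]), reverse=True)
    return [(user, mean(times)) for user, times in ranked[:top]]


# Hypothetical traffic: 999 ordinary requests at 100 ms each,
# plus one request from an account with a lot of database records.
requests = [(f"user-{i}", 100) for i in range(999)]
requests.append(("big-account", 5000))

# The aggregate a response time chart would plot.
overall = mean(ms for _, ms in requests)

# The per-user breakdown that answers "who is this slow for?".
worst = slowest_users(requests)
```

With this made-up data the overall mean moves from 100 ms to roughly 105 ms, which would be invisible on a chart, while the per-user ranking puts `big-account` at the top with 5000 ms.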

However, understanding observability is increasingly confusing. Sometimes observability appears as an equation: observability = metrics + logging + tracing. If a vendor does those three things in a single product, they've made your system observable.

If observability is just metrics, logging, and tracing, that's like saying usability for a modern app is composed of a mobile app, a responsive web app, and an API. Authorize.net has those things. So does Stripe. One is clearly more usable than the other.

I think it's more valuable to think about how your existing monitoring tools can be adapted to ask more questions. There's significant room for this in standalone metrics, logging, and tracing tools.

At Scout, we've been thinking about how we can help folks ask more performance-related questions about their apps. We're not building a custom metrics ingestion system. We're not adding a structured logging service. We're focusing on our slice of the world.

Below I'll share our top-secret observability plan.

Read More →

 

Introducing Django & Flask Performance Monitoring

By Derek

7/31/18 Update: Python Monitoring is now GA and supports Django, Flask, Celery, Pyramid, Bottle and more.

GitHub's State of the Octoverse 2017 revealed that Python is now the second-most popular language on GitHub, with 40 percent more pull requests opened in 2017. We couldn't help but notice. Today, we're excited to add Python to our existing Rails Monitoring and Elixir Monitoring agents.

To start, we're monitoring Django and Flask applications and their SQL queries, views, and templates, but our library coverage will increase as we near general availability. You can follow along and suggest what you'd like to see next on GitHub.

Scout isn't the first company to monitor Python applications. What's special about Scout is the focus. We've put an incredible effort into surfacing performance bottlenecks for you. This includes:

  • Identifying expensive N+1 database queries
  • Identifying slow queries
  • Finding the source(s) of memory bloat
  • Understanding outliers (ex: why is this endpoint slow for one user?)

...and more.
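For readers unfamiliar with the first item in that list, the following is a small, self-contained illustration of an N+1 query pattern using Python's built-in `sqlite3` module. The schema, the `QueryCounter` wrapper, and the function names are assumptions made up for this sketch. It does not show how Scout detects N+1 queries, only what the pattern looks like and why it is expensive.

```python
import sqlite3


class QueryCounter:
    """Wraps a sqlite3 connection and counts statements sent through it."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)


def build_db():
    """Create an in-memory database with 5 authors and 5 posts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
    )
    for i in range(1, 6):
        conn.execute("INSERT INTO authors VALUES (?, ?)", (i, f"author-{i}"))
        conn.execute("INSERT INTO posts VALUES (?, ?, ?)", (i, i, f"post-{i}"))
    return conn


def titles_n_plus_one(db):
    """One query for the posts, then one more query per post for its author."""
    rows = db.execute("SELECT title, author_id FROM posts ORDER BY id").fetchall()
    result = []
    for title, author_id in rows:
        name = db.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()[0]
        result.append((title, name))
    return result


def titles_joined(db):
    """The same data fetched with a single JOIN."""
    return db.execute(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    ).fetchall()


naive = QueryCounter(build_db())
joined = QueryCounter(build_db())
naive_rows = titles_n_plus_one(naive)
joined_rows = titles_joined(joined)
```

Both functions return the same rows, but the first issues six queries for five posts (one plus one per post) while the second issues one. The gap grows linearly with the number of rows, which is why the pattern tends to show up only for users with a lot of data.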

 

Rollbar+Scout: a legit New Relic alternative

By Jason

The New Relic price tag goes up dramatically as your server footprint grows. This might not be an issue if you are utilizing New Relic’s full product suite, but what if you just care about error and performance monitoring?

In that case, there's a solution that offers richer features as an alternative to New Relic. When you combine Rollbar (errors) and Scout (performance), you're choosing two best-of-breed, focused products that actually play well together.

First, let’s see what’s special about Rollbar’s error monitoring capabilities. Then, we’ll show how to combine Rollbar and Scout to give a unified app stability experience.

Read More →

 

Setting up a Rails app for CodeBuild, CodeDeploy, and CodePipeline on AWS

By Derek

If you’ve followed along with our previous episodes, we’ve covered many different aspects of setting up a production service. We’ve used many different products to simplify the day-to-day operations of running and maintaining an application.

We’ve used Scout for monitoring our application, LogDNA for aggregating our logs, HoneyBadger for our exception handling, and a host of AWS services for running our services, managing our SSL certs, hosting our Docker images, etc.

But one thing we haven’t focused on tidying up yet is one of the places we spend most of our time: building features, merging those features, running tests, and deploying that code.

In today’s episode, we’ll be talking about how to use a few AWS services, including CodeBuild, CodeDeploy, and CodePipeline, to streamline getting features in front of our customers.

Read More →

 

5 traits of teams that make on-call less terrible for developers

By Derek

Over the past two weeks, there's been considerable discussion on whether developers should be on-call.

I understand the frustration of the anti on-call party. If you go to school to be a doctor, you know that being on-call is likely in your future. You didn't know that being on-call would be a part of your job when you started writing code.

Like the anti on-call party, I find no immediate joy in being on-call. But today, you're swimming upstream if you are a developer and don't want to be on call. In Who owns on-call?, Increment surveyed over thirty industry leaders about their on-call rotations. All but one (Slack) had developers on call, but that is changing:

Crowley says that they’ve recently started to see scalability problems with the old way of operations, however, which led Slack to create a secondary on-call rotation full of developers; software and performance bugs, he says, are becoming much more common than low-level infrastructure problems—bugs that only the development teams know how to fix.

To me, it's a no-brainer: if the root cause of most incidents is hardware and network partition failures, then it makes sense for operations to be on-call. They are the ones familiar with those systems. However, if the majority of problems lie within code, a developer needs to write the fix. The underlying infrastructure our apps sit on top of is becoming remarkably more reliable, which means developers are gaining more responsibility (that's a positive spin, right?).

Over the past decade, I've primarily worked on small, fast-moving development teams. I've always valued my time away from the office and believe our developers should too. Developers have always been a part of these on-call rotations and I haven't hated this experience. Below are five traits I've seen from teams that have a healthy relationship with being on-call:

Read More →

 

Deploying a Faktory worker to AWS Fargate

By Derek

Looking for a fresh, 2018 approach to deploying a Rails app to AWS? We've partnered with DailyDrip on a series of videos to guide you through the process. We're covering how to Dockerize a Rails app, AWS Fargate, logging, monitoring, setting up load balancing, SSL, CDN, and more.

In the previous post of this series, we deployed the Faktory service to AWS Fargate and created our first background job. Today we'll set up a Ruby worker service to pull jobs from the Faktory server and execute them.

Read More →

 
