Once your Rails app begins seeing consistent traffic, you're bound to have slow SQL queries. While PostgreSQL and MySQL can log slow queries, it's difficult to glean actionable information from this raw stream. The slow query logs lack application context: which line of code is generating the query? Is it slow all of the time, or just some of the time? Which controller-action or background job is the caller?
The Django ORM makes it easy to fetch data, but there's a downside: it's easy to write inefficient queries as the number of records in your database grows.
One area where the ease of writing queries can bite you is N+1 queries. Expensive N+1 queries go undiscovered in small development databases. Finding expensive N+1 queries is an area where Scout is particularly helpful.
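To make the N+1 pattern concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module rather than the Django ORM, so it runs without a Django project. The `author` and `book` tables and the `run` helper are hypothetical and exist only for illustration. In Django itself, this pattern is usually addressed with `select_related` or `prefetch_related`.

```python
import sqlite3

# Hypothetical schema: 50 authors, one book each, in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
""")
conn.executemany("INSERT INTO author (id, name) VALUES (?, ?)",
                 [(i, f"author-{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO book (title, author_id) VALUES (?, ?)",
                 [(f"book-{i}", i) for i in range(1, 51)])

query_count = 0

def run(sql, params=()):
    """Execute a query and count it, standing in for an ORM's query log."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

def titles_n_plus_one():
    # 1 query for the books, then 1 more per book to look up its author.
    rows = run("SELECT title, author_id FROM book ORDER BY id")
    return [(title, run("SELECT name FROM author WHERE id = ?", (aid,))[0][0])
            for title, aid in rows]

def titles_joined():
    # A single JOIN fetches the same data in one round trip.
    return run("SELECT b.title, a.name FROM book b "
               "JOIN author a ON a.id = b.author_id ORDER BY b.id")

query_count = 0
slow = titles_n_plus_one()
n_plus_one_queries = query_count

query_count = 0
fast = titles_joined()
joined_queries = query_count
```

Both functions return the same rows, but the first issues one query per book on top of the initial query. With 50 rows the difference is easy to miss in development; the query count grows with the table.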
Prior to adding Python performance monitoring, we'd written monitoring agents for Ruby and Elixir. Our Ruby and Elixir agents had duplicated much of their code between them, and we didn't want to add a third copy of the agent-plumbing code. The overlapping code included things like JSON payload format, SQL statement parsing, temporary data storage and compaction, and a number of internal business logic components.
This plumbing code is about 80% of the agent code! Only 20% is the actual instrumentation of application code.
So, starting with Python, our goal became "how do we prevent more duplication?" To do that, we decided to split the agent into two components: a language agent and a core agent. The language agent is the Python component, and the core agent is a standalone executable that contains most of the shared logic.
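The split can be pictured with a rough sketch. Everything here is an assumption for illustration: the class names, the JSON message shape, and the in-process `send` callable are hypothetical, and this excerpt does not describe how the real language agent and core agent communicate. The point is only that the language side stays thin while the shared logic lives in one place.

```python
import json
from collections import defaultdict

class LanguageAgent:
    """Hypothetical thin layer: only records what the application did."""

    def __init__(self, send):
        # `send` is any callable that delivers one serialized message.
        self.send = send

    def record_span(self, operation, duration_ms):
        self.send(json.dumps({"operation": operation,
                              "duration_ms": duration_ms}))

class CoreAgent:
    """Hypothetical shared component: owns parsing and aggregation."""

    def __init__(self):
        self.totals = defaultdict(float)

    def receive(self, message):
        span = json.loads(message)
        self.totals[span["operation"]] += span["duration_ms"]

# Wire the two together in-process; a real deployment would cross a
# process boundary, since the core agent is a standalone executable.
core = CoreAgent()
agent = LanguageAgent(send=core.receive)
agent.record_span("SQL/Query", 12.5)
agent.record_span("Template/Render", 4.0)
agent.record_span("SQL/Query", 7.5)
totals = dict(core.totals)
```

Under this arrangement, a new language would only need its own `LanguageAgent` equivalent; the aggregation code is written once.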
Browser development tools, like Chrome DevTools, are vital for debugging client-side performance issues. However, server-side performance metrics have been outside the browser's reach.
That changes with the Server Timing API. Supported by Chrome 65+, Firefox 59+, and more browsers, the Server Timing API defines a spec that enables a server to communicate performance metrics about the request-response cycle to the user agent. When you use our open-source Ruby or Elixir server timing libraries, you'll see a breakdown of server-side database queries, view rendering, and more:
Combined with the already strong client-side browser performance tools, this paints a full picture of web performance.
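For a sense of what the server sends, the spec defines a `Server-Timing` response header made up of comma-separated metrics, each with optional `dur` (duration in milliseconds) and `desc` (description) parameters. The sketch below builds such a header value in plain Python; the `server_timing_header` helper and the metric names are hypothetical and are not part of Scout's libraries.

```python
def server_timing_header(metrics):
    """Build a Server-Timing header value.

    `metrics` is a list of (name, duration_ms, description) tuples;
    description may be None.
    """
    parts = []
    for name, duration_ms, description in metrics:
        entry = f"{name};dur={duration_ms:.1f}"
        if description:
            # Escape backslashes and quotes for the quoted-string form.
            escaped = description.replace("\\", "\\\\").replace('"', '\\"')
            entry += f';desc="{escaped}"'
        parts.append(entry)
    return ", ".join(parts)

header_value = server_timing_header([
    ("db", 53.2, "Database queries"),
    ("view", 12.14, "View rendering"),
    ("cache", 0.0, None),
])
```

A framework would attach this as `Server-Timing: <header_value>` on the response, and supporting browsers then list each metric in the network panel's timing view.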
Get started with Scout's server timing libraries:
A Scout account isn't required, but it does make investigating slow response times more fun.
Observability: the degree to which you can ask new questions of your system without having to ship new code or gather new data.
Observability is increasingly important. Modern apps and services are more resilient and fail in soft, unpredictable ways. These failures are too far on the edges to appear in charts. For example, an app may perform dramatically worse for one specific user that happens to have a lot of associated database records. This would be hard to identify on a response time chart for apps doing reasonable throughput.
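A small worked example shows why such a failure hides in aggregate charts. The request log below is made up: 9,990 requests at 100 ms spread across 99 users, plus 10 requests at 2,100 ms for one hypothetical heavy user.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical request log: (user_id, response_time_ms).
requests = [(f"user-{i % 99}", 100.0) for i in range(9990)]
requests += [("user-heavy", 2100.0)] * 10

# The aggregate view: one number for the whole app.
overall_mean = mean(ms for _, ms in requests)

# The per-user view: group response times, then find the worst mean.
by_user = defaultdict(list)
for user_id, ms in requests:
    by_user[user_id].append(ms)

slowest_user, slowest_mean = max(
    ((user, mean(times)) for user, times in by_user.items()),
    key=lambda pair: pair[1],
)
```

The overall mean moves from 100 ms to 102 ms, which is easy to overlook on a chart, while the per-user breakdown shows one user waiting 2,100 ms on every request.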
However, understanding observability is increasingly confusing. Sometimes observability appears as an equation: observability = metrics + logging + tracing. If a vendor does those three things in a single product, they've made your system observable.
If observability is just metrics, logging, and tracing, that's like saying usability for a modern app is composed of a mobile app, a responsive web app, and an API. Authorize.net has those things. So does Stripe. One is clearly more usable than the other.
I think it's more valuable to think about how your existing monitoring tools can be adapted to ask more questions. There's significant room for this in standalone metrics, logging, and tracing tools.
At Scout, we've been thinking about how we can help folks ask more performance-related questions about their apps. We're not building a custom metrics ingestion system. We're not adding a structured logging service. We're focusing on our slice of the world.
Below I'll share our top-secret observability plan.
GitHub's State of the Octoverse 2017 revealed that Python is now the second-most popular language on GitHub, with 40 percent more pull requests opened in 2017. We couldn't help but notice. Today, we're excited to add Python to our existing Rails Monitoring and Elixir Monitoring agents.
Our Python support is currently in tech preview: this means it is free to use, but also brand new and not yet feature-equivalent to our Ruby and Elixir monitoring agents. To start, we're monitoring Django and Flask applications and their SQL queries, views, and templates, but our library coverage will increase as we near general availability. You can follow along and suggest what you'd like to see next on GitHub.
Scout isn't the first company to monitor Python applications. What's special about Scout is the focus. We've put an incredible effort into surfacing performance bottlenecks for you. This includes:
- Identifying expensive N+1 database queries
- Identifying slow queries
- Finding the source(s) of memory bloat
- Understanding outliers (ex: why is this endpoint slow for one user?)