Rails Performance and the root of all evil

May 09 · By Sudara

Donald Knuth wrote an often-quoted paper in the 1970s that is still referenced when talking about performance in web apps today.

Premature optimization is the root of all evil.

In my line of work, it is sometimes invoked as a sort of apology, an excuse for why more time wasn't spent on performance: "This sucks, but at least we didn't... prematurely optimize!"

My job is to fix performance issues and help other developers write well-performing code, so I'm not a huge fan of this quote. We shouldn't let an out-of-context quote guide our development strategy, no matter the source!

When things bother me, I like to stop to take a closer look. Here's the full quote:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

I translate this to: "When optimizing, don't get distracted by things that don't matter."

I would never translate this to: "Hold off on optimizing things until it really hurts."

Identify What Matters

Some developers see performance work like they see taxes. It's the annoying obligation that comes after code has been written. The fun is over, your bad decisions have come back to haunt you, and you are stuck fiddling with technical trivialities that you wish would just go away.

Other devs stay away from performance work, remaining blissfully ignorant. They assume it is difficult, mysterious and elusive.

But what do we mean by performance work? I break it into 2 types:

  1. "Baking in" performance with best practices and experience. In Rails, this would be proactively using pagination, having your db indexes in order, having a good idea what SQL Active Record is firing, using .includes where necessary, keeping track of where caching is going to be useful, etc. Not baking in these best practices along the way is akin to taking a mortgage out on your app. You are instantly creating debt and your options will be to "pay it off later" or bankruptcy.

  2. Measuring, identifying and handling performance issues. This is what I call "the problem of figuring out what the problem is." This is the work that can feel difficult, cumbersome and mysterious.
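To make the first kind of work concrete, here is a minimal plain-Ruby sketch of why eager loading (what `.includes` does in Active Record) is worth baking in. It does not use Rails at all: `FakeDb` and its methods are hypothetical stand-ins that simply count how many "queries" each approach issues.

```ruby
# A tiny in-memory stand-in for a database that counts every "query" it serves.
# FakeDb is hypothetical; it only illustrates the N+1 pattern, not Active Record itself.
class FakeDb
  attr_reader :query_count

  def initialize(posts, authors)
    @posts = posts
    @authors = authors
    @query_count = 0
  end

  # One query that returns every post.
  def all_posts
    @query_count += 1
    @posts
  end

  # One query per author lookup.
  def find_author(id)
    @query_count += 1
    @authors.find { |author| author[:id] == id }
  end

  # One query that returns all requested authors at once.
  def find_authors(ids)
    @query_count += 1
    @authors.select { |author| ids.include?(author[:id]) }
  end
end

authors = [{ id: 1, name: "Ada" }, { id: 2, name: "Grace" }]
posts = [
  { title: "First", author_id: 1 },
  { title: "Second", author_id: 2 },
  { title: "Third", author_id: 1 }
]

# N+1: one query for the posts, then one more query for each post's author.
lazy_db = FakeDb.new(posts, authors)
lazy_names = lazy_db.all_posts.map { |post| lazy_db.find_author(post[:author_id])[:name] }

# Eager: one query for the posts, one query for all of their authors.
eager_db = FakeDb.new(posts, authors)
eager_posts = eager_db.all_posts
author_ids = eager_posts.map { |post| post[:author_id] }.uniq
authors_by_id = eager_db.find_authors(author_ids).to_h { |author| [author[:id], author] }
eager_names = eager_posts.map { |post| authors_by_id[post[:author_id]][:name] }

puts "lazy:  #{lazy_db.query_count} queries"
puts "eager: #{eager_db.query_count} queries"
```

With three posts the difference is small, but the lazy version grows by one query per row while the eager version stays at two.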

It's much easier to build performance into an app than it is to hunt down random performance problems as they bubble up. Baking in optimization is not what Knuth means by "premature." Let's see a bit more context around the Knuth quote:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.

The context here is how to approach optimization, not whether to optimize or not. I would summarize his advice as:

  1. Don't be complacent (by avoiding optimization altogether)
  2. Once optimizing, it's easy to spend hours optimizing things that don't matter (this is evil)
  3. Instead, immediately identify the critical 3% that requires optimization

Measure Twice, Cut Once

Knuth is on a roll, so let's give him a bit more page time:

It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail.

Got that? Knuth wants us to measure and discover where the problems are, not guess at the issues and dive right into trying to fix them.
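As a rough illustration of measuring before guessing, the sketch below times a few labeled chunks of work with Ruby's standard `Benchmark` module and reports the slowest one. The step names and the `sleep` calls are made-up stand-ins for real parts of a request, and `time_steps` is a hypothetical helper, not part of any library.

```ruby
require "benchmark"

# Time each labeled callable once and return { label => seconds }.
# time_steps is a hypothetical helper written for this example.
def time_steps(steps)
  steps.transform_values { |work| Benchmark.realtime { work.call } }
end

# Assumed stand-ins for parts of a request; sleep simulates the work.
steps = {
  "load records"     => -> { sleep 0.05 },
  "build view model" => -> { sleep 0.005 },
  "render template"  => -> { sleep 0.01 }
}

timings = time_steps(steps)
slowest = timings.max_by { |_label, seconds| seconds }.first

timings.each do |label, seconds|
  puts format("%-18s %6.1f ms", label, seconds * 1000)
end
puts "Start with: #{slowest}"
```

A single run like this is noisy; in practice you would repeat the measurement or lean on an APM, but even a crude number beats an intuition.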

I'm a huge fan of using my gut to guide investigation into issues — but it's never worth refactoring or committing code until there is tangible proof of where the problem is. It's normal to be wrong, or for the story to be more complicated than you expect.

This is another reason why performance issue work might sound "hard." Endless hours spent down dead ends chasing "leads", fiddling around with things, coding and coding and coding...

Here are other things I see hours wasted on that I recommend skipping:

  • Calling a meeting to discuss where the slowdowns might be (unless someone is bringing data/answers)
  • Prototyping an abstraction to bypass a performance problem (or to preempt a future one)
  • Blaming the framework or getting sidetracked by the framework's inherent compromises
  • Assuming the scope of the problem is big or small before identifying where the problem is

Here is what I find to be the quickest path towards issue resolution:

  • Go as big picture as possible and start collecting evidence
  • Look at existing measurements from APM services like Scout, New Relic, Skylight
  • You can also work locally, using logs or other tools
  • Keep narrowing down the problem until you have one line or area at fault
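If you go the local-logs route, a few lines of Ruby are enough to collect first evidence. This sketch assumes log lines shaped like the `Processing by ...` and `Completed ... in Nms` entries Rails writes in development, and assumes requests are not interleaved (a single local process). `SAMPLE_LOG` is invented data and `average_ms_by_action` is a hypothetical helper.

```ruby
# Invented sample lines, in the shape assumed for a Rails development log.
SAMPLE_LOG = <<~LOG
  Processing by PostsController#index as HTML
  Completed 200 OK in 412ms (Views: 95.3ms | ActiveRecord: 301.2ms)
  Processing by UsersController#show as HTML
  Completed 200 OK in 38ms (Views: 30.1ms | ActiveRecord: 4.0ms)
  Processing by PostsController#index as HTML
  Completed 200 OK in 388ms (Views: 90.0ms | ActiveRecord: 280.5ms)
LOG

# Pair each "Processing by" line with the next "Completed" line and
# return { "Controller#action" => average total milliseconds }.
def average_ms_by_action(log_text)
  durations = Hash.new { |hash, action| hash[action] = [] }
  current_action = nil

  log_text.each_line do |line|
    if (started = line.match(/Processing by (\S+) as/))
      current_action = started[1]
    elsif current_action && (done = line.match(/Completed \d+ .*? in (\d+)ms/))
      durations[current_action] << done[1].to_i
      current_action = nil
    end
  end

  durations.transform_values { |ms| ms.sum.to_f / ms.size }
end

averages = average_ms_by_action(SAMPLE_LOG)
averages.sort_by { |_action, ms| -ms }.each do |action, ms|
  puts format("%-24s %7.1f ms", action, ms)
end
```

The slowest action at the top of that list is where the narrowing down begins.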

If you don't have enough measurement around the problem, it's easy to make more. For example, once you know which action you care about, add rack-mini-profiler to your Gemfile, hit a few pages, and immediately get more detail.

If more than an hour or two has gone by and you have no idea where the problem is, don't despair. You can always try "The Rude MacGyver." Delete half of the code path and check whether the problem disappeared. If it did, you know the problem is in the removed code, so put half of that back in and evaluate again. Repeat until you have narrowed it down to a line or two to blame. In a Rails app you can rip out partials or the view entirely, or comment out parts of the controller, old helper methods, before filters, etc.
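The halving trick is just a binary search over the code path. Here is a minimal sketch of the idea, assuming there is exactly one culprit and that you have some repeatable way to check whether the problem still shows up. `bisect_culprit` and the step names are hypothetical; in a real app the "check" is you commenting code out and reloading the page.

```ruby
# Narrow a list of steps down to the single one that causes the problem.
# The block receives the steps that are still enabled and must return true
# when the problem still reproduces with only those steps in place.
# Assumes exactly one culprit exists somewhere in `steps`.
def bisect_culprit(steps)
  candidates = steps
  while candidates.size > 1
    kept = candidates.first(candidates.size / 2)
    removed = candidates.drop(kept.size)
    # Problem still there? The culprit is in what we kept.
    # Problem gone? The culprit is in what we removed.
    candidates = yield(kept) ? kept : removed
  end
  candidates.first
end

# Hypothetical pieces of a Rails request, in the order they run.
steps = %i[before_filter load_records helper_method render_partial render_layout]

# Stand-in for "reload the page and see if it is still slow".
culprit = bisect_culprit(steps) { |enabled| enabled.include?(:render_partial) }
puts "Culprit: #{culprit}"
```

Each round halves the search space, so even a long code path takes only a handful of reloads.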

Evil Optimizations

I’ve seen many underperforming Rails apps. In most cases, I walk away certain that the app, the business and the customers would be better off if performance issues were treated more proactively vs. reactively.

The favorite part of my job is when developers I work with agree and code more defensively, catching performance issues before they are introduced into the codebase. However, along with this increased attention, a new risk emerges, and it is the one Knuth is actually warning us about: eagerly optimizing "all the things" without measurement.

Here's a list of common "optimizations" I regularly see Rails devs tempted to chase down:

  • Spending days learning about and profiling Ruby memory leaks (Instead of measuring app code and rewriting some poorly crafted Active Record)
  • Figuring out how to display 5000 rows of something on a page. It should be possible! (It is, but it will always be "lipstick on a pig")
  • Evaluating whether an alternative database is necessary to get decent performance on what is a straightforward CRUD app (You can run, but you can't hide!)
  • Building abstractions to work around performance problems (A great way to compound vs. reduce tech debt)
  • Being distracted by which Ruby object type might be allocating more memory (Spoiler Alert: This probably isn't the reason why the Rails processes are over 800MB...)
  • Obsessing over rewriting parts of the app in Rust or C (Would a simple in-place refactor get us 95% of the way there? Or is this a number-crunching part of the app that warrants the big guns?)

Disclaimer: You may like these challenges! By all means, geek out, measure and explore. But most modern web apps do not require these types of optimizations. Default to spending time evaluating database relationships and Active Record usage, figuring out if the UI is sane, or measuring a custom abstraction that someone in the company built on top of Rails 3 years ago.

Non-Evil Optimizations

If your app has customers and is growing, and you don't proactively spend time optimizing popular features, heads up: it's likely you will eventually get complaints, downtime, and potentially a loss of business as your dataset and usage grow. It's never premature to "drive defensively." Optimizations like these are probably never going to be a waste of your time:

  • Ensuring that a popular action stays under 300ms for 95% of users
  • Implementing caching for a page that is hit often and doesn't change regularly
  • Increasing RAM as your database grows in size
  • Ensuring that Rails instances stay at 50-70% capacity, so you have headroom for a traffic spike
  • When expanding or adding features, evaluating the performance of the existing feature to see if it needs cleanup/refactoring first
  • Monitoring your app regularly and fixing low-hanging fruit like N+1 queries, lack of pagination, etc.
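For the first item in that list, "under 300ms for 95% of users" translates into checking the 95th percentile of response times rather than the average. The sketch below uses the nearest-rank method on invented sample data; `percentile` is a hypothetical helper, and an APM service would normally compute this for you.

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# `percent` percent of all samples are less than or equal to it.
def percentile(samples, percent)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (percent * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

BUDGET_MS = 300

# Invented response times (in ms) for one popular action.
response_times_ms = [120, 95, 110, 130, 140, 105, 90, 150, 160, 115,
                     125, 135, 145, 155, 100, 170, 180, 210, 250, 900]

p95 = percentile(response_times_ms, 95)
status = p95 <= BUDGET_MS ? "within budget" : "over budget"
puts "p95 = #{p95}ms (#{status})"
# prints "p95 = 250ms (within budget)"
```

The single 900ms outlier would drag an average upward, yet 95% of these requests still finish within the budget, which is why the percentile is the more useful target.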

Nothing But the Knuth

My advice is to take a reasonable amount of time to bake in performance as you build. When trouble arises, figure out what the critical 3% is via measurement. Do this before allowing yourself to worry about esoteric topics or starting to shuffle code around.


Sudara Williams is a full-stack kinda guy who enjoys chewing on the tough bits at Rubytune, a Rails performance consultancy. He wrote his first (and last) programming book in the 6th grade. He built and managed an IT support business in Santa Fe, then returned to app development and fell in love with Ruby in 2005. He later worked at Engine Yard, helping clients such as GitHub, Groupon and New Relic do the hard stuff. He runs Ramen Music and alonetone, the open source artist platform.
