Making Scout feel young again: our 4-part tonic
Scout is no longer a puppy: in dog years, he’s old enough to drink, get drafted, and rent a car. During that time, cruft gathered around the edges of our server infrastructure.
We’ve been using a hodgepodge of server hardware, some performing multiple roles, some not, all individually configured and tuned. Small changes to our stack seemed to involve a lengthy checklist. Our staging environment didn’t mirror production: what happened on staging didn’t always happen on production. Finally, database changes were painful.
We wanted to get lean in the right places: could we make the young adult Scout as easy to manipulate as the baby Scout?
7.weeks.ago we followed a four-part process to get there.
1. Homogeneous Servers
In our old setup, one app server had 8 CPUs and 2 GB of memory. Another had 4 CPUs and 4 GB of memory. Another had 2 CPUs and 2 GB of memory. The varying hardware profiles had a number of disadvantages:
- Complex Tuning: Each of our servers needed to be tuned individually (the number of Passenger processes, the size of the Apache worker queue, etc.). Every time we added an app server, we had to rebalance the traffic our load balancer, HAProxy, sent to the app servers, since each could handle a different amount. It was difficult to keep track of each server’s configuration.
- Debugging Performance Problems: We were conducting a poorly run science experiment: with the individualized configuration, varying hardware, and differing amounts of traffic each app server was handling, issues might pop up on one server but not another.
Now, each of our servers performs a single role, and every server that performs that role has the same hardware profile. We only need to remember one configuration profile per role. Testing performance enhancements is easier: we can try something on one server and be reasonably confident the impact will be the same on others with the same role.
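To make the rebalancing headache concrete, here is a minimal sketch. It is not our real tuning logic: the 250 MB-per-process figure, the two-processes-per-CPU cap, and the function names are all assumptions made up for illustration. Only the three old hardware profiles come from the description above.

```python
# Hypothetical sketch: per-server tuning for mixed vs. identical hardware.
# The memory-per-process figure and the per-CPU cap are assumptions.

MB_PER_PROCESS = 250     # assumed memory footprint of one app process
PROCESSES_PER_CPU = 2    # assumed cap on processes per CPU


def max_processes(cpus, memory_gb):
    """Processes a server can run: the smaller of the memory and CPU limits."""
    by_memory = (memory_gb * 1024) // MB_PER_PROCESS
    by_cpu = cpus * PROCESSES_PER_CPU
    return min(by_memory, by_cpu)


def balancer_weights(servers):
    """Give each server a load-balancer weight equal to its process count."""
    return {name: max_processes(cpus, mem) for name, (cpus, mem) in servers.items()}


# The three old app servers described above: (CPUs, GB of memory).
old_fleet = {"app1": (8, 2), "app2": (4, 4), "app3": (2, 2)}
# A homogeneous fleet: every app server shares one (assumed) profile.
new_fleet = {"app1": (4, 4), "app2": (4, 4), "app3": (4, 4)}

old_weights = balancer_weights(old_fleet)
new_weights = balancer_weights(new_fleet)
print(old_weights)
print(new_weights)
```

With mixed hardware, every new server means recomputing its limits and its share of traffic. With identical servers, the weights are all equal and adding a server changes nothing for the others.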
2. Configuration Management with Moonshine
Like an old married couple debating the merits of some unfamiliar but oft-lauded item, Andre and I were reluctant to use a configuration management tool to manage our infrastructure. Neither of us was excited about performing the plumbing work involved in documenting our setup in code.
Our host, Rails Machine, said they’d do the configuration for us using Moonshine, their Puppet-based solution for managing server infrastructure. The process went something like this:
- We gave Rails Machine access to Scout’s repository on GitHub.
- Rails Machine forked the repo, creating a “moonshine” branch.
- Rails Machine wrote the infrastructure code.
- We met with Rails Machine over Campfire, trying Scout deploys on the new infrastructure.
- With working Moonshine examples in our code base to refer to, we configured some bits on our own. Rails Machine reviewed our commits.
- We merged the “moonshine” branch into “master”. Complete!
We were happy paying Rails Machine to Moonshine Scout: it’s their area of expertise. It’s their passion: they wrote Moonshine. It’s also much easier for Rails Machine to support us now: their support staff can look at Scout’s Moonshine configuration as the single blueprint of our infrastructure. If they need to make adjustments, they can do that by submitting a pull request.
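The “single blueprint” idea can be sketched in a few lines. To be clear, this is not Moonshine’s or Puppet’s API. The role names, setting names, and values below are invented for the example. It only illustrates the declarative approach: describe the desired state once per role, then work out what a given server must change to match it.

```python
# Hypothetical illustration of declarative configuration; not Moonshine's API.
# Role names, setting names, and values are invented for the example.

BLUEPRINT = {
    "app": {"passenger_max_pool_size": 8, "apache_max_clients": 150},
    "db": {"innodb_buffer_pool_size_mb": 2048},
}


def changes_needed(role, current):
    """Return the settings that must change for a server to match its role."""
    desired = BLUEPRINT[role]
    return {key: value for key, value in desired.items() if current.get(key) != value}


def converge(role, current):
    """Apply the blueprint to a server's settings; running it twice is a no-op."""
    updated = dict(current)
    updated.update(changes_needed(role, current))
    return updated


# A server whose pool size has drifted from the blueprint.
drifted = {"passenger_max_pool_size": 6, "apache_max_clients": 150}
fixed = converge("app", drifted)
print(changes_needed("app", drifted))  # only the drifted setting is listed
print(changes_needed("app", fixed))    # nothing left to change
```

Because the blueprint lives in the repository, a change to it is just a commit that someone else can review.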
3. A proper staging environment
Our previous staging environment did not resemble our production environment: the entire stack was on one server. Now, it also has a homogeneous setup that mirrors production but with lower-powered servers. This mirrored, lower-powered setup has a number of advantages:
- A testbed for configuration changes (important with Moonshine)
- Easier load testing: we can get production-like loads on our staging servers with far less traffic
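The load-testing point is simple arithmetic. Here is a rough sketch, assuming load scales roughly linearly with capacity (it never does exactly). The request rates and CPU counts are made-up numbers, not our real ones.

```python
# Hypothetical numbers; assumes load scales roughly linearly with capacity.

def staging_request_rate(production_rate, production_capacity, staging_capacity):
    """Request rate that loads staging as heavily as production_rate loads production."""
    return production_rate * staging_capacity / production_capacity


# Assumed example: production sees 1,000 req/s across 16 CPUs in total,
# while the staging servers have 4 CPUs in total.
rate = staging_request_rate(1000, production_capacity=16, staging_capacity=4)
print(rate)
```

Under those assumptions, a quarter of the production traffic puts staging under comparable strain, which is a lot easier to generate from a load-testing tool.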
4. Multi-Master Replication Manager for MySQL
When we were young, we could run ALTER TABLE … on a whim: not anymore. Large schema changes are slow and can have a dramatic impact on Scout’s performance while they run. To give us more flexibility to tune MySQL, Rails Machine added Multi-Master Replication Manager (MMM) to our MySQL stack. This dramatically improved our ability to tune MySQL in two key ways:
- A tuning test bed: we can easily try changes on our database reader first, make it writer, and ensure things are working. If things go south, it’s just a single command to return to the previous writer.
- Zero downtime schema changes (almost): One of the big headaches of a decently-sized web app is making large database schema changes. These can take a while on a live database server, either locking tables or slowing down database performance considerably. With MMM, you can make large changes that result in almost no downtime. The almost? You’ll lose a handful of requests when the writer role changes.
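The order of operations for an almost-zero-downtime schema change can be sketched as a small simulation. It doesn’t talk to MySQL or MMM: the class, the method names, and the server names db1 and db2 are hypothetical. It also assumes the schema change is applied to one server at a time and isn’t replicated to the other while it runs.

```python
# Hypothetical simulation of the role swap; it does not talk to MySQL or MMM.
# Assumes each schema change is applied to one server at a time.

class Pair:
    """Two masters replicating to each other; exactly one holds the writer role."""

    def __init__(self):
        self.schema = {"db1": 1, "db2": 1}  # schema version on each server
        self.writer = "db1"

    @property
    def reader(self):
        return "db2" if self.writer == "db1" else "db1"

    def alter_reader(self, version):
        """Run the slow schema change on the server that is not taking writes."""
        self.schema[self.reader] = version

    def swap_writer(self):
        """Move the writer role to the other server (the brief blip happens here)."""
        self.writer = self.reader


def migrate(pair, version):
    pair.alter_reader(version)  # 1. change the reader while the writer serves traffic
    pair.swap_writer()          # 2. promote the altered server to writer
    pair.alter_reader(version)  # 3. change the old writer, which is now the reader


pair = Pair()
migrate(pair, 2)
print(pair.writer, pair.schema)
```

If step 2 goes badly, calling swap_writer again hands the role back to the previous writer, which is the “single command” escape hatch described above.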
Working with an app where the stack and configuration are (1) easy to understand, (2) easy to change, and (3) easy to test, and where (4) tuning isn’t a “Hold on to your butts” moment, is a lot of fun. And that fun isn’t limited to web apps just out of the womb.