Scout customer Eric Lindvall wrote up a nice piece on finding and fixing memory leaks in god -- specifically, when issuing "god load" on long-running god processes. Give it a read; it provides good insight into the troubleshooting process and the tools he used. Eric points to the Scout graphs showing both the symptoms:
And the solution:
Who wouldn't love to see memory usage go down and stabilize like that? Eric also provides patches to god in case you're having similar issues. Check out his full writeup for details.
If you're having trouble with memory consumption of a specific process, check out Scout's Process Memory Usage plugin.
Before Scout, my experience developing software was primarily consulting. Success was measured by delivering software on time and on budget.
With Scout, a subscription-based service, my focus isn’t on scheduling. We are self-funded and don’t have the luxuries of a venture-backed startup. We’re focused on figuring out which pieces of development work can increase revenue the most. What follows is how we’re approaching it.
1/27/09 Update – We’ve rolled this functionality out…see our announcement blog post
Some of our customers are monitoring a lot of servers with Scout. We’re psyched about this, and we’re looking at ways to make Scout even easier to use.
Bulk plugin management is a topic that comes up pretty often.
So we’re looking at the ability to make changes en masse to your servers:
- add / delete a plugin on all servers in one shot
- add / delete a trigger to a plugin across all servers
- clone all plugins and triggers from one server to another
We’d love your feedback. How do you want to be able to bulk update your plugins? Feel free to leave feedback in comments.
Sometimes you have actions in your Rails app that you know take a long time. Getting alerts on these actions is just noise.
With the updated Rails Monitoring Plugin, you can filter out any requests you don't want to be notified about. You supply a regular expression, so you can make it as simple or complicated as you need.
Update your Rails plugin
If you already have the Rails plugin installed, you need to update it. Go to plugin->code and click Update:
Then go to plugin->edit and click Update Options:
The new goodness is under "advanced options":
You provide a regular expression for actions you don't want to be alerted on when they're slow. In the simplest case, don't even worry about it being a regular expression -- just provide a string you want to match. For example, if you don't want to be alerted to any slow actions with admin in the URI, just put admin in the Ignored Actions field.
More Complicated Matches
To ignore all actions under admin and also accounts/new, the regex is (admin|accounts\/new). If you want to make sure admin only matches paths starting with admin, just match the beginning slash:
If you're building a complicated regex, try it out separately to ensure it matches/doesn't match what you expect. I dig on Rubular for quick regex sanity checks. Of course, the plugin will tell you if your regex has a problem, but you'll get faster feedback by running it through Rubular.
Note that the match is case insensitive -- no need to worry about case.
Finally, note that excluded actions will still be analyzed in the daily Rails Analysis reports, so you'll still get metrics on them -- you just won't get email notifications for actions you already know are slow!
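If you'd rather sanity-check an ignore pattern locally instead of in Rubular, a few lines of Ruby will do. This is only a rough sketch: the plugin's own matching code isn't shown in this post, so ignored? and valid_pattern? are hypothetical helpers, the sample URIs are made up, and the anchored pattern is just one assumed way to require a leading slash.

```ruby
# Hypothetical helper: mimics case-insensitive matching of a URI against
# an "Ignored Actions" pattern. Not the plugin's actual implementation.
def ignored?(uri, pattern)
  regex = Regexp.new(pattern, Regexp::IGNORECASE)
  !!(uri =~ regex)
end

# Hypothetical helper: returns false if Ruby cannot compile the pattern.
def valid_pattern?(pattern)
  Regexp.new(pattern)
  true
rescue RegexpError
  false
end

simple   = 'admin'                   # plain string, matches anywhere in the URI
combined = '(admin|accounts\/new)'   # the regex from the post
anchored = '^\/admin'                # assumption: only URIs that start with /admin

# Made-up sample URIs for illustration.
['/admin/users', '/Accounts/new', '/users/admin_notes', '/projects/show'].each do |uri|
  puts "#{uri}: simple=#{ignored?(uri, simple)} " \
       "combined=#{ignored?(uri, combined)} anchored=#{ignored?(uri, anchored)}"
end
```

In this sketch the unanchored patterns also catch /users/admin_notes, which is why anchoring matters if you only want to ignore the admin section.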
Google Analytics is an indispensable tool as you optimize the business side of your operation. If you haven't already set up goals in Analytics for viewing your pricing information, accessing the sign-up form, and signing up for an account -- go do it! It's vital information.
However, Google Analytics' goals have to be attached to a specific URL. What if there is no URL for an important goal? For example, the New Account goal for Scout is just the account/show page -- there's no specific URL to represent a newly created account.
When debugging performance problems, visualizing server metrics in a variety of ways is a critical part of isolating the cause:
- Visualizing variance
- Overlaying metrics to identify correlations
- Scaling to compare several metrics with different units
- Stacking graphs to visualize distributed setups
We’ve just released a major update to Scout’s charting functionality that makes it easier to analyze your metrics.
Two weeks ago I covered some of the business lessons learned from a large (~3 months) investment in new features, and the hard decision to roll them back. I discussed how you will underestimate the ongoing cost of complexity in your product, and how cool new capabilities don’t sell themselves.
Continuing this week—more insights gained from undoing development work.
There’s no shortage of resources comparing the MyISAM and InnoDB storage engines. You’ll quickly see it isn’t a black-and-white decision after reading through various discussions debating MyISAM and InnoDB.
Why is the decision so hard?
- Setting up your database is one of the first steps when building a web application. You probably don’t have a good idea of the database activity at this point, so you may have little data to work with.
- The ordering and number of statements can have a big impact on database performance. It’s difficult to simulate until you have real users.
However, there is one case where choosing the wrong table type can be crippling.
Scout is making a better first impression than ever starting today. When you start monitoring a new server, you'll immediately get a high-level summary of the vital stats:
Scout reports this for you automatically. From there, you choose the deeper metrics you need, like Ruby on Rails monitoring, MySQL Slow Queries, Process memory usage, etc.
We’ve been deleting a lot of code from Scout. We’re ripping out major infrastructure, and in doing so, pulling the plug on functionality which, just six months ago, we believed would be crucial to our business. Most importantly, we’re simplifying the most complex, error-prone, and poorly-performing parts of the application. At the same time, our revenue and sales pipeline is growing at a faster rate.
How did this happen? How did we get to a place where we can remove code and functionality and see our business grow because of it?
As they say, “mistakes were made.” You don’t get the satisfaction of throwing out a bunch of cruft and performance-degrading features without having gone through the pain of:
- Building those features in the first place.
- Fighting the performance problems for a few months before you realize it’s all untenable and come up with alternatives.
So yes, mistakes were made. But also, lessons were learned.