Q&A with the Scout Agent - An overview

July 08 | By Derek | Posted in Updates | 4 comments

Our recent update to Scout featured a revised UI, more functionality, and a new Scout Agent. While it’s easy to see the changes in the UI, a lot of the work conducted by the agent happens beneath the surface.

The Scout Agent, which is installed on a server you wish to monitor, was kind enough to sit down and walk me through its DNA (note that the ability to answer human questions is currently not available in the most recent release).

First, tell me a bit about what you’re made of.

I’m just a plain-old Ruby gem that you can install on any Linux-based server (sudo gem install scout_agent).

So, you’re a daemon right? Aren’t long-running Ruby tasks known to leak memory?

Yes, I’m a daemon. And yes, Ruby, like many programming languages, can leak memory when run for a long period of time.

My strategy for preventing memory leaks is simple: I do real work, like running plugins, in a separate short-lived process. I fork(), do whatever, and exit() so the OS can clean up any mess.
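The fork-and-exit idea can be illustrated with a minimal Ruby sketch. This is not the agent's actual code: the `run_in_child` helper and the pipe used to pass the result back are assumptions made for the example.

```ruby
# Run a block in a short-lived child process and return its result.
# Whatever memory the block allocates is reclaimed by the OS when the child exits.
# NOTE: run_in_child is a hypothetical helper, not part of scout_agent.
def run_in_child
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    result = yield                      # do the real work in the child
    writer.write(Marshal.dump(result))  # hand the result back to the parent
    writer.close
    exit!(0)                            # leave without running inherited at_exit handlers
  end

  writer.close
  payload = reader.read  # returns once the child closes its end of the pipe
  reader.close
  Process.wait(pid)      # reap the child so it does not linger as a zombie

  Marshal.load(payload)
end

total = run_in_child { (1..1_000).reduce(:+) }
puts total
```

The parent process never holds the objects the block created; it only receives the small serialized result.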

What’s your strategy to prevent the agent from crashing? Obviously, it’s important that monitoring software keeps running.

My work is divided into 2 main processes and several short-lived processes:
  • Lifeline – A single process that watches over all other agent processes. If a process fails to check in with the lifeline regularly, I force it to stop and replace it with a healthy process.
  • Master – This is the event loop of the agent and is the main process monitored by the lifeline. It just sleeps and runs plugins in a never-ending cycle.
  • Missions – These processes execute the plugin code. These are small processes that exist only when plugins are running.

The reason for this division of labor? The real work is executed by the mission processes, which are short-lived. Offloading the work to them greatly reduces the risk that a plugin degrades my performance, or raises an exception that kills me off.

It’s easier to write 200 lines of bug-free code than 3000. The 200 LOC (my lifeline) keeps the rest alive.
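The check-in bookkeeping behind a lifeline can be sketched in a few lines of Ruby. This is a simplified illustration, not the agent's implementation: the `Lifeline` class, its `timeout`, and the injectable `clock` are all assumptions, and the actual stopping and restarting of processes is left to the caller's block.

```ruby
# Tracks when each watched process last checked in and reports the silent ones.
# NOTE: a hypothetical sketch; names and the timeout are not taken from scout_agent.
class Lifeline
  def initialize(timeout:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @timeout = timeout
    @clock = clock
    @last_seen = {}
  end

  # Called each time a watched process checks in.
  def check_in(name)
    @last_seen[name] = @clock.call
  end

  # Names of processes that have been silent for longer than the timeout.
  def stale
    now = @clock.call
    @last_seen.select { |_, seen| now - seen > @timeout }.keys
  end

  # Yield each stale process to the caller (who would stop and restart it),
  # then reset its timer so the replacement gets a full timeout window.
  def replace_stale
    stale.each do |name|
      yield name
      check_in(name)
    end
  end
end

# Usage with a fake clock so the example runs instantly.
now = 0.0
lifeline = Lifeline.new(timeout: 10, clock: -> { now })
lifeline.check_in(:master)
now = 20.0
lifeline.replace_stale { |name| puts "restarting #{name}" }
```

Keeping the watcher this small is the point of the design: the less code the lifeline contains, the fewer ways it has to fail.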

How much memory do you use?

I typically use between 15 and 20 MB of memory.

Can you tell me a bit about how you actually run the plugins configured in the web interface?

Every 3 minutes, I wake up and contact the Scout server and grab the plugins I’m supposed to run. I then run each plugin and report the data back to the server.
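As a rough sketch of that cycle, assuming the server calls are represented by two hypothetical callables (`fetch` and `report`) and that each plugin is a hash with a `:name` and a callable `:code`:

```ruby
INTERVAL = 3 * 60 # seconds between wake-ups, as described above

# One wake-up: fetch the plugin list, run each plugin, report what came back.
# NOTE: fetch and report are hypothetical stand-ins for the calls to the Scout server,
# and plugins run in-process here for brevity rather than in separate missions.
def run_cycle(fetch:, report:)
  results = fetch.call.map do |plugin|
    begin
      { name: plugin[:name], data: plugin[:code].call }
    rescue StandardError => e
      # A failing plugin is recorded and does not stop the others.
      { name: plugin[:name], error: e.message }
    end
  end
  report.call(results)
  results
end

# The event loop: run a cycle, sleep, repeat.
def run_forever(fetch:, report:)
  loop do
    run_cycle(fetch: fetch, report: report)
    sleep INTERVAL
  end
end
```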

Security-wise, how do you ensure that only the Scout server is sending plugins?

I validate the SSL certificate of the Scout server. I won’t send any data to or take any data from a source that fails this validation. This prevents a man-in-the-middle attack.
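In Ruby, this kind of certificate check is typically switched on through the standard `Net::HTTP` and `OpenSSL` libraries. The sketch below shows the general technique only; the `build_verified_client` helper and the host name are assumptions, not the agent's own code.

```ruby
require "net/http"
require "openssl"

# Build an HTTPS client that refuses peers whose certificate does not verify.
# NOTE: build_verified_client is a hypothetical helper for illustration.
def build_verified_client(host, port = 443)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER # reject invalid or untrusted certificates
  http
end

# Hypothetical host; no connection is opened until a request is made.
client = build_verified_client("scout.example.com")
```

With `VERIFY_PEER` set, a request to a server presenting an invalid certificate raises `OpenSSL::SSL::SSLError` before any data is exchanged.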

What if someone maliciously updates the code for a plugin?

The server caches the plugin code when you install the plugin. The code isn’t updated unless you choose to do so. Most plugins have just a few lines of code – usually fewer than 50 – so it’s easy to do a review. All of the plugins listed in Scout’s plugin directory are also maintained exclusively by the Scout team.

What user does the agent run as? root?

The agent doesn’t run plugins as the root user – by default, it uses the daemon user. You can change the user in the /etc/scout_agent.rb file.

What if things aren’t working as expected? Do you have a log file?

Yes – I record every plugin run in my log file, which is typically located in /var/log/scout_agent/scout_agent.log.

How can I get a closer look at you?

I’m open source and available on GitHub.

Comments

  1. Jesse Newland said about 2 hours later:

    Why hello, scout_agent! I really quite fancy the fork()/exit() strategy for preventing memory leaks! Thanks for the hint.

  2. aleco said about 2 hours later:

    While the scout_agent structure feels pretty clean now, its total memory usage (30-40 MB for lifeline + master + 3 plugins together was pretty common) forced me to disable it on my 256 MB VPS a few days ago. I guess I’ll have to replace it with a plain old cron job which checks load and mem usage, as the concept of running such tasks in a Ruby environment is simply a waste of RAM which you should use for an increased innodb-buffer or another mongrel/passenger instance instead (at least on servers with < 512 MB).

  3. Derek (Scout) said about 3 hours later:

    Aleco,

    If you’re using ps to look at the memory usage, Scout looks a lot worse than it is. Here’s a great article about why that is, if you want to read more:

    http://wiki.marandcustomsolutions.com/space/Linux/Understanding+memory+usage

    The issues brought up there apply doubly so to the agent, since all agent processes are forked off of a common startup sequence (so they share a lot). In fact, a forked process starts out sharing its parent’s memory pages as “copy-on-write”, so only the pages it later modifies take up new memory.

    Given all of that, the agent’s memory is significantly lower than something like ps will show.

  4. Kenneth Kalmer said about 4 hours later:

    Awesome way of doing this, will definitely be highlighting this to the daemon-kit users.