Written by: steve ross on January 28th 2009

I've been monkeying around with god for process monitoring. Regardless of how you feel about religious issues, this tool is a pure-Ruby way to accomplish some of what monit does. I don't totally grok certain aspects of either tool, and monit does have a broad base of support. However, monit's control script uses its own proprietary syntax and ... well ... coming from a Ruby environment, it felt sort of funny to have to hard-code as much as I had to with monit.

By contrast, with god, I can DRY things up and account for environmental differences among the various servers (development, staging, production). I'm also able to define constants like RAILS_ROOT or RAILS_ENV that are natural to use coming from a Rails app. They're not baked in, but by simply setting:

RAILS_ENV = `hostname` =~ /\.local/ ? 'development' : 'production'

I can approximate a development/production environment selector. I see no way to do this in one script using monit.
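Since the servers in question also include staging, the same trick can presumably be stretched to three environments. Here is a minimal sketch, assuming your hostnames follow some recognizable pattern; the `environment_for` helper and the `staging` pattern are hypothetical, so substitute whatever naming your hosts really use:

```ruby
# Hypothetical helper: map a hostname to an environment name.
# The patterns below are assumptions; adjust them to your own host naming.
def environment_for(hostname)
  case hostname.strip
  when /\.local\z/ then 'development' # e.g. a laptop on the local network
  when /staging/   then 'staging'     # assumed naming convention
  else                  'production'  # everything else is treated as production
  end
end

# In a god config this might be used as:
#   RAILS_ENV = environment_for(`hostname`)
```

Defaulting to production is a design choice: an unrecognized host gets the most conservative settings rather than development ones.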

Things Left Out of God

There are some things monit does that are not baked into god. Perhaps the biggest omission, and perhaps it's deliberate, is the lack of a web-hosted control panel. I personally don't miss it and think it's something of a security risk. The other missing pieces are mostly features that are built into monit but can be emulated using god.

God Is Configurable

Although the god mantra is that it's fully configurable, it's not always clear how to do the configuring. I did a bunch of source code reading, and I'm pretty sure I still have some of it wrong. But here are two places where I got things to work that I thought couldn't be done with this tool:

Restart Process on File Mod

This is not so much a "configurable" option as it is an undocumented one. Take the case of an important batch process that's supposed to run every four hours. If it doesn't, the first option is to restart it and hope for the best. Here's the code:

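w.restart_if do |restart|
  restart.condition(:file_mtime) do |c|
    c.interval = 5.minutes                     # how often god runs the check
    c.path     = IMPORTANT_PROCESS_STATUS_FILE # file the batch process touches when it finishes
    c.max_age  = 4.hours                       # restart if the file is older than this
  end
end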

The process itself is set to log to (or touch) the status log file when it finishes a run, because we want to make sure it didn't crash in the middle. What we're telling god is: check every 5 minutes, and if the file is older than 4 hours, restart the process. Really, we don't have to check that often. It's just paranoia on my part.
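To make the moving parts concrete, here is a plain-Ruby sketch of both halves: the batch job touching its status file as its last step, and the kind of staleness test the :file_mtime condition appears to perform. This is not god's source. The `mark_finished` and `stale?` helpers are hypothetical, and the comparison is an assumption about how the condition behaves:

```ruby
require 'fileutils'

# Hypothetical: the batch job calls this as its very last step, so the
# file's modification time only advances after a complete run.
def mark_finished(path)
  FileUtils.touch(path)
end

# Assumed semantics of the check: true when the file was last modified
# more than max_age seconds before `now`.
def stale?(path, max_age, now = Time.now)
  (now - File.mtime(path)) > max_age
end
```

Note that `File.mtime` raises `Errno::ENOENT` when the file does not exist, so a check like this only makes sense once the file has been created at least once.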

Restart the Process on File Not Exist

Ok, the above scenario works if the log file exists, but borks if not. We can't have that, so we use what I feel is the biggest hammer: lambdas. Here's the code:

w.restart_if do |restart|
  restart.condition(:lambda) do |c|
    c.interval = 5.minutes
    # Restart when the status file is missing altogether
    c.lambda = lambda { !File.exist?(IMPORTANT_PROCESS_STATUS_FILE) }
  end
end

Just as easy, and I've coded my condition in Ruby. The code inside the lambda is completely arbitrary, so it is a big hammer. Use with caution and test.
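Because the lambda body is arbitrary Ruby, the missing-file and stale-file checks could presumably be folded into a single condition. A minimal sketch, assuming the lambda only needs to return true when a restart is wanted; `STATUS_FILE`, `MAX_AGE`, `missing_or_stale?`, and `RESTART_CHECK` are hypothetical names, and this has not been verified against god itself:

```ruby
# Hypothetical values for illustration only.
STATUS_FILE = '/tmp/important_process.status'
MAX_AGE     = 4 * 60 * 60 # four hours, in seconds

# True when the file is missing, or when it was last modified more than
# max_age seconds before `now`.
def missing_or_stale?(path, max_age, now = Time.now)
  !File.exist?(path) || (now - File.mtime(path)) > max_age
end

# A zero-argument lambda wrapping the check.
RESTART_CHECK = lambda { missing_or_stale?(STATUS_FILE, MAX_AGE) }

# Inside the watch, it might be assigned as:
#   c.lambda = RESTART_CHECK
```

Keeping the logic in an ordinary method makes it possible to test outside of god, which fits the "use with caution and test" advice.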

Send Meaningful Email Notifications

The default email setup is described in the god documentation, but the email is pretty unhelpful. It just says what the condition was and ends there. Here's a more helpful version:

w.restart_if do |restart|
  restart.condition(:lambda) do |c|
    c.interval = 5.minutes
    c.lambda = lambda { !File.exist?(IMPORTANT_PROCESS_STATUS_FILE) }
    c.notify = {
      :contacts => ['developers'],
      :subject  => 'The World Bank -- Restarting Account Posting for Everybody!',
      :message  => 'The posting status log appeared to be missing -- restarting it so the economy will recover.',
      :priority => 1,
      :category => 'crisis'
    }
  end
end

I hope this stuff is useful and if there are better ways, I'd sure like to hear them.