
LazyOps Micro Port Monitor in Ruby

So here's the scenario. I'm a lazy ops person, and you probably are too. But I like to think of laziness in an elegant way: solving problems in an automated fashion so you don't have to lose sleep at night. Recently, with the Meltdown/Spectre patches being pushed out to cloud providers, it was inevitable that my host (Digital Ocean) would be bouncing my servers, of which there are 12. Most of these are unimportant and I couldn't care less if they go down for a reasonable period of time.

However, there is one server that I host for someone who cannot tolerate much downtime due to their business. In addition, this server uses Zimbra (like Exchange but open source), and its services will not start via upstart/systemd/etc, so you have to manually sudo su - zimbra and then start all the services with zmcontrol, which is a huge pain in the ass. Knowing Digital Ocean as I do, they will probably have some crazy 03:00 window when this server will bounce.

I was going to spend some time implementing Sensu or another monitoring framework but unfortunately didn't have the time to do it with work/life/etc. So I figured, why not write a quick and dirty portcheck script in Ruby and run it in cron?

This is relatively simple and took 10 minutes, but we'll step through it a bit so you can catch the drift.

#!/usr/bin/ruby

require 'socket'
require 'timeout'
require 'twilio-ruby'

@client = Twilio::REST::Client.new "your_twilio_sid", "your_twilio_token"

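# Returns true if a TCP connection to ip:port succeeds within 5 seconds.
# On any failure (refused, unreachable, timed out) it places a Twilio
# call and returns false, so the caller only reports "open" on success.
def port_open?(ip, port)
  Timeout.timeout(5) do
    begin
      # Close the socket right away; we only care that the connect worked.
      TCPSocket.new(ip, port).close
      true
    rescue Errno::ENETUNREACH
      # Network unreachable: keep retrying until the outer timeout fires.
      retry
    end
  end
rescue StandardError
  @client.api.account.calls.create(
    from: "your_twilio_number",
    to: "your_target_number",
    url: "http://twimlets.com/message?Message%5B0%5D=The%20%2Cport%2C#{port}%2C%2Cis%20down%20on%20host%20#{ip}.&"
  )
  false
end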

if port_open?("your_ip_address_to_monitor", 143)
  puts "Port 143 is open and listening"
end

The goal of this script is to open a TCP socket, check if it's open, then output to stdout. If the port is not open, I want to get a phone call telling me what's down. I could have improved this with ARGV or other input methods to pass dynamic variables into the script, but hey, I'm lazy and just wanted something quick.

We're going to start with the first part of the code, which defines a method called port_open? that takes two arguments, ip and port. First we wrap everything in a Timeout.timeout(5) block, which gives the connection attempt 5 seconds to complete its run.
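As an aside, Ruby's standard library can also enforce the connect deadline directly. The sketch below is not what the script above does; it is a variation using Socket.tcp with its connect_timeout option. The method name tcp_port_open? and the timeout keyword are made up for illustration.

```ruby
require 'socket'

# Hypothetical helper (not part of the original script): returns true if a
# TCP connection to host:port succeeds within `timeout` seconds, else false.
def tcp_port_open?(host, port, timeout: 5)
  # With a block, Socket.tcp yields the connected socket, returns the
  # block's value, and closes the socket when the block finishes.
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Refused, unreachable, timed out, or the name failed to resolve.
  false
end
```

This version only answers the open/closed question. Placing the phone call would stay with the caller, which also makes the check easy to try out without a Twilio account.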

Inside the block we create a socket with TCPSocket.new(ip, port) to attempt the connection. If it bombs with a network unreachable error, we rescue that and call retry on it. If the attempt still fails, we move down the chain to another rescue so the script doesn't take a nosedive. That one handles the exception (i.e. the port not being open, or the timeout firing) by calling a method from the twilio-ruby API wrapper that lets us create a phone call and pass TwiML to it, with the port and host filled in via string interpolation.
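The hand-encoded Twimlet URL in the script is easy to get wrong, so here is a rough sketch of building it with URI.encode_www_form instead. The helper name twimlet_url is hypothetical, and this assumes the Twimlet decodes + as a space the way standard form encoding does. It also drops the extra commas from the original message.

```ruby
require 'uri'

# Hypothetical helper (not part of the original script): builds the Twimlet
# message URL announcing that `port` is down on `ip`.
def twimlet_url(ip, port)
  # encode_www_form escapes the brackets in the key and turns spaces into "+".
  query = URI.encode_www_form('Message[0]' => "The port #{port} is down on host #{ip}.")
  "http://twimlets.com/message?#{query}"
end
```

The result could be passed as the url: argument of the calls.create call in place of the interpolated string.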

Finally, at the bottom of the script, we fire off the method in a conditional, in this case hardcoding the IP address and port. As mentioned, this could all come from ARGV or gets, with a persistent loop around it, but I only had about 10 minutes of free time to make this work. I'll be sure to improve it later on.
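For the curious, the ARGV idea might look roughly like this. It is only a sketch: parse_args and the portcheck.rb filename in the usage message are made-up names, and it assumes the first argument is the host and the rest are ports.

```ruby
# Hypothetical helper (not part of the original script): turns
# ["mail.example.com", "143", "25"] into ["mail.example.com", [143, 25]].
def parse_args(argv)
  host, *ports = argv
  if host.nil? || ports.empty?
    raise ArgumentError, 'usage: portcheck.rb HOST PORT [PORT...]'
  end
  # Integer(str, 10) raises ArgumentError on anything that is not a
  # base-10 number, so typos fail loudly.
  [host, ports.map { |p| Integer(p, 10) }]
end

# Usage, assuming port_open? from the script is defined:
#   host, ports = parse_args(ARGV)
#   ports.each do |port|
#     puts "Port #{port} is open and listening" if port_open?(host, port)
#   end
```

With that in place, one cron entry could cover several ports on the same host instead of one hardcoded port per script.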

To get this script working you basically just need to install Ruby on a box (google it), run gem install twilio-ruby since it's required by the script, then add the script to your crontab on your server, computer, whatever. You'll also need to open a Twilio account, buy a number, make sure it can place voice calls (the script phones you rather than texting), and get your sid and token from the API section in your account. But I'm confident you can figure this out :-)

Here is a link to the repo. Feel free to fork it, improve on it, or open PRs against it.

Anyways, 10 minutes to solve a problem wasn't bad. It's a hack at best, but it works like a champ until I have the time to do a proper Sensu/monitoring setup on these servers.

Happy Coding!