Quick Functional Tests in SaltStack

SaltStack is pretty awesome. If you don’t know what it is and are looking for a configuration management tool, I highly recommend it.

All of the major configuration management tools (Puppet, Chef, Ansible, Salt) are mostly the same, but the SaltStack community is great. The IRC channel on freenode is an excellent resource, and their GitHub issue tracker is actively monitored if you submit a bug report (though questions belong in IRC, not the issue tracker).

Anyway, a project I’m working on has become complex enough that I really need to start writing functional tests. Salt has some built-in testing classes that you can use and extend, but to my knowledge they are mostly for unit and integration testing. I’m mostly interested in simple functional testing right now, i.e.: after I run my “highstate”, which does a ton of complicated things to a cluster of 10 servers, each with unique roles, does the website respond correctly?

That is to say, I need to go above and beyond the result output / return status of my highstate runs and orchestrations and actually test if I seem to have achieved the state I desire on the remote minions.

I don’t need fancy testing, or continuous integration right now. What I need are just some simple functional tests. When I started thinking about this tonight I realized I would definitely need access to salt’s internals to get pillars, grains, minions, etc. After researching for only 10 minutes (which is pretty typical for me before I just start playing) I gave up and just started writing code. The best option seemed to be writing my own ‘salt-runner’.

My initial goal was just to write a simple test for HTTP response on the resulting cluster for each website. Here is the salt-runner code, /srv/salt/runners/test_websites.py:

# Import salt modules
import salt.client
import requests


def responding(tgt, outputter=None):
    '''Check that each website in the target's pillar responds over HTTPS.'''
    local = salt.client.LocalClient()
    # Pull the pillar data down from the targeted minion
    pillar = local.cmd(tgt, 'pillar.items')
    websites = pillar[tgt]['websites']
    lb_ip = pillar[tgt]['lb']
    results = {}
    # iteritems() because this runs under Salt's Python 2 interpreter
    for website, data in websites.iteritems():
        if website != 'default':
            # Request the load balancer IP directly, passing the site's Host header
            headers = {
                'Host': website,
                'User-Agent': 'Salt Testing Agent'
            }
            r = requests.get('https://{0}'.format(lb_ip), headers=headers, verify=False)
            results[website] = {'status_code': r.status_code}
    return {'outputter': outputter, 'data': results}

This goes in a directory listed in your runner_dirs setting (for me that’s /srv/salt/runners), which is defined on your salt-master in /etc/salt/master (or wherever your Salt master configuration file lives).

The example usage/output looks like:

# salt-run test_websites.responding web1.foo.fqdn.com
somesite1.com:
    ----------
    status_code:
        200
somesite2.com:
    ----------
    status_code:
        200
somesite3.com:
    ----------
    status_code:
        200

The script gets the pillar data from the ‘tgt’ specified on the command line, then makes an HTTPS request to the load balancer IP, passing in the Host header for each site that should be responding.

This is specific to my case, but you can see how it’s a quick way for me to do “poor man’s functional testing” and ensure that specific aspects of the resulting state are actually achieved.
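If I wanted the runner to fail loudly instead of just dumping data, something like this untested sketch could sit in the same runner file (assert_responding and its expected parameter are my own invention, not part of the original code):

# Untested sketch: a second runner function that turns the raw results
# into an actual pass/fail, rather than data you have to eyeball.
def assert_responding(tgt, expected=200, outputter=None):
    results = responding(tgt)['data']
    failures = dict(
        (site, data) for site, data in results.iteritems()
        if data['status_code'] != expected
    )
    if failures:
        return {'outputter': outputter, 'data': {'result': False, 'failures': failures}}
    return {'outputter': outputter, 'data': {'result': True}}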

I know this isn’t real functional testing in the true sense of the term. I’m doing this for now as a way to do regression testing against future bugs.

I hope that helps someone out there. If anyone has a better way to do this kind of thing I would love to hear it.

Generate hosts file on a cpanel server quickly

So here’s how to quickly generate a hosts file on a cPanel server, or rather how to print out a bunch of lines of text that can be appended to someone’s local “hosts file”:

#!/bin/bash
# set www="no" if you don't want www to be prepended to each entry.
www="yes"
cd /var/cpanel/users 2>/dev/null && for userfile in *; do
	echo "#$userfile"
	ip=$(awk -F= '$1 == "IP" {print $2}' "$userfile")
	domains=$(egrep '^DNS([0-9]+)?=' "$userfile" | awk -F= '{print $2}')
	for domain in $domains; do
		if [[ "$www" = "yes" ]]; then
			echo "$ip $domain www.$domain"
		else
			echo "$ip $domain"
		fi
	done
done || { echo "Error: Is this a cPanel server?" >&2; exit 1; }

Quick one-liner you can paste directly on a server over ssh:

www="yes"; cd /var/cpanel/users 2>/dev/null && for userfile in *; do echo "#$userfile"; ip=$(awk -F\= '$1 == "IP" {print $2}' "$userfile"); domains=; domain=$(egrep '^DNS([0-9]+)?\=' "$userfile" | awk -F\= '{print $2}'); if [[ "$www" = "yes" ]]; then domain="${domain} www.${domain}"; fi; domains+=${domain}; for domain in ${domains[@]}; do echo "$ip $domain"; done; done || (echo "Error: Is this a cPanel server?"; exit 1)

Quick and dirty. Mostly useful if you are a sysadmin for a hosting company that uses cPanel, or you use cPanel and need to generate a “hosts file” quickly for testing a server migration.

I wonder if there is a better way to do this using cPanel’s API? I’ve never bothered looking into it. Might be a fun project for someone out there.

Cloudflare + DDoS

Quick post here. If you’re using something like Cloudflare for the sole purpose of mitigating a distributed denial of service attack (DDoS), please be smart with your DNS records.

I have a lot of respect for Cloudflare as a service. They use a combination of anycast, cache header injection, and their own Content Delivery Network (CDN) for static files (JS, CSS, images, etc.).

If you are using this service as a frontend to mitigate a DDoS, please mind your DNS records. Putting your site behind Cloudflare and changing your actual origin server IP can very easily mitigate a DDoS and keep your site online. However, it does no good if someone knows your “real IP” / “origin IP”.

How would the attacker be able to figure this out? Well, if you want to stop a hacker, start thinking like one. The first thing I would do is query common subdomain records for your domain.

for sub in mail cpanel dev staging real direct web1 web2 web3 db1 db2 db3; do
  dig +short "$sub".some-domain-behind-cloudflare.com
done

Oops! You may have forgotten to remove a DNS record that exposes the “real IP” / “origin IP”. So keep this in mind, and delete those records if you don’t use them. If you do need an A record that points directly to the server, consider handling this at the server level, or at least create a less “guessable” subdomain, e.g. oL7Vic67ZKvDDRMbHKRQ8Bk69HtchM4q.some-domain-behind-cloudflare.com, that points to the real IP.
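You could automate that check, too. Here’s a rough Python sketch of my own; it needs the ipaddress module (standard in Python 3), and the two Cloudflare ranges below are illustrative only; pull the authoritative list from https://www.cloudflare.com/ips/.

# Rough sketch: flag subdomains whose A records resolve outside
# Cloudflare's ranges. The ranges below are illustrative only.
import socket
from ipaddress import ip_address, ip_network

CLOUDFLARE_NETS = [ip_network(n) for n in ('104.16.0.0/13', '172.64.0.0/13')]
SUBS = ['mail', 'cpanel', 'dev', 'staging', 'real', 'direct',
        'web1', 'web2', 'web3', 'db1', 'db2', 'db3']

def proxied(ip):
    return any(ip_address(ip) in net for net in CLOUDFLARE_NETS)

domain = 'some-domain-behind-cloudflare.com'
for sub in SUBS:
    fqdn = '{0}.{1}'.format(sub, domain)
    try:
        ip = socket.gethostbyname(fqdn)
    except socket.error:
        continue  # no record; nothing leaked
    if not proxied(ip):
        print('possible origin leak: {0} -> {1}'.format(fqdn, ip))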

Security via obscurity is never the right solution, but it’s better than nothing sometimes.

Varnish Logging Per Host With Init Script

I’ve been hacking around with varnish a lot lately. If you don’t know what varnish is, it’s an open-source, in-memory caching HTTP reverse proxy, and really just an amazing piece of software. In keeping with its philosophy of being low-footprint and high-performance, varnish does not log anything to slow disks by default; instead it keeps an ephemeral log in memory. However, it ships with a utility called varnishncsa which translates its in-memory log into the standard NCSA log format (common log format). You’re familiar with this format if you’ve ever looked at an apache or nginx access log.

The reason I’m interested in enabling these disk logs via varnishncsa is because I’ve also been playing around with Logstash, Elasticsearch, and Kibana recently to centralize logs and create meaningful graphs from the data. The easiest way I’ve found to plug varnish data into the logstash-forwarder daemon is to enable varnishncsa and have logstash parse them.

By default, varnishncsa doesn’t automatically separate logs into individual vhosts. This is a problem right away if you have multiple domains on one server, but if you think about it, it wouldn’t make sense for them to try. Both varnish and varnishncsa are highly configurable, so we can make them do whatever we want. So let’s get to it.

Here’s what we want.

1) Individual log files for each vhost.
2) We want this logging mechanism to be started and stopped via a standard sysv init script.
3) Bonus points: Automate our configuration.
4) Super bonus points: Create an upstart script? I haven’t gotten to this yet but I will.*

The best way I found to do this is to spawn an individual varnishncsa process per virtual host, and tell each one which Host header to look for in the request using a VSL (Varnish Shared memory Log) query via the -q option. For instance:

varnishncsa -a -D \
    -q 'ReqHeader:Host ~ "^(www\.)?kevops.com$"' \
    -w /var/log/varnish/kevops.com.access.log \
    -P /var/run/varnishncsa.kevops.com.pid \
    -F '%{X-Forwarded-For}i %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"'

Most of these options are pretty self-explanatory. The most important one is -q, which we’ll get to in a minute. But I want to first explain the last option (-F). I’m specifying the log format explicitly because I need to modify the first field only: I am replacing the standard client IP with the value of the “X-Forwarded-For” header. I’m doing this because I have an SSL terminator (pound) in front of varnish for HTTPS requests, and further upstream, a load balancer. Both of these inject/append to this header as necessary so that we know the original client IP. If these situations don’t apply to you, simply remove the entire -F option line above (and the trailing backslash on the preceding line, of course).

So let’s talk about that -q option:

  -q 'ReqHeader:Host ~ "^(www\.)?kevops\.com$"'

This is a VSL query. It’s very intuitive, especially if you’ve spent some time hacking around with varnish’s domain-specific configuration language VCL, which I’m sure you have if you have been using and tuning Varnish. Since varnishncsa just reads the in-memory varnish log, its parsing behavior is very similar to the varnishlog utility. We’re taking all HTTP requests whose Host header was exactly kevops.com or exactly www.kevops.com and writing those to a specific log file. You could use this -q option to write very specific log files based on whatever HTTP headers (or anything else) you want; for example, -q 'ReqHeader:User-Agent ~ "Googlebot"' would capture only requests claiming to be Googlebot. I haven’t gotten that creative yet, or seen a need. But you certainly could.

Anyway that’s one virtual host, we’ve daemonized (backgrounded it) with the -D option, and it’s logging requests. Awesome. But what if we have 15 different vhosts on this server and we want to log all of them to individual files?

Here’s what I did. This might seem a bit hacky, but it’s the right way to use the tools we’re given in this case, in my opinion. First, create a wrapper script. Let’s put it somewhere in our $PATH, like /usr/local/bin/varnishncsa-wrapper

#!/bin/bash
# Wrapper script for varnishncsa per vhost
# Invoked by our sysv init script /etc/init.d/varnish-logger

varnishncsa -a -D \
    -q 'ReqHeader:Host ~ "^(www\.)?example\.com$"' \
    -w /var/log/varnish/example.com.access.log \
    -P /var/run/varnishncsa.example.com.pid

varnishncsa -a -D \
    -q 'ReqHeader:Host ~ "^(www\.)?example2\.com$"' \
    -w /var/log/varnish/example2.com.access.log \
    -P /var/run/varnishncsa.example2.com.pid

varnishncsa -a -D \
    -q 'ReqHeader:Host ~ "^(www\.)?example3\.com$"' \
    -w /var/log/varnish/example3.com.access.log \
    -P /var/run/varnishncsa.example3.com.pid

# and etc, for each virtual host...

That’s already tedious, so we’ll need to automate this obviously. We’ll get to that later. If you have this script created for all or some of your virtual hosts, next we need to create our sysv init script, /etc/init.d/varnish-logger

#!/bin/sh
# Simple init script for starting/stopping logging vhosts via varnishncsa processes

### BEGIN INIT INFO
# Provides:          varnishncsa-wrapper
# Required-Start:    $local_fs $remote_fs $network varnish
# Required-Stop:     $local_fs $remote_fs $network varnish
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Vhosts wrapper script to spawn varnishncsa daemons
# Description:       This script provides logging for varnish in NCSA format for all vhosts
### END INIT INFO

DAEMON="/usr/local/bin/varnishncsa-wrapper"

method="$1"

do_stop()
{
    killall -9 varnishncsa >/dev/null 2>&1
    ps auxf | grep varnishncsa | grep -v grep > /dev/null 2>&1 || return 0
    return 1
}

do_restart()
{
    if do_stop; then
        ${DAEMON} && return 0
    fi
    return 1
}

case "$method" in
    *start)
        do_restart && echo "[OK]" && exit 0
        echo "[FAIL]" && exit 1
        ;;
    stop)
        do_stop && echo "[OK]" && exit 0
        echo "[FAIL]" && exit 1
        ;;
    status)
        pgrep varnishncsa > /dev/null 2>&1 && echo "Running..." && exit 0
        echo "Not running..." && exit 1
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}" && exit 2
        ;;
esac

Yep, it’s very dumb and very hacky. I’m aware. I don’t care. All it needs to do is start and stop logging for varnish HTTP requests and separate requests that match valid Hosts. And this does the job. Improvements will come later.

Speaking of improvements, let’s automate the creation of the wrapper script. So this gets kind of weird. If you are using configuration management tools like Salt, Puppet, Chef, or Ansible, then I don’t need to tell you how to automate this. You are probably already writing a state, recipe, manifest, or playbook, and making what I did better and more suited to your needs. I am using Salt right now, so building the wrapper script is trivial.

It looks something like:

#!/bin/bash
# Simple wrapper for varnishncsa to individually log vhosts 
# Generated by SaltStack

{% for website in websites %}
# http
echo "Starting varnish NCSA logging for HTTP traffic /var/log/varnish/{{ website }}.access.log ..."
varnishncsa -a -D \
    -q 'ReqHeader:Host ~ "^(www\.)?{{ website }}$"' \
    -w /var/log/varnish/{{ website }}.access.log \
    -P /var/run/varnishncsa.{{ website }}.pid \
    -F '%{X-Forwarded-For}i %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"'
{% endfor %}
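If you aren’t using config management at all, a tiny generator script gets you the same wrapper. Here’s a rough, untested Python sketch of my own (it also escapes the dots in each vhost for the regex, which the Jinja template above glosses over):

#!/usr/bin/env python
# Untested sketch: write /usr/local/bin/varnishncsa-wrapper from a
# plain list of vhosts, one varnishncsa stanza per vhost.
import os

VHOSTS = ['example.com', 'example2.com', 'example3.com']
WRAPPER = '/usr/local/bin/varnishncsa-wrapper'

STANZA = '''varnishncsa -a -D \\
    -q 'ReqHeader:Host ~ "^(www\\.)?{vhost_re}$"' \\
    -w /var/log/varnish/{vhost}.access.log \\
    -P /var/run/varnishncsa.{vhost}.pid \\
    -F '%{{X-Forwarded-For}}i %l %u %t "%r" %s %b "%{{Referer}}i" "%{{User-agent}}i"'

'''

with open(WRAPPER, 'w') as f:
    f.write('#!/bin/bash\n# Generated -- do not edit by hand\n\n')
    for vhost in VHOSTS:
        f.write(STANZA.format(vhost=vhost, vhost_re=vhost.replace('.', r'\.')))

os.chmod(WRAPPER, 0o755)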

If you are still reading this, you are:

1) Writing a bash script that creates the wrapper script for each vhost, and you’ve gotten what you need from me and are happy.
2) Digesting it and taking bits of it for your own use with the configuration management tool of your choice.
3) Absolutely insane to have read this far.

* – I couldn’t figure the upstart script out. I tried for 15 minutes and gave up, since the sysv script works with both init systems for now.

Bound Method Python

This is a mini post that only exists to hopefully help out people like me who are dumb sometimes.

If you are pulling your hair out trying to figure out why you are getting something like this:

<bound method MyClass.my_method of <__main__.MyClass instance at 0x7f8f2cac1710>>
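For concreteness, assume a minimal class like this (my stand-in; the names match the output above):

class MyClass:
    def my_method(self):
        return 'hello'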

Then you are probably trying to do something like this:

obj = MyClass()
print obj.my_method

You should be doing:

obj = MyClass()
print obj.my_method()

In the former code block, Python was printing exactly what you told it to. Python is a lot like Ruby in that everything is an object: it will happily “print” a bound method for you and even give you its location in memory! But that’s not what you wanted. You wanted to call the method.

I hope this saves someone out there some unnecessary troubleshooting.

Adding Spotify current track to tmux status bar

I love tmux. I’m a very recent convert.

I just discovered/embraced tmux over screen, so I’ve been hacking my ~/.tmux.conf a lot in the last couple days. It’s strangely satisfying to make your status bar display remote and local content in clever ways. The possibilities are endless.

Spotify has also been a necessity lately at work and at home. I’m using the Spotify Linux Preview.

Since I’m running the Debian + Openbox based CrunchBang lately as my daily driver, I usually run Spotify in a dedicated workspace. Sometimes I hear a good track and want to know the artist and title, so I shift over to my Spotify workspace to see who it is. That’s tedious. Obviously this is a task for tmux.

So how do we put the current “Artist – Track” in our tmux status bar?

The first thing I thought was: can I use the Spotify API? So I tried that. There may be a way to use the API to find what you are listening to at the moment, but I couldn’t find it, and even if there is one, it seems inefficient. I’m running the Spotify client locally, so how can I see what track is playing?

While looking into this, I noticed something interesting:

kevin@ox:~$ sudo netstat -tnlp | grep -i spotify
tcp        0      0 127.0.0.1:4371          0.0.0.0:*               LISTEN      11227/spotify   
tcp        0      0 0.0.0.0:57621           0.0.0.0:*               LISTEN      11227/spotify   
tcp        0      0 127.0.0.1:4381          0.0.0.0:*               LISTEN      11227/spotify   
tcp        0      0 127.0.0.1:8099          0.0.0.0:*               LISTEN      11227/spotify   

So Spotify is acting as a server from my machine. Hmm..

This guy Carl already figured this out for us. Check out his awesome post about this. So how can we talk to this local Spotify server and snag info?

Luckily for us, this other guy, nadavbar, based on Carl’s findings, took it a step further with a nodejs module. Awesome.

So this is now SUPER easy. We’re going to use nadavbar’s nodejs spotify-webhelper module to get our current track.

Create the following script somewhere, like ~/scripts/tmux/spotify-get-current-track.js

var nodeSpotifyWebHelper = require('node-spotify-webhelper');
var spotify = new nodeSpotifyWebHelper.SpotifyWebHelper({port: '4371'});

// get the name of the song which is currently playing
spotify.getStatus(function (err, res) {
  if (err) {
    return console.error(err);
  }

  // bail out quietly if nothing is playing (or Spotify isn't running)
  if (!res || !res.track || !res.track.artist_resource || !res.track.track_resource) {
    return;
  }

  var song = res.track.artist_resource.name + ' - ' + res.track.track_resource.name;
  console.log(song.substring(0, 32));
});

You need to have nodejs installed; see the Node.js site for installation instructions.

Add the module:

npm install node-spotify-webhelper

Note that I’m using substring(0, 32) to limit the Artist – Track to 32 characters. This is so extra long artist/track names don’t break our tmux status bar.

You may also have to change the port in the script above. Check the output of: netstat -tnlp
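If you’d rather not eyeball netstat, here’s a tiny Python sketch of my own that probes for local listeners (the 4370-4390 range is an assumption based on the netstat output above):

# Probe loopback ports for listeners; the 4370-4390 range is an
# assumption based on the netstat output above.
import socket

def open_ports(host='127.0.0.1', ports=range(4370, 4391)):
    found = []
    for port in ports:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(0.2)
        try:
            s.connect((host, port))
            found.append(port)
        except socket.error:
            pass
        finally:
            s.close()
    return found

print(open_ports())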

Okay. Let’s test the script:

node ~/scripts/tmux/spotify-get-current-track.js

You should have gotten “Artist – Track” from that, assuming you’re running Spotify currently. So let’s add it to our status bar in tmux.

Edit your ~/.tmux.conf :

set -g status-right '#[bg=black]#[fg=white] ♫ #[fg=green]#(node /home/kevin/scripts/tmux/spotify-get-current-track.js)'

Now we have something like this (I have some other things in my status-right in the image below — Ubersmith tickets, WHMCS tickets, Exchange server emails, Laptop battery status, etc.. maybe I’ll make another post about those):

[screenshot: my tmux status bar showing the current Spotify track]

Control Panels, Cross Site Request Forgery, and Case 74889

The rise of web hosting control panels has changed the landscape of the web hosting industry dramatically. They reduce the barrier to entry for server administration by automating configuration and management tasks within a web-based GUI. Before this, server administrators had to configure their systems by hand, or through a suite of their own custom shell scripts and such.

While this has allowed more people to enter the business and become resellers and administrators, it has its drawbacks. Some would argue that this has been damaging to the industry, as it’s ushered in a wave of administrators who aren’t qualified or knowledgeable enough to properly manage a server. In the right hands, though, these panels can actually be a big help to both camps.

That argument aside, something has never felt right to me about a web interface that you log into as root to run commands. It’s essentially a rootkit with a pretty front-end. That said, there are protections in place to prevent exploitation, and cPanel in particular has a great security track record; their security team, in my experience, was responsive and a pleasure to work with.

One problem that I don’t think gets enough attention with control panels like cPanel, Plesk, Webmin, and DirectAdmin is the possibility of self-induced Cross Site Request Forgery, for lack of a better term.

Consider this hypothetical attack:

1) Compromise one or more vulnerable sites on the server (perhaps an outdated Joomla or WordPress site), and inject code like the following:

<?php

// Grab the referring URL; a control panel link leaks its session token
// here if the admin clicked through to this hacked site.
$ref = $_SERVER['HTTP_REFERER'];
if (preg_match('/https?:\/\/.*\/sess_token[0-9]{10}/', $ref, $matches)) {
	// Replay the leaked session URL against a privileged panel action
	$url = $matches[0] . "/api/create_reseller_account.php?resources=unlimited";
	echo "<script> window.open('$url','_blank'); </script>";
}

2) Wait for a systems administrator, logged into the panel as root, to click on the URL link for the hacked domain from the control panel. The above would have the sysadmin unknowingly create an account with full privileges. It doesn’t take much creativity to do far worse, like adding a public ssh key through the panel, changing the root password, or anything else they are allowed to do.

The CSRF session token, designed to prevent this type of attack, will be useless because the administrator just provided it to the hacked site via the referer (if their browser is configured to pass referers, which is the default in most browsers). The method above creates a new tab in the victim administrator’s browser to make the malicious control panel request, but you could easily make this action less transparent. Note also that my regular expression matches an optional https, but I believe this attack will only work if the administrator is using non-secure access to the panel. * – See #2 below.

I submitted a very similar, working POC to cPanel’s security team, and they corrected the issue (Case 74889) within a few weeks. I am willing to bet, however, that it’s a problem in other panels like Plesk, and perhaps in other parts of cPanel that provide URL links to sites on the server. In this case, the POC was very similar to the code above, and the vulnerability was in URLs in the “Manage SSL Hosts” section of the panel. They corrected it by cleansing the requests before redirecting you to the destination domain, so as to remove the referer.

The reason I call it a self-induced CSRF is because it’s not exactly a plain-vanilla CSRF attack. Typically you would post a malicious link on a remote forum or in chat, or email it to the victim, hoping they already have an active session at some other site. For example:

<a href="http://192.168.0.1/my-router/reset-router-password.php">Funny Cat Gifs LOL</a>

The above could be easily mitigated if the router’s admin panel used a randomized csrf token in the url.
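A minimal sketch of that kind of token scheme, with made-up helper names (illustrative only, not any panel’s actual API):

# Illustrative only: issue an unguessable per-session token, embed it in
# state-changing URLs, and verify it server-side before acting.
import binascii
import hmac
import os

SESSION_TOKENS = {}

def issue_token(session_id):
    token = binascii.hexlify(os.urandom(16)).decode('ascii')
    SESSION_TOKENS[session_id] = token
    return token

def verify_token(session_id, token):
    expected = SESSION_TOKENS.get(session_id)
    return expected is not None and hmac.compare_digest(expected, token)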

However, in the case of the control panel attack, it bypasses any CSRF protections because the malicious link is clicked from the same origin being exploited, and the token is provided to us free of charge. If there’s a correct name for this type of attack, I don’t know it.

The moral of the story is:

1) Server Admin Control Panels will never really be secure.
2) * – Always use https if you have to use them. This is because browsers will, IIRC, never pass a referer if the scheme changes from https to http, OR if the scheme remains https to https but the origin changes.
3) Just don’t click on any remote URLs from within a control panel if you don’t have to. Copy/paste into a new tab, or ‘middle click’ to open a new tab (the referer should not be sent this way).

Advanced Troubleshooting with Strace

Sometimes a site is performing erratically, or loading slowly and it’s not evident what the problem is. When you’ve run out of standard troubleshooting methods, it might be time to go deeper.

One way to do that is with a tool called strace, which lets you trace the system calls a process makes to the kernel in real time.

You can pass it a process id, or run it in front of a command.

Quick example:

Let’s use the -e trace option to tell strace what type of system call we’re interested in. We want to see what files it’s opening. We have a suspicion that running the host command will attempt to check our /etc/resolv.conf before querying the internet for an A record, so let’s verify that.

$ strace -e trace=open host google.com 2>&1 | grep resolv.conf
open("/etc/resolv.conf", O_RDONLY|O_LARGEFILE) = 6

As we expected, it does make an attempt to open that file.

Note that I redirected STDERR to STDOUT so I could grep the output. strace writes its output to STDERR. I won’t go into too much more detail about strace for now, but you get the idea.

Now back to our hypothetical slow or erratic website issue. The first step in troubleshooting an issue is duplicating the problem. The second step is making it repeatable. The third step is isolating the problem so you can pick it apart and examine it. When dealing with a busy webserver, the problem with that last step is that you don’t know which apache PID is serving you, so you can’t very well isolate it if you don’t know which one to isolate.

There are some hacky workarounds for isolating the apache process id that’s serving your HTTP requests. You can telnet to the server from the server, and find the pid via lsof, or netstat:

$ telnet localhost 80
GET / HTTP/1.1
Host: slow-domain.com

Then open another screen on the server, and find your telnet pid with netstat:

$ netstat -tapn

[..snip..]
tcp6       0      0 127.0.0.1:80            127.0.0.1:40402         ESTABLISHED 20008/apache2            
tcp        0      0 127.0.0.1:40402         127.0.0.1:80            ESTABLISHED 23955/telnet 

From this we know that process id 20008 is serving my telnet request, because the source and destination ports (40402) match up. Then you can strace that PID, and quickly give the HTTP request in your telnet session the final carriage return to send the request. But this is clunky, has race condition issues, and frankly it’s hard to get right.

But there is a better way. You can launch another instance of apache on different ports, say 81 and 444 (for https). Set the MaxClients value to 1 so only one child process serves requests, then add an iptables rule so that only your remote IP can reach those destination ports (e.g. something like iptables -I INPUT -p tcp -m multiport --dports 81,444 ! -s YOUR.IP.HERE -j DROP).

Here’s an example of how you can do this on a cPanel server. Keep in mind, you may not need to copy everything like I did, but I just wanted to make sure I had an exact replica running on the alternate ports. You might want to exclude large log files and such if your apache directory is large.

Clone the apache directory in full (binaries, conf, everything)

cp -r /usr/local/apache /usr/local/apache-tmp

Change ports for http and https so we can run ours without affecting the regular apache
$ cd /usr/local/apache-tmp
$ find . -type f -exec sed -i 's/:80/:81/g' {} \;
$ find . -type f -exec sed -i 's/:443/:444/g' {} \;

Only allow one maxclient, so we can find the apache pid serving us when we hit the site
$ find . -type f -exec sed -i 's/MaxClients.*/MaxClients\ 1/g' {} \;

Modify all absolute path references to the normal apache dir to point at our cloned one
$ find . -type f -exec sed -i 's|/usr/local/apache|/usr/local/apache-tmp|g' {} \;

Now we can start our cloned apache on alternate ports 81/444 with just one MaxClient allowed. You should then be able to access every site on the server via the alternate ports.
$ /usr/local/apache-tmp/bin/httpd -d /usr/local/apache-tmp -f /usr/local/apache-tmp/conf/httpd.conf

That launched the root httpd process with one child PID, as expected.
Now find the CHILD pid:
$ ps auxf | grep apache-tmp

Now to attach strace to the one and only apache process.
$ strace -p PID_HERE -f -s 2048

The -f option tells strace to follow child processes.
The -s option specifies how many bytes of each call to capture. 2048 might be overkill, so feel free to adjust this.

Then make the http request:

$ curl slowsite.com:81/badcode.php

This is definitely a drastic troubleshooting method, but it’s great for those times when you hit a brick wall diagnosing a slow-loading or erratically behaving site and feel compelled to find the issue.

Note: cPanel changes directory structure with updates from time to time. This was done a few months ago on a cpanel 11.40 build I believe. YMMV, use this tactic with caution.

Using CasperJS to Automate Server Migration Testing


CasperJS is an open-source navigation scripting & testing utility written in JavaScript by Nicolas Perriault for the PhantomJS WebKit headless browser. It also works with the Gecko-based SlimerJS as an alternative engine.

What does all of that mean? It means you can emulate navigation steps just as you would in a browser — without the browser.

I’ve played with PhantomJS in the past out of curiosity, but never thought it would become an important tool in my sysadmin toolbox. Well, it is now.

I was recently tasked with migrating SugarCRM instances. There are enough of them that manually testing each login post-migration was not something I was looking forward to. It’s fairly trivial to do this with cURL, but I wanted to take it a step further: take a screenshot of the page after logging in and save it to a file, all automagically. Enter CasperJS.

First, we create a new Casper instance. Setting verbose and debug logLevel is very useful for testing, but it’s optional. Notice there is a built-in test framework as well!

phantom.casperTest = true;
require("utils");

var casper = require('casper').create({
	verbose: true, 
	logLevel: 'debug',
	pageSettings: {
		userAgent: 'Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20140611 Firefox/24.0 Iceweasel/24.6.0'
	}
});

Next, since I’m a sysadmin and linux junkie, I live in the command line. So I’m using the CLI option parsing features built right into CasperJS. I think the below is pretty self-explanatory, and the options parsing just works:

Sample CLI usage:

kevin@kevops:~/$ casperjs ./sugarcrm-login.js --host="some-sugarcrm-site.com" --user="admin" --pass="p4ssw0rd" --ssl --imgdir="/home/kevin/screenshots/"

var host 		= casper.cli.get('host');
var user_name 		= casper.cli.get('user');
var user_password 	= casper.cli.get('pass');
var scheme		= 'http://';
var imgdir		= '/tmp/';

if(casper.cli.has('ssl')) { var scheme = 'https://'; }
if(casper.cli.has('imgdir')) { var imgdir = casper.cli.get('imgdir'); }

var base_uri = scheme + host;

Add some event listeners. The first one allows us to locally print console.log messages. The second one emits when a page leaves a Javascript error uncaught.
casper.on('remote.message', function(msg) {
    this.echo('remote message caught: ' + msg);
});

casper.on("page.error", function(msg, trace) {
    this.echo("Page Error: " + msg, "ERROR");
});

Time to fire it up. One of the great things that CasperJS adds to something like PhantomJS is the ability to write simple code in a procedural way. This is important when you’re testing page navigation for a website, and avoids some of the headache from javascript’s asynchronous nature. The first thing we do is call casper.start(). This is what first loads the URL.

Note that I’m trying out the test framework here, just to see how it works. It’s simple and intuitive if you’re familiar with testing frameworks.

casper.start(base_uri, function() {
	this.test.assertExists('form#form', 'form found!');
});

We determined that the form exists with our test statement, so now we need to fill in the fields and submit. Keep in mind, .fill() is looking for the “name” property of the form fields, which is user_name and user_password in the case of SugarCRM logins.
casper.then(function() {
	this.fill('form#form', {
		'user_name':		user_name,
		'user_password':	user_password,
	}, true);
});

How does it work? Magic. That filled out the form, submitted it, and went on to the next step, just like you would in a browser. Is it really that simple? Yup.

Even more magical, we can take a screenshot once we login, and save it. How cool is that?

// login and grab snapshot
casper.then(function() {
	casper.viewport(1024, 768);
	this.capture(imgdir + host + '_login.jpg', undefined, {
        	quality: 100
	});
});

casper.run();

casper.run() is the final call, that kicks off the whole thing.

So now, since these SugarCRM sites all share a superadmin login and password, I can do something like this post-migration:

#!/bin/bash
grep -i servername /etc/httpd/conf/httpd.conf | awk '{print $2}' | \
while IFS= read -r domain; do
  casperjs ./sugarcrm-login.js --host="$domain" --user="admin" --pass="p4ssw0rd" --imgdir="/home/kevin/tmp/sugar-migrations/screenshots/"
done

Of course, I think I’ll add a lot more testing, and more commandline options to test if both http and https are working, and probably click through the admin panel for more thorough post-migration testing. I think I’d also like to create separate log files for each domain, but I’m not sure yet.
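In the meantime, a quick pre-flight check over both schemes is easy enough. Here’s a rough Python sketch of my own (untested; the domain list is a placeholder you’d feed from httpd.conf as above):

#!/usr/bin/env python
# Rough pre-flight sketch: confirm each migrated domain answers on both
# http and https before bothering with the CasperJS login run.
import requests

DOMAINS = ['some-sugarcrm-site.com']  # placeholder; feed in your real list

for domain in DOMAINS:
    for scheme in ('http', 'https'):
        url = '{0}://{1}/'.format(scheme, domain)
        try:
            r = requests.get(url, timeout=10, verify=False)
            print('{0} -> {1}'.format(url, r.status_code))
        except requests.RequestException as exc:
            print('{0} -> FAILED ({1})'.format(url, exc))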

My next casperJS project will be to automate my daily work clock-ins. We have to login and logout of a web portal each day at work. Automating this each day with a screenshot is a perfect use case, and I’ll have meticulous records in case there’s ever an attendance discrepancy. 😎

sugarcrm-login.js:

phantom.casperTest = true;
require("utils");

var casper = require('casper').create({
	verbose: true, 
	logLevel: 'debug',
	pageSettings: {
		userAgent: 'Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20140611 Firefox/24.0 Iceweasel/24.6.0'
	}
});

var host 		= casper.cli.get('host');
var user_name 		= casper.cli.get('user');
var user_password 	= casper.cli.get('pass');
var scheme		= 'http://';
var imgdir		= '/tmp/';

if(casper.cli.has('ssl')) { var scheme = 'https://'; }
if(casper.cli.has('imgdir')) { var imgdir = casper.cli.get('imgdir'); }

var base_uri = scheme + host;

casper.on('remote.message', function(msg) {
    this.echo('remote message caught: ' + msg);
});

casper.on("page.error", function(msg, trace) {
    this.echo("Page Error: " + msg, "ERROR");
});

casper.start(base_uri, function() {
	this.test.assertExists('form#form', 'form found!');
});

casper.then(function() {
	this.fill('form#form', {
		'user_name':		user_name,
		'user_password':	user_password,
	}, true);
});

// login and grab snapshot
casper.then(function() {
	casper.viewport(1024, 768);
	this.capture(imgdir + host + '_login.jpg', undefined, {
        	quality: 100
	});
});

casper.run();