tech stuff from a tech bloke

The BBC iPlayer app for Android is now close to feature parity with the iOS version, but still falls short in a few areas:

  • Only supported devices get offline downloads
  • The player seems to have audio sync issues
  • The player lacks any decent controls (like gestures for volume, seek, brightness)
  • Downloads expire before you get a chance to watch them

To get around these restrictions I’ve been downloading content using get_iplayer on my Ubuntu server (an old laptop which never gets turned off), and then syncing the video files to my Nexus 7. Then I can use any player I want to watch it; I like MX Player. The download and sync happen automatically once a week, so when I get my N7 out on the train on a Monday morning I have fresh content ready to be watched. If you’re interested in how to set this up, read on.

We are massive fans of Splunk at work. As a tool for centralised log management it gets the job done, but it’s much more than that. If your logs are laid out nicely (which is largely down to you), you can quite easily draw out snazzy graphs and charts composed of data from hundreds of servers, and beyond that, set up alerting on event counts or on values above certain thresholds, for example to alert you to long-running HTTP requests.

In the open source world, tools with similar features exist and are absolutely free. For the Splunk forwarder kind of role, you can use logstash. For the Splunk indexer component, the open source equivalent is Elasticsearch, and for the front end you can use either Kibana or Graylog2. There seems to be more buzz around Kibana than Graylog these days, but it probably depends where you read. By the way, you can use logstash as an all-in-one solution if you only have one server and fairly straightforward needs.

I configured logstash-forwarder on my VPS to ship logs over SSL to logstash listening on my server at home, and added Elasticsearch and Kibana on top. It took a couple of hours to get everything working; there are loads of good guides out there and the docs are all really good. I left it running for a few hours and came back to look at my Apache access logs in Kibana. This was my first query, which shows 200 status codes to this blog:

This is what I love about graphical utilities: straight away you can see something was hitting the site harder at 2:30, putting the graph out of trend. A few clicks later, it was obvious from the hundreds of page views on the wp-login page from a single IP that someone was trying to brute force their way into the site using a script. A few minutes later I had installed the Limit Login Attempts plugin for WordPress, so I’ll be better protected against brute force attacks in the future.

So there you have it: a picture paints a thousand words. I’ve only scratched the surface of what can be done with logstash/elasticsearch/kibana, but already I am loving it.

Testing WordPress social integration.

I’ve been having a bit of fun this week with Apache rewrite rules based on cookie values. The problem I had to fix was unintended redirects caused by mod_rewrite’s greedy pattern matching on %{HTTP_COOKIE}. What we have is two cookies set:
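Going by the names and values referenced below, the pair would appear in the request header something like this:

```
Cookie: foo=xx; bar=yy
```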

Based on these cookie values exactly matching xx and yy, we will redirect the customer to a specific page on the site.

However some new cookies with similar names were introduced which happened to have the exact same values as foo and bar:
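With the new cookies alongside the old ones (names 1_foo and 1_bar, as matched below), the request header might now look like this (order is illustrative):

```
Cookie: 1_foo=xx; foo=xx; 1_bar=yy; bar=yy
```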

What we noticed was that the existing rewrite rules were pattern matching both 1_foo and foo with value xx, and both bar and 1_bar with value yy. What I needed to do was amend the rewrite rules to exact-match the cookie names foo and bar to avoid the unintended redirects.

Matching a cookie value in mod_rewrite with a regex is pretty trivial and there are loads of examples of doing that, but matching the exact cookie name is not as obvious. mod_rewrite parses the HTTP_COOKIE header as one long string rather than as a list of key/value pairs; if you use Chrome Developer Tools (or a similar plugin for Firefox) you’ll see Cookie: in the request header, with your cookie names and values separated by semicolons. The following rules do work and overcame this issue:
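A sketch of rules along those lines (the redirect target /special-offer is hypothetical; the conditions anchor each cookie name to either the start of the header or a preceding semicolon, so 1_foo and 1_bar no longer match):

```apache
RewriteCond %{REQUEST_URI} !^/special-offer
RewriteCond %{HTTP_COOKIE} (^|;\s*)foo=xx(;|$)
RewriteCond %{HTTP_COOKIE} (^|;\s*)bar=yy(;|$)
RewriteRule ^.*$ /special-offer [R,L]
```

The first condition stops the rule firing again after the redirect, since the cookies are still set when the browser requests the target page.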

This is matching ;foo and ;bar within the http_cookie header, OR either foo or bar at the beginning of the http_cookie header.

We’ve recently implemented Redis for a few things at work and wanted some Splunk based monitoring and alerting on Redis health. The script below runs the redis “info” command using redis-cli and injects the data into your log file. Just change the path at the top to the directory containing your conf files.

# import redis info into the log file in a splunk friendly format

for config in /your/path/startup*redis*conf; do

 port=$(grep port ${config} | awk '{print $2}')
 logfile=$(grep logfile ${config} | awk '{print $2}')
 pidfile=$(grep pidfile ${config} | awk '{print $2}')
 # trim the last six digits of %N so we keep millisecond precision
 date_time=$(date +'%d %b %H:%M:%S.%N' | sed 's/......$//g')
 redis_info=$(mktemp)

 if [ -f ${pidfile} ]; then

  pid="$(cat ${pidfile})"
  redis-cli -p ${port} info > ${redis_info}

  ## convert CRLF line terminators to ASCII
  dos2unix --quiet ${redis_info}

  logevent=$(for line in $(grep ':' ${redis_info} | tr ' ' '_' | sed s'/:/="/'g); do echo -n "${line}\" "; done)
  echo "[${pid}] ${date_time} redis-info ${logevent}" >> ${logfile}
 fi
 rm -f ${redis_info}
done

Run this via cron every 5 minutes.
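A crontab entry along these lines would do it (the script name and path are hypothetical):

```
*/5 * * * * /your/path/redis-info-to-log.sh >/dev/null 2>&1
```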

This allows you to then graph things like memory usage on your Redis masters in Splunk (using something like search redis-info role="master" | timechart span=5m avg(used_memory) by source), and of course set alerts when certain thresholds are crossed.

We had a requirement to add the URL scheme (http or https) to our Apache access logs. This took a little research and experimentation as it’s not something offered by default as an option in the LogFormat directives; if you try adding %H for ‘request protocol’, you’ll actually end up with “HTTP/1.1” in the logs instead of what you might have expected. There are a few solutions to this, but this is how I did it: by setting an environment variable called SCHEME using rewrite rules:

RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ - [E=SCHEME:HTTP]
RewriteCond %{HTTPS} =on
RewriteRule ^(.*)$ - [E=SCHEME:HTTPS]

After that just amend your LogFormat definition to include url_scheme=”%{SCHEME}e”.
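For example, extending the stock combined format (the nickname combined_scheme and the log path are made up here):

```apache
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" url_scheme=\"%{SCHEME}e\"" combined_scheme
CustomLog /var/log/httpd/access_log combined_scheme
```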

In a typical Apache / Tomcat configuration with mod_jk, beyond keeping your software stack updated with the latest versions, there are a few easy steps you can take to help protect yourself against basic scripting attacks.

1. Disable any Apache modules you are not using.

Usually by default everything is enabled. You can disable unnecessary modules by commenting out the LoadModule lines in your httpd.conf which refer to the unwanted modules, then reloading Apache. In particular, disable mod_cgi if you don’t need it, since it is a popular attack vector. On Ubuntu there are helper scripts, a2enmod and a2dismod; these add and remove symlinks to the modules in your modules directory, which does essentially the same thing as commenting/uncommenting the LoadModule lines.

2. Obfuscate your Server header.

An attacker that doesn’t know what webserver or app server you are running is far less likely to be successful in attacking you. The default behaviour in both Apache and Tomcat is to advertise the full name and versions, and there is no need to reveal this information to the big bad Internet.

In a large organisation the topology will probably be many Tomcat instances (maybe hundreds), a few Apache servers (probably fewer than 10), and a single load balancer in front of the Apache servers. So logically, it takes the least administrative effort to rewrite the Server header in the HTTP response as it passes through the load balancer. This could be done on a NetScaler using a global rewrite policy as follows:

add rewrite action RW_ACT_ServerHeader replace "HTTP.RES.HEADER(\"Server\")" "\"Web Server\""
add rewrite policy RW_POL_ServerHeader "HTTP.REQ.HOSTNAME.CONTAINS(\"yourdomain\")" RW_ACT_ServerHeader
bind rewrite global RW_POL_ServerHeader 10 -type RES_OVERRIDE

Of course this configuration varies wildly between load balancer manufacturers, who all seem to like having their own unique syntax and terminology. If you don’t manage your load balancer, you could make some changes to Apache described here to change your header to read “Apache”, which is somewhat better.

Using mod_security you can rewrite the Server header to anything you want. Keep in mind though that a really determined hacker will use other tricks to discover your flavour of webserver.
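As a sketch, the directive for this is SecServerSignature; it needs ServerTokens Full set so that Apache allocates enough space in the header for mod_security to overwrite:

```apache
# ServerTokens Full lets mod_security overwrite the whole banner
ServerTokens Full
SecServerSignature "Web Server"
```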

3. Catch bogus requests at the Apache layer and 404 them, instead of letting Tomcat deal with them.

In our logs at work we often see lame hacking attempts consisting of many thousands of requests per hour for nonexistent URLs like:


Our webapp has the ability to handle 404 errors and render a pretty page in the correct language with navigation back to the important parts of the site, but in the case of these brute force URL attacks we don’t want to waste CPU cycles on the Tomcat server rendering a nice 404 page. Instead we will catch them on the Apache layer and display a static 404 error page so that Tomcat can carry on with serving the important traffic.

Creating a custom 404 error page with your own branding is optional, but nice in case a real customer ever does end up on one; it’s also another way to disguise that you’re running Apache underneath. So the first step is creating the static 404 error page if you don’t have one, dropping it in your DocumentRoot somewhere like /error_pages/404.html, then configuring Apache so that it can serve this file. First in the httpd.conf:

ErrorDocument 404 /error_pages/404.html

mod_jk also needs to know that Apache should serve the file, so unmount it in your JkMount configuration (e.g. uriworkermap.properties):

!/error_pages/*.html = lb

Once you have reloaded your Apache configuration, this will become your Apache server’s default 404 error page. The last step is to add a rule to cover some of the common file extensions that hackers look for. You could do this using a RewriteRule to send the requests to the static 404 page, but that would result in a 200 status code. It’s better for your log analysis if these are correctly logged as 404s, and that can be done with RedirectMatch (which is part of mod_alias, rather than mod_rewrite), since [R=404] in a RewriteRule doesn’t behave like a normal redirect; as the Apache docs put it:

If the status code is outside the redirect range (300-399), then the Substitution string is dropped and rewriting is stopped as if the L flag was used.

The RedirectMatch below will catch URLs ending in .pl, .php, .exe, .sh etc., with or without a query string afterwards, and send them to the Apache 404 ErrorDocument:

RedirectMatch 404 ^(.*)\.(pl|php|exe|sh|dll|bat|py|shtml|cgi)(\?.*)?$

You may also want to add asx, asmx, and any other types you have never used and never intend to use.

4. If you are not using cgi, disable it.

This was mentioned above with disabling unused modules. At work our Apache logs show we occasionally get bombarded with bogus /cgi-bin/ requests, so we have another rule in place to catch these http requests and send them to the static 404 error page:

RedirectMatch 404 /cgi-bin/(.*)

5. Other clever stuff

Defensive coding is the best thing you can have! By that I mean an application which has sane validation of all input fields, and will safely ignore requests outside of known boundaries without throwing a 500 error or behaving in an unexpected way. Of course, it’s very difficult for developers to think of every scenario.

Application firewalls are a good way to go, the obvious one being mod_security for Apache; though if you are going the mod_security route, be prepared for a lot of heavy reading to get a correct implementation.

Patching known vulnerabilities seems an obvious one to mention, however you may not realise you are vulnerable without some kind of regular penetration testing. If you don’t use a third party service for security scans, why not run your own security scans with Metasploit and work from there … after all that is probably exactly what your neighbourhood hacker is doing.

Here’s a little enhancement you can make to a Tomcat startup script, adding a quick and easy way to get a heap dump from a running instance. With this option in place it’s easy to direct someone over the phone or by email to dump the heap and restart a Tomcat.

 heapdump)
  PID=$(/bin/cat ${CATALINA_PID} 2> /dev/null)
  if [ -z "${PID}" ]; then echo "Error getting the PID, could not dump heap."; exit 1; fi
  echo -n "Trying to dump heap. "
  FILE=/var/tmp/$(basename $0)-${PID}-$(date +%d%m%Y).hprof
  ${JAVA_HOME}/bin/jmap -dump:file=${FILE} ${PID}
  if [ -f ${FILE} ]; then
   # replace you@example.com with a real recipient
   echo "Heap dump taken on $(hostname --short) for instance ${INST}: ${FILE}" | mail -s "Heap dump successful" you@example.com
  fi
  ;;
 *)
  echo "Usage: $0 [ start | stop | restart | heapdump ]"
  ;;

Assumptions: that you have JAVA_HOME and CATALINA_PID defined in your init script, and that /var/tmp/ is a valid location, and that INST is the directory where your tomcat lives. It will also email an address with the location of the hprof file if successful.

Disclaimer: jmap might not always work as expected if the JVM in question is completely fubared. An alternative could be having the script dump the heap by invoking the dumpHeap operation on the HotSpotDiagnostic MBean over JMX. This could be done using a JMX command line client like jmxterm.

A downstream service that we consume at work had an expired SSL certificate, and it caused complications for our application. The knee jerk reaction once the dust had settled was to make sure that everything was in order with our own certificates.

I wrote a script which uses the openssl tool to check a list of SSL certificates (in certs_to_check.txt) and output the details to a pipe delimited document, which is then imported into Confluence (wiki software) as a table format using their java CLI tools. I’ve also added some wiki markup in the output document which colorises the page, putting the status in red or green depending on the validity of the certificate. This then becomes a central place to check on the status of our certs, rather than having to remember where each certificate is installed, and assume that some alerts will fire from there when they are near expiry.

echo "||Certificate||Expiry date||Status||Days to expire||" > /usr/vchecker/results
cert=$(mktemp)  # temp file for the openssl output
for name in $(cat certs_to_check.txt); do
 # /dev/null on stdin makes s_client exit instead of waiting for input
 openssl s_client -connect ${name}:443 < /dev/null > ${cert} 2> /dev/null
 returncode=$(grep 'return code' ${cert})
 if [ "$(echo ${returncode} | grep -c 'ok')" -lt 1 ]; then
  valid="{color:red}Not Valid{color}:${returncode}"
 else
  valid="{color:green}Valid{color}"
 fi
 expiry=$(openssl x509 -in ${cert} -noout -enddate | cut -d'=' -f2 | awk '{print $2 " " $1 " " $4}')

 # figure out number of days until the cert expires
 # convert expiry date to epoch time
 epochExpirydate=$(date -d"${expiry}" +%s)
 epochToday=$(date +%s)
 secondsToExpire=$(echo ${epochExpirydate} - ${epochToday} | bc)
 daysToExpire=$(echo "${secondsToExpire} / 60 / 60 / 24" | bc)
 echo "|${name}|${expiry}|${valid}|${daysToExpire}|" >> /usr/vchecker/results
done
rm -f ${cert}
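A note on the awk reordering: openssl’s enddate output isn’t directly parseable by GNU date, so the fields get shuffled first. A quick sanity check with a sample date:

```shell
# openssl prints e.g. "notAfter=May 30 10:48:38 2025 GMT"; after the cut we have
# "May 30 10:48:38 2025 GMT", and awk reorders the fields into a form date -d accepts
expiry=$(echo "May 30 10:48:38 2025 GMT" | awk '{print $2 " " $1 " " $4}')
echo "${expiry}"
```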

The resulting wiki page looks something like this:

[Image: example of the report uploaded to Confluence]

If you wanted to you could also add some alerting into the script, for example for certificates with less than 30 days to expiry:

daysToExpire=$(echo "${secondsToExpire} / 60 / 60 / 24" | bc)
if [ "${daysToExpire}" -lt "30" ]; then
 # replace you@example.com with a real recipient
 echo "Warning: SSL Certificate ${name} has ${daysToExpire} days until expiry." | mail -s "SSL Certificate warning" you@example.com
fi

However in our case we are feeding the output file into our central monitoring and alerting system where the alerting is handled in a unified way.

Say you have a web application on your network running on a Linux box which is published on a high port like 8080, but you want users to access it on port 80. Also in this example, you can’t change the port on the application – this could be because it doesn’t run as root, or it is hardcoded, or you don’t have permissions to modify the config, or anything else.

There are a couple of ways to solve this problem. One could be iptables as a NAT router, described here. If you want to use this approach I would advise a bit of background reading on iptables first, as it’s quite easy to lock yourself out of your server if you forget to allow 22/SSH or related/established traffic, or have some other config error or typo somewhere in your rules.
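For reference, the core of the iptables approach is a single NAT rule (a sketch only; your usual filter rules still apply):

```
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080
```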

A simpler solution could be using Apache and ProxyPass (mod_proxy), configuring it to proxy requests based on the hostname. On Red Hat you just need to yum install httpd and configure your site as follows; the mod_proxy module should be enabled by default. On Ubuntu just run a2enmod proxy_http to enable the module.

<VirtualHost *:80>

	# the hostname users will hit on port 80; change to suit
	ServerName app.example.com

	RewriteEngine On

	DocumentRoot /var/www

	ErrorLog /var/log/httpd/httpd-error.log
	CustomLog /var/log/httpd/httpd-access.log combined
	RewriteLog /var/log/httpd/httpd-rewrite.log

	# backend assumed to be the app on port 8080 from the example above
	ProxyPassMatch ^/$ !
	ProxyPassMatch ^/(.*)$ http://localhost:8080/$1
	ProxyPassReverse / http://localhost:8080/

</VirtualHost>

You can also chuck a rewrite rule in there if you want users to land on a specific URI when they hit the document root.
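Something like this would do it (the landing URI /dashboard/ is just an example); it pairs with the ProxyPassMatch exclusion on the document root, so the redirected request then gets proxied:

```apache
RewriteRule ^/$ /dashboard/ [R,L]
```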
