It might be a sign that I spend too much time online, but the quicker a system gives me feedback the more useful I find it. While I love knowing my Nagios safety net has me covered when making changes, sometimes waiting for that CGI to refresh can take too long, especially if I’m taking an iterative / test-driven approach to the changes I’m making. For those use cases I wrote nrpe-runner. Read on →


While searching for a completely different piece of software I stumbled onto the pigz application, a parallel implementation of gzip for modern multi-processor, multi-core machines. As some of our backups have a gzip step to conserve some space I decided to see if pigz could be useful in speeding them up. Using remarkably unscientific means (I just wanted to know if it was worth further investigation) I ran a couple of sample compression runs. Read on →
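For anyone wanting to repeat the same unscientific test, a rough harness along these lines works - pigz, the 8MB random sample and the temp paths are my own choices, not the post’s, and it quietly skips any tool that isn’t installed:

```shell
# Time each compressor against the same throwaway sample file.
sample=$(mktemp)
dd if=/dev/urandom of="$sample" bs=1M count=8 2>/dev/null

results=$(for tool in gzip pigz; do
    command -v "$tool" >/dev/null 2>&1 || continue
    start=$(date +%s)
    "$tool" -c "$sample" > "$sample.$tool.gz"
    echo "$tool: $(( $(date +%s) - start ))s, $(wc -c < "$sample.$tool.gz") bytes"
done)
echo "$results"
rm -f "$sample" "$sample".*.gz
```

On a multi-core box with a bigger sample the pigz line should show a noticeably lower time for a near-identical byte count.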


I’ve never really liked makefiles - I don’t think I’ve ever had to write enough C to really appreciate (or even tolerate) them - so I was a little dismissive of Rake, and I was mostly wrong. Now we’re adding a new member to the systems team I’ve been doing a lot of thinking about our tool chain: what knowledge assumptions it makes, which parts are still more manual than I’d like and where the tool chain has gaps (this is the most annoying one for me). Rake seemed like a potential addition to encode some of that process knowledge into a tool. Read on →

When it comes to Unix diagnostics I was raised the old-fashioned way, with iostat, vmstat and similar tools. However, times change and tools evolve. dstat, while not as comprehensive as using all those tools one by one, provides a wide range of system performance details in an easy to use package. While it’s useful enough in its default state there is even more functionality lurking just below the surface. To see which other modules are available (but are not enabled by default) run dstat -M list. Read on →

I spent a little while digging through the default puppet log types the other day and after reading through a batch of activity logs I whipped up extract-report-issues, a script that can be run on the command line (or daily via cron) and displays a list of errors and warnings from the specified glob of hosts and log files. By default it does all hosts for the current day, we’ve got it running nightly so we can work through the issues each morning. Read on →

I like vim, I think it’s a great editor worth investing time and effort into learning, but I also think it’s one of the most horrible things to watch an inexperienced user typo their way through while you’re urgently waiting for them to finish the damn edit. My favourite trick this week (and it’s only Tuesday) is looking for probably-unique phrases that you can later search for to return to a specific part of a document. Read on →

Despite setting up my own gitweb install I’m still not using git regularly enough to be comfortable with it, so today I went through the Peepcode Press Git Internals book/PDF. While the diagrams and details of what happens under the covers are useful it’s the wrong level for me as a basic user. To ease myself into the move from subversion for some of my personal projects I found Git Magic to be more useful. Read on →

Ad-hoc changes are a very bad thing in many ways; one of the worst is how often they are not fully implemented across all the servers, or even pulled back to staging. In an attempt to sanity check the config files when we have to make these little hacks I (oddly proudly) present rd-differ, a tool for diffing config files over multiple machines. The idea is simple: you tell it the file or directory you’re interested in, specify a single machine as the baseline and then specify a number of others as the machines to check against it. Read on →
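The core comparison is easy to sketch. Here local files stand in for the over-the-network fetch the real tool does; the host names and file contents are invented:

```shell
# Local stand-ins for "fetch the file from each host".
mkdir -p /tmp/rd-demo
printf 'Port 22\n' > /tmp/rd-demo/baseline
printf 'Port 22\n' > /tmp/rd-demo/web1
printf 'Port 2222\n' > /tmp/rd-demo/web2

# Compare every other host against the chosen baseline.
report=$(for host in web1 web2; do
    if diff -q /tmp/rd-demo/baseline "/tmp/rd-demo/$host" >/dev/null; then
        echo "$host: matches baseline"
    else
        echo "$host: DIFFERS from baseline"
    fi
done)
echo "$report"
```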


You know what the best way to start the day is? I’m pretty sure that it doesn’t include a production web server putting its file systems into read-only mode. When this happens most local commands don’t work - init, shutdown, telinit and reboot all stop being useful and you have to resort to desperate measures… and here’s the desperate measure of the day. First, check that your system supports the magic sysrq key:

$ cat /proc/sys/kernel/sysrq
1  # nonzero is good

Now you know you have the power to destroy your system through a single incorrect character, have a look at the Redhat Sysrq command reference (you want the ‘sysrq’ section). Read on →
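A sketch of that check, plus the trigger keys from the reference most relevant to a wedged box - commented out, because run as root the last one reboots the machine immediately:

```shell
# Nonzero means at least some sysrq functions are enabled.
sysrq=$(cat /proc/sys/kernel/sysrq 2>/dev/null || echo 0)
echo "sysrq value: $sysrq"

# As root, and only in an emergency, the trigger file works even when
# the usual commands are broken:
#   echo s > /proc/sysrq-trigger   # sync mounted filesystems
#   echo u > /proc/sysrq-trigger   # remount filesystems read-only
#   echo b > /proc/sysrq-trigger   # immediate reboot, no clean shutdown
```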

We’ve been hitting some load issues on one of our monitoring machines recently and while it looks like the munin graph generation is the culprit we also decided to keep an eye on how many services and hosts Nagios was checking. One of the downsides of having a very automated server deployment system is how easy it is to suddenly find yourself with an extra dozen hosts you no longer really need. Read on →

As part of my ongoing attempt to stop myself from silently making mistakes (I don’t so much mind the ones I notice) I’ve added another couple of Nagios Plugins. This time validate_feed and validate_html. As both of these checks call out to an external, third party resource, if you use them be sure to tweak your Nagios polling interval down to a respectful level.

While digging through a pile of syslog log files recently I needed something a little more data-format aware than pure grep. So I present the first version of syslogslicer - a simple perl script that knows a little bit about the syslog log file format.

# some example command lines
syslogslicer -p cron -f program,message /var/log/syslog
# print the program and message for all lines with program 'cron'

syslogslicer -p cron -m hourly /var/log/syslog
# all fields for all lines with program 'cron' and message 'hourly'

syslogslicer -p cron -m hourly -s 20080810100000 -e 20080810123000 /var/log/syslog
# all fields for all lines with program 'cron' and message 'hourly'
# between 20080810100000 and 20080810123000

syslogslicer allows you to filter the output by matching text in the program or log message, only print certain output fields and do basic time-based filtering. Read on →
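For comparison, the simplest invocation can be roughly approximated with awk - the sample log lines here are invented, and syslogslicer itself does considerably more:

```shell
# Two made-up syslog lines to slice.
cat > /tmp/sample.syslog <<'EOF'
Aug 10 10:00:00 web1 cron[1234]: (root) CMD (run-parts /etc/cron.hourly)
Aug 10 10:00:05 web1 sshd[2345]: Accepted publickey for deploy
EOF

# Keep only lines whose program field (field 5, minus pid/colon) is cron.
matches=$(awk '{
    prog = $5
    sub(/\[.*/, "", prog)   # strip a trailing [pid]:
    sub(/:$/, "", prog)     # or just the trailing colon
    if (prog == "cron") print
}' /tmp/sample.syslog)
echo "$matches"
```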

“This script retrieves a URL via a specified proxy server and alerts (using the standard Nagios conventions) if the request fails.” We’re running a couple of services through a proxy server for a number of good, and to be honest a couple of not so good but mandated, reasons. The Check Proxy Check Nagios Plugin ensures that if the proxy goes down in a way that stops us pulling pages through it we know.

If you mount filesystems under a specific mount point, and monitor them with Nagios, then be sure you understand what happens if the underlying file system goes away. With:

/usr/lib/nagios/plugins/check_disk -w 15% -c 10% -p /a_mount_point

you’ll get the value from the containing file system. In this case /. If you’d rather know that your chosen mount point has actually gone away, and that you’re no longer checking what you thought you were, then add the -E option to the command. Read on →
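You can see the underlying behaviour with plain df, which check_disk wraps - a path that is not itself a mount point quietly reports its containing filesystem:

```shell
# /tmp/not-a-mount is just a directory, not a mount point...
mkdir -p /tmp/not-a-mount

# ...so df reports whichever filesystem holds /tmp instead.
reported=$(df -P /tmp/not-a-mount | awk 'NR == 2 {print $6}')
echo "asked about /tmp/not-a-mount, df reported: $reported"
```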

We’ve had to deliberately disable some machines this week to ensure they can’t connect out to the internet - we’re building testing versions of some of our more restricted secure environments and this is one of the steps. It was actually easier to do with IPTables than I thought (mostly because I didn’t have to do it - my co-worker did) but once the work was done we needed to ensure the block didn’t accidentally get broken, quietly restoring their network access. Read on →

I’ve never really felt as proficient with apt and dpkg as I did with RPM. There always seems to be another option I’ve never seen before. Luckily there are also big holes in my knowledge of yum to make me feel well rounded. After reading yum options you may not know exist and spending a while puzzling out how to get the same results in Debian (apt-file seems to be the closest fit but I never got the invocation right) I decided to write dpkg-provides. Read on →

The title pretty much says it all, I’d like a command line version of YSlow! (what is it with Yahoo and !s) that I can run from cron and import into a nice spreadsheet for trending and site comparisons. I don’t have XUL on my list of things to play with so I’ll give it a couple of months and watch someone else implement it. Hopefully.


The current trend with config files is to fill them with comments (let’s ignore the fact this isn’t a substitute for documentation) and while this is helpful, watching people arrow through them line by line looking for active options drives me nuts. If you’re using vim (as all good people do ;)) you can jump from uncommented directive to uncommented directive with /^[^#] as a search. Pressing n will then move you to the next uncommented option. Read on →

Continuing the release of my Nagios code - here’s my Nagios Simple Trender. It parses Nagios logs and builds a horizontal barchart for host outages, service warnings and criticals. It’s nothing fancy (and the results are a little unpretty) but it does make the attention seeking services and hosts very easy to find. While the tool isn’t that technically complex I’ve found it useful in justifying my time on certain parts of the infrastructure. Read on →

We use the Nagios monitoring system at work (in fact we use four installs of it for physically isolated networks) and while it’s damn useful (and service checks are easy to create or extend) it’s a little lacking in higher level trending and visualisation tools. Well, at least the very old version we run suffers from this. Thankfully I work for a company that invests time in its core tools. Over the last couple of hackdays I’ve written two small scripts for parsing Nagios logfiles and presenting the information in a different, slightly more grouped way. Read on →

It’s not a well kept secret but I’m still surprised by how many people have never encountered .bash_logout. Its purpose is pretty simple: if you use the bash shell it’ll be executed when you log out (see, a well-named file!). So what’s it for? Well, I use mine to invalidate any sudo sessions I’ve got open (sudo -k), clear the screen - in case it’s a local session - and nuke a history file or two.
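Something along these lines - which history files you nuke is a personal choice, and the two below are only examples:

```
# A sample ~/.bash_logout
sudo -k                             # invalidate cached sudo credentials
clear                               # blank the screen for local sessions
rm -f ~/.lesshst ~/.mysql_history   # history files I'd rather not leave around
```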

Sometimes questions come up that you know you should know the answer to but you just don’t. My recent one was “how does df choose the output order?” The man page doesn’t mention the logic behind it and a quick strace shows it pulls its data from /proc/mounts (which you’d expect) and returns the output in the same order. So logically the question becomes how does /proc/mounts order things? It’s not exactly an important question but I can see how this ends - and it involves source code.

I originally wrote frdns to find and warn about inconsistencies in forward and reverse DNS records. At the time I was also using a tool called hawk to show both IPs that didn’t have a reverse record and reverse records that didn’t have a responding IP address associated with them (we had a lot of orphaned records). While hawk did the job it required a MySQL instance, a daemon process and an apache server to function - which was a PITA when it had to be moved to another server. Read on →

When it comes to command line options GNU ls already uses most of the alphabet, so for my own sanity can someone implement a -j that doesn’t change the behaviour much from a ls -alh? It’s my most common typo and I’m willing to offer beer to remove the problem. I could learn to type better but this is easier ;)

I’m not too keen on yesterday’s UGU tip of the day, and it doesn’t take much to make it work a chunk better, so I thought I’d whine about it on my blog. Here’s the original snippet:

grep -v "#" /etc/hosts | awk '{print $1}' | while read host
do
  ping -c 1 $host
done

But this has some very fixable caveats. It doesn’t deal with blank lines, it’ll try and ping IPv6 addresses (and too many distros put IPv6 entries in the host table these days - even if you disable the IPv6 options) and it will ignore any lines that have a comment, even if the comment is after the field we want. Read on →
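A sketch of the fixed-up version being argued for - strip comments wherever they appear on the line, skip blanks and skip IPv6 entries - run here against an invented sample file so it doesn’t depend on your /etc/hosts or a working network:

```shell
# A stand-in hosts file with the awkward cases.
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1   localhost
# a full-line comment

::1         localhost ip6-localhost
192.0.2.10  app1   # trailing comment
EOF

# Drop comments anywhere on the line, then blank lines, then IPv6.
targets=$(sed -e 's/#.*//' /tmp/hosts.sample \
    | awk 'NF && $1 !~ /:/ {print $1}')
echo "$targets"

for host in $targets; do
    ping -c 1 -W 1 "$host" >/dev/null 2>&1 \
        && echo "$host responded" || echo "$host did not respond"
done
```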

So now I’ve Announced PkgWatcher people are actually starting to use it, the optimistic curs! The first question’s already come in and it’s one I can actually answer: how do you extend it to work on other operating systems? It’s actually pretty easy, first you need to make an addition in installed_packages. This function works out which OS you’re running on and returns the respective subroutine that understands your package manager. Read on →

When it comes to servers, some packages should be everywhere, some should be banned and there are always the edge cases - be it a build host that requires GCC or a webserver that needs a full complement of packaged perl modules. While a decent system imaging or ad-hoc change system will help keep the discrepancies down nothing beats a system level check that verifies your assumptions. And PkgWatcher is that check. Read on →


You start off with a couple of partitions. You add a MySQL instance and put it on a new logical volume. You break its logging out to a different volume group for performance reasons. You take a snapshot for query tuning and mount that. You add a chunk of disk for a short experiment you were going to try… thanks to legacy, laziness and easy to use LUNs you eventually end up with more mount points than you know what to do with. Read on →

For my own use as much as anyone else’s… One of the problems that’s haunted me at least once per company I’ve worked at as a tech is “the disappearing partition”. It’s there, it’s accessible, and it should be persistent across boots. But it isn’t! The machine reboots and then you discover that the database partition is no longer visible. The check mounted disks Nagios plugin looks at the mounted partitions and compares them to what’s in /etc/fstab (minus a couple of things like CD drives, floppy disks, swap partitions etc). Read on →
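The heart of that comparison is small enough to sketch. This runs against an invented fstab sample rather than the real file, and the actual plugin’s filtering is more thorough:

```shell
# A made-up fstab: / should be mounted, /var/db almost certainly isn't.
cat > /tmp/fstab.sample <<'EOF'
# device      mountpoint  type  options
/dev/sda1     /           ext3  defaults
/dev/sda2     swap        swap  defaults
/dev/vg0/db   /var/db     ext3  defaults
EOF

# List fstab mount points (skipping comments and swap) that don't
# appear in the kernel's current mount table.
missing=$(awk '$1 !~ /^#/ && $3 != "swap" && NF {print $2}' /tmp/fstab.sample \
    | while read -r mp; do
        grep -qs " $mp " /proc/mounts || echo "$mp"
    done)
echo "not mounted: $missing"
```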

ps is an incredibly flexible command but it also has a checkered maintenance history in the Linux world. Yesterday I needed to output just the username, the command and any arguments passed to it. And it was hell. After reading through the man page a couple of times I settled on the following: ps -e -o user,args. But this doesn’t work. It shows the command and the full arguments but it truncates the username at 8 characters (which doesn’t help with things like exim on Debian - which has a username of Debian-exim). Read on →
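The answer is behind the link, but one well-known fix (possibly not the exact one the post settles on) is to give the user column an explicit width so names like Debian-exim survive intact:

```shell
# :20 widens the user column past the eight-character default.
wide=$(ps -e -o user:20,args | head -n 3)
echo "$wide"
```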

A machine should listen on a defined set of ports; if any of them are not listening you’ve got a problem. If any others are open then you’ve potentially got an even bigger problem. The Check Open Ports Nagios Check accepts a list of IPv4 TCP and UDP ports and reports if any of the expected ones go away or any others are detected as listening. This also partially scratches one of my own itches, I’ve had a couple of daemons (MySQL in particular) start after a package upgrade without my knowing it. Read on →

I’ve recently needed a way to see, via the Nagios web front end, which Debian machines need their packages updating. So I wrote the check_debian_updates.sh Nagios plugin. This is the initial release and it hasn’t been hit too hard yet - it seems to work in my small test environment - so feel free to have a look, but be careful about deploying it anywhere other than testing for now. Read on →

Tidy is a great little HTML lint tool that goes a lot further than the W3C Validator, but it requires you to remember to run it. The Firefox HTML Validator extension uses tidy and the Firefox status bar at the bottom of the screen to show you tidy output for the current page. This removes the need to run tidy by hand - you get it for free on every page you visit - though it does mean that, once you’re spoiled by its output, you’ll need to visit any page you want tidy run against. Read on →

Making a backup copy of a file is a pretty common thing to do (although you should be using RCS for a lot of these…). If you’re using a machine with a GUI then copying and pasting the file name twice, with an extension on the end, is pretty simple. If you’re either a keyboard jockey or without a mouse you can make your life easier with these two shortcuts: # make a copy of file. Read on →
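The shortcuts themselves are behind the link, but one classic pair that fits the description uses bash’s brace expansion - shown here via an explicit bash -c so it runs from any shell:

```shell
touch /tmp/app.conf

# Brace expansion turns one typed name into two arguments.
bash -c 'cp /tmp/app.conf{,.bak}'    # same as: cp /tmp/app.conf /tmp/app.conf.bak
bash -c 'mv /tmp/app.conf{,.orig}'   # same as: mv /tmp/app.conf /tmp/app.conf.orig

ls /tmp/app.conf.bak /tmp/app.conf.orig
```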

One of my guilty pleasures is reading through IRC quotes. I hate to think how much time I’ve spent reading my way through bash.org and qdb.us. While playing with Template::Extract today I found myself needing a simple, structured site to experiment with. And it resulted in the bash_quotes command line tool. The script is pretty simple, if you call it without an argument it gets the quotes from the “Latest” page. Read on →

The frdns.pl forward and reverse DNS checking script is one of those little mistake catchers that allow you to work with a safety net. In this case it checks that your deployed forward and reverse DNS records are present and correct; it checks the results from real DNS queries, not by zone file parsing. frdns.pl accepts a CIDR range and polls each IP for a reverse DNS record. If it gets one it’ll try to forward resolve the name and compare the two results. Read on →
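The core loop in miniature, pointed at localhost so it runs offline - a real run walks a CIDR range of routable addresses and uses proper DNS queries:

```shell
# Reverse-resolve an IP, forward-resolve the resulting name, compare.
ip=127.0.0.1
name=$(getent hosts "$ip" | awk '{print $2}')
forward=$(getent ahostsv4 "$name" | awk 'NR == 1 {print $1}')

if [ "$ip" = "$forward" ]; then
    echo "consistent: $ip <-> $name"
else
    echo "MISMATCH: $ip -> $name -> $forward"
fi
```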

I needed a command line tool to ping a number of CIDR network ranges, show me the status of each IP address and give me a return time for those that responded. I now have cidr_pinger.pl. It’s not as fast as an ‘nmap -sP blah/24’ but it does give me a return time. Although it only took ten minutes’ work with the ever-incredible CPAN I’m putting it up here. Read on →

Once you’ve been using a tool for a while you often reach a plateau where it’s “good enough” and you stop looking for ways to tweak it. I’ve been using bash for a number of years and I’d got set in my ways - until I sat next to a co-worker who uses zsh. My first Linux machine had a 14” monitor that could only do low resolutions. Screen space was at a premium and every character was precious. Read on →

If you’re a heavy bash user you’ll often find yourself writing short snippets of code on the command line. Typically they’ll be based around a main loop and you’ll end up entering them over multiple lines to keep them readable. Unfortunately when you try to reuse the command, by retrieving it from the bash command history, it’ll have been transformed into one semicolon-laden unreadable mass. Unless you read on… One of the options bash allows you to set is ‘lithist’. Read on →
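lithist needs cmdhist enabled too, and both normally live in ~/.bashrc. A quick check that the option actually takes effect:

```shell
# shopt with no -s/-u reports the option's current state.
state=$(bash -c 'shopt -s cmdhist lithist; shopt lithist')
echo "$state"
```

With both set, recalling a multi-line loop from history gives you back the newlines rather than a single semicolon-joined line.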

One of the lesser known features of bash is ‘$TMOUT’. When assigned a positive number this variable has two functions. When used in a script TMOUT is the timeout value for the ‘select’ command and the ‘read’ built-in. When used in an interactive shell $TMOUT is the number of seconds bash will wait (after outputting the prompt) before it terminates, typically killing the user’s session. This is often used to ensure that unattended root shells are not left logged in - they auto-close after a minute or two. Read on →
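The read side of that is easy to demonstrate non-interactively - here the read gives up after roughly a second instead of blocking on a pipe that never sends a complete line:

```shell
# sleep holds the pipe open with no data; TMOUT=1 makes read bail out
# after a second with a non-zero (>128) exit status.
out=$(sleep 3 | bash -c 'TMOUT=1; read -r line; echo "read exit status: $?"')
echo "$out"
```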

Google Labs is one of the ‘Net’s open secrets. It’s a site that gathers up some of Google’s ideas for new sites and services and allows people to have a play with them. One of the services, Google Sets, has been quite useful to me recently. So I wrote the GoogleSets Command Line Interface. The basic premise (of both the site and the script) is simple: you give it a list and it tries to expand it. Read on →

Source control is an essential part of a smart techie’s life. While the bigger version control systems are mostly useful to developers (SVK rocks) some of the simpler ones can often be found in the sysadmin’s toolkit. A couple of companies I’ve worked for have been heavy users of RCS on their servers and while it’s made configuration safer (and easier to revert) its lack of a central repository is often an unaddressed weakness. Read on →


As most of you ‘net savvy people know, the BBC has put a number of feeds online under the banner of BBC Backstage. While it’s nice to see organisations like the BBC offering this kind of data (and the front man, Ben Metcalfe, seems a nice and interesting guy) the initial release of one of the more interesting bits of data, the TV Anytime TV and radio data, only had a Java API available for using it. Read on →

I’ve made a couple of small changes to my Vim URL Shortener script. It now uses WWW::Shorten instead of WWW::MakeAShorterLink, its documentation has been tidied up a bit and the vim script now replaces all occurrences of the selected URL in a single sweep. It’s not a major upgrade so don’t rush to update.

Ever wanted a tag cloud of your Blosxom posts? With just this blosxom-tagcloud.pl script (and three Perl modules from CPAN) you can have one that integrates itself with your Blosxom footer and even allows easy merging of the tag cloud and any static text/HTML you’ve used in the past. I’ve uploaded the initial version of the code and I’ve put up a Blosxom TagCloud page with some more information.

I’ve been having a fiddle with the Geo::Google perl module today; the simple explanation is that the module performs geographical queries using Google Maps. And it works well. It’s just not very accurate… Geo::Google takes an address and returns a longitude and latitude from Google Maps. With these you can create points on your own GMap applications. After feeding it a dozen addresses with different levels of completeness (full postcode, partial postcode, city and town, just city etc.) I’m not that impressed with its accuracy when putting locations on the map. Read on →

If you add a NOPASSWD directive in your sudoers file then you can, as you’d expect from its name, use those commands without a password. This is a pretty useful trick that lets you set up sudo entries so commands can be run with different privileges from cron without requiring the setuid flag. However twice this week I’ve seen a similar question asked on mailing lists and I thought I’d stick this entry up, hope google indexes it and saves me from ever seeing it again. Read on →
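An illustrative sudoers.d entry (the user and command are invented for the example) - ‘backup’ can now run this one command as root from cron, with no password prompt and no setuid binary:

```
# /etc/sudoers.d/backup - edit with visudo, never directly
backup ALL = (root) NOPASSWD: /usr/local/bin/nightly-backup.sh
```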

For a small play project I needed the ability to pull down all the DVDs from a given person’s Amazon wishlist. After a quick look on CPAN two main options presented themselves. First up we have WWW::Amazon::Wishlist. The module has an easy to use interface, doesn’t require an Amazon developer token (it’s a naughty screen-scraper) and doesn’t need any XML modules. Unfortunately, while it has no problems getting books, I couldn’t get it to download any of the DVDs from the wishlist so I moved on. Read on →

The bash shell gets more negative press than it deserves from most “real” programmers. Between the “I can’t see what it’s doing, I need an echo after nearly every line!” and the “Why doesn’t it have a check option like perl’s -c!?” most people who only occasionally dip into bash end up frustrated by its lack of features. All because they can’t be bothered to read the man page… I’m going to show you three simple bash “tricks” that’ll make your script debugging a lot easier; and none of them require much searching to find. Read on →
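The tricks themselves are behind the link, but the two complaints quoted have well-known answers worth sketching - bash -n for the perl -c style check and bash -x for tracing without scattering echos everywhere:

```shell
# A trivial script to pick on.
cat > /tmp/demo.sh <<'EOF'
greeting="hello"
echo "$greeting world"
EOF

# perl -c equivalent: parse the script without executing it.
bash -n /tmp/demo.sh && echo "syntax ok"

# -x prints each expanded command as it runs - no echos needed.
trace=$(bash -x /tmp/demo.sh 2>&1)
echo "$trace"
```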

I get a lot of email, personal, mailing-lists and other, odder, sources (CVS commits for example) and the only mail client I’ve ever felt productive in is mutt. It’s a very simple, easy to use, client that hides a staggering amount of power behind a few key-presses; the fact it lets me use vim as my editor is also a killer feature. What makes mutt a joy to use is that every now and then I’ll stumble on to something new that I’ve never noticed before; today that was tab-completion when saving mail. Read on →

Most people know you can change the readline settings to either vi or emacs style key-bindings, but far fewer people know you can actually open the current, or a previous, command line in your editor of choice using ‘fc’. If you type ‘fc’ on the command line then the previous command will be opened in the defined editor; if you want to select a command further back you can use ‘fc pattern’. Read on →

While reading through Red Handed, a Ruby blog, I stumbled onto an entry about Akira Tanaka’s CVS repository. If you like Ruby then it’s well worth spending ten minutes having a look through his projects; while the code does what it’s supposed to, some of his little tools are real niche fillers - and project-name is an ideal example. When run with a single argument project-name goes away and queries a number of different sites: it checks the availability of domain names that consist of the query string and a number of different .tlds, it polls SourceForge, Savannah, the Ruby Application Archive and Freshmeat (but only checking the string against existing projects’ short names, not the full names!) and does a google count of the term you’re searching on. Read on →

I’ve added a short Perl script called Display Feed Last Modified Date to the miniprojects page. This short (and by no means complete) script looks through a SharpReader OPML file (which can be generated by using ‘Export’ on the file menu) and then tries to obtain and display a Last-Modified date for each feed in the file (this is gathered from the header of the same name). With a single run and five minutes of manual checking of feeds I’ve managed to find and remove 40 dead feeds from my subscription list.


It never fails to surprise me how I can use a program almost every day and yet still stumble onto previously undiscovered options. Yesterday I discovered the ‘--reference=file’ option while reading the manpage for chmod. When used, this option takes the current permissions of the specified file and applies them to the other files specified on the command line. It’s also accepted by chgrp and chown. Note: If you’re going to use this in production please consider the potential race condition.
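A quick demonstration (the file names are arbitrary):

```shell
touch /tmp/template /tmp/target
chmod 640 /tmp/template
chmod 600 /tmp/target

# Copy the permission bits from template to target.
chmod --reference=/tmp/template /tmp/target

perms=$(stat -c '%a' /tmp/target)
echo "target is now mode $perms"
```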

Occasionally you will pipe or cat a file to the screen, or a program will die, and the screen will begin to show gibberish whenever you type anything (I don’t mean the usual gibberish that most people type on a command line :)). If you use PuTTY then you will see the word ‘PuTTY’ appear constantly. The quick way around this is to type ‘reset’ and the screen will begin to work as expected again.

After you’ve been using a Unix (or logging into one via putty) for a while you’ll probably encounter a key combination that locks the term and leaves you unable to do anything. You’ll hunt around the keyboard pressing combinations until you sigh in despair and try Ctrl-C or Ctrl-D to kill the current command or the current shell respectively; and they won’t work. After some more key-bashing you’ll get lucky and the term will bomb out. Read on →

As part of my daily server housekeeping I keep an eye on the Apache error logs for each of the servers I’m responsible for. If it’s a quiet day I’ll grep through the attempted exploits, attacks and formmail scans for any useful error messages. While attempting to track some 404s back to the corresponding access-log entries I got bored of converting the error log’s date format into the default date format of the access log, so I wrote a small bit of shell that I (badly) named apacheerrordate.sh to do it for me. Read on →
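The heart of the conversion can be sketched with GNU date - the timestamp here is invented, and the real script deals with whole log lines rather than a lone date:

```shell
# An error-log style timestamp: [Day Mon DD HH:MM:SS YYYY]
err='[Sun Aug 10 10:00:00 2008]'

# Drop the brackets and the weekday, then re-emit in CLF style.
stripped=$(echo "$err" | sed -e 's/[][]//g' -e 's/^[A-Za-z]* //')
clf=$(date -d "$stripped" '+%d/%b/%Y:%H:%M:%S')
echo "$clf"
```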

I know this is old ground but it seems to come up a lot and annoy the arse off me. If you are going to log something then please ensure it has:

- A date and time…
- …that is easy to sort
- The name of the application that spawned the something you are logging
- The fully qualified name of the machine it is from

If you can’t produce at least those details then what use do you expect the logs to be when someone tries to debug using them?

While looking through the blogs of both the DTrace engineers at Sun I stumbled upon this little gem (taken from Adam Leventhal’s Weblog): “And speaking of perl, a lot of people asked about DTrace’s visibility into perl. Right now the only non-natively executed language DTrace lets you observe is Java, but now that we realize how much need there is for visibility into perl, we’re going to be working aggressively on making DTrace work well with perl. Read on →

I had a little rant on this subject a while ago about the practises of some companies when it comes to evalling software. After some more digging I found a solution I was happy to recommend for the task; webMethods Glue. I had some trouble using the generated WSDL with a Perl SOAP::Lite server but it was nothing ten minutes fiddling didn’t solve. While I’ve only looked at the java2wsdl converter that single component did exactly what I wanted.

I’m posting this for my own benefit as much as anyone else’s. Ispell has some support for HTML / XML documents; if invoked with ‘-h’ it will not spell-check certain parts of the document, as the rules below show:

This element name is misspelled: <elemment>element</elemment>
This attribute name is incorrect: <tag nme="Dean" />
The value of this attribute is wrong: <tag animal="Elepant" />

Of the three lines above none get kicked out as errors. Read on →

The watch command is one of those little gems that often gets overlooked and has its functionality duplicated by a custom tool; just slower and more complicated. At its most basic watch runs the specified command every two seconds until interrupted. A simple example that shows the current directory’s contents is given below; this will show any changes in either the size or timestamp of the contents.

watch ls -ahl

Watch excels in showing real-time differences: by supplying either ‘-d’ or the long option ‘--differences’ any changes will be marked on screen using inverse colours. Read on →