Wed, 15 May 2013
Facter 1.7+ and External facts
While Puppet may get all the glory, Facter,
the hard working information gathering library that can, seldom gets much
exciting new functionality. However, with the release of Facter 1.7,
Puppet Labs have standardised and included a couple of useful Facter
enhancements that make it easier than ever to add custom facts to your
puppet runs.
These two improvements come under the banner of 'External Facts'. The first
allows you to surface your own facts from a static file, either
plain text key value pairs or a specific YAML / JSON format. These static
files should be placed under /etc/facter/facts.d
$ sudo mkdir -p /etc/facter/facts.d
# note - the .txt file extension
$ echo 'external_fact=worked' | sudo tee /etc/facter/facts.d/external_test.txt
external_fact=worked
$ facter external_fact
worked
At its simplest this is a way to surface basic, static details from system provisioning and other similar large events, but it's also an easy way to include details from other daemons and cronjobs. One of my first use cases for this was to create 'last_backup_time' and 'last_backup_status' facts that are written at the conclusion of my backup cronjob. Having the values inserted from out of band is a much nicer prospect than writing a custom fact that parses the cron logs.
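As a rough illustration of the backup use case, the last step of a backup job could write its own facts file along these lines. This is only a sketch: the helper names and the backup.txt file name are made up for the example, and the demo writes to a scratch directory rather than /etc/facter/facts.d.

```ruby
require 'tmpdir'

# Render a hash of facts as the plain-text key=value format that
# Facter reads from .txt files in its facts.d directory.
def render_facts(facts)
  facts.map { |name, value| "#{name}=#{value}\n" }.join
end

# Write the rendered facts to the given path (hypothetical helper).
def write_facts(path, facts)
  File.write(path, render_facts(facts))
end

# Demo against a scratch directory; a real job would target facts.d.
Dir.mktmpdir do |dir|
  target = File.join(dir, 'backup.txt')
  write_facts(target,
              'last_backup_time'   => Time.now.to_i,
              'last_backup_status' => 'ok')
  puts File.read(target)
end
```

Once a file like this sits in the facts.d directory, each key should be queryable in the same way as external_fact above.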
If that's a little too static for you then the second usage might be what you're looking for. Any executable scripts dropped in the same directory that produce the same output formats as allowed above will be executed by facter when it's invoked.
# scripts must be executable!
$ sudo chmod a+rx /etc/facter/facts.d/process_count
$ cat /etc/facter/facts.d/process_count
#!/bin/bash
count=$(ps -efwww | wc -l | tr -s ' ')
echo "process_count=$count"
$ facter process_count
209
The ability to run scripts that provide facts and values makes customisation easier in situations where ruby isn't the best language for the job. It's also a nice way to reuse existing tools or to include information from further afield - such as the current binary log in use by MySQL or Postgres or the host's current state in the load balancer.
While there have been third party extensions that provided this functionality for a while it's great to see these enhancements get included in core facter.
Posted: 2013/05/15 20:46 | /tools/puppet
Sat, 27 Apr 2013
Deprecation Warnings From Puppet Resources
Over time parts of your puppet manifests will become unneeded. You might
move a cronjob or a user in to a package or no longer need a service to be
enabled after a given release. I've recently had this use case and had two
options - either rely on comments in the Puppet code and write an out of
band tool to scan the code base and present a report or add them to the
puppet resources themselves. I chose the latter.
Below you'll find a simple metaparameter (a parameter that works with any resource type) that adds this feature to puppet. As this is an early prototype I've hacked it directly in to my local puppet fork. Below you'll see a sample resource that declares a deprecation date and message, the code that implements it and a simple command line test you can run to confirm it works.
# sample puppet resource using :deprecation
file { '/etc/cron.d/remove_foos':
ensure => 'file',
source => 'puppet:///modules/foo/foo.cron',
deprecation => '20130425:Release 6 removes the need for the foo cronjob',
}
$ sudo vi puppet-3.1.1/lib/puppet/type.rb
newmetaparam(:deprecation) do
desc "
Add a deprecation warning to resources.
file { '/etc/foo':
content => 'Bar',
deprecation => '20130425:We no longer need the foo'
}
The deprecation comes in two parts, separated by a :
The date is in format YYYYMMDD and the message is a free form string.
"
munge do |deprecation|
# a limit of 2 keeps any colons inside the message intact
date, message = deprecation.split(':', 2)
# YYYY MM DD - one true timestamp
now = Time.now.strftime('%Y%m%d')
if (now >= date)
rsrc = "#{@resource.type.capitalize}[#{@resource.name}]"
Puppet.warning "#{rsrc} expired on #{date}: #{message}"
end
end
end
# command line test
$ puppet apply -e 'file { "/tmp/dep": content => "foo\n", deprecation =>
"20120425:We can remove this file after release 4" }'
Warning: File[/tmp/dep] expired on 20120425: We can remove this file after release 4
Notice: Finished catalog run in 0.06 seconds
Using the metaparameter is easy enough, just specify 'deprecation' as a property on a resource and provide a string that contains the date to start flagging the deprecation on (in YYYYMMDD format) and the message puppet should show. I don't currently fail the run on an expired resource but this is an option.
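The date handling can be tried outside of puppet with a few lines of plain ruby. This is a standalone sketch, not the code from type.rb: the method names are invented for the example, and it passes a limit of 2 to split so that a message containing colons survives intact.

```ruby
# Split a 'YYYYMMDD:message' string in to its two parts. The limit of 2
# stops split from breaking up a message that contains colons.
def parse_deprecation(value)
  date, message = value.split(':', 2)
  [date, message]
end

# Zero-padded YYYYMMDD strings sort in date order, so a plain string
# comparison is enough to decide whether the date has been reached.
def expired?(date, today = Time.now.strftime('%Y%m%d'))
  today >= date
end

date, message = parse_deprecation('20130425:Release 6: foo cronjob no longer needed')
puts "expired on #{date}: #{message}" if expired?(date, '20130427')
```

The string comparison only holds while every date is a full eight digit value; anything shorter would need parsing in to a real date first.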
There are some other aspects of this to consider - Richard Clamp raised the idea of having a native type that could indicate this for an entire class (I'd rather use a function, but only because they are much easier to write) and Trevor Vaughan suggested a Puppet face that could present a report of the expired, and soon to be expired, code.
I don't know how widely useful this is but it made a nice change to write some puppet code. The small size of the example will hopefully show how easy it is to extend nearly every part of puppet - including more 'complicated' aspects like metaparameters. Although not the relationship ones, those are horrible ;) I've submitted the idea to the upstream development list so we'll see what happens.
Posted: 2013/04/27 11:53 | /tools/puppet
Mon, 11 Feb 2013
Puppet Camp - Ghent 2013
It's been a while since I've attended a Puppet Camp but considering the
quality of the last one (organised by Patrick Debois) and the fact it was
being held in the lovely city of Ghent again I thought it'd be a wise
investment to scrape together the time off.
The quality of the talks seemed quite high and considering the number of newer users present the content level was well pitched. A couple of deeper talks for the more experienced members would have been nice but we mostly made our own in the open sessions. Facter, writing MCollective plugins, off-line and bulk catalogue compilation and the murky corners of our production puppets all came under discussion - in some cases quite fruitfully.
The wireless was a point of annoyance and amusement (depending on the person and the time of day). We had 20 users for an audience of ten times that - the attitudes covered the gamut from "I only need to check my mail once a day" to "I have my own tethering" and all the way to "This is my brute force script I run in a loop". You can tell when most of us lost our access based on the twitter hash tag.
I was a little surprised at the number of Puppet Camps there will be this year - 27 was the number mentioned. I think a lot of the more experienced members of the community value the camps and confs as a chance to catch up with each other and the PuppetLabs people and I'd hate to see us sticking to our own local camps and losing the cross pollination of ideas, plans and pains.
You can also view the Puppet Camp slides for a number of the sessions.
Posted: 2013/02/11 13:11 | /tools/puppet
Sun, 27 Jan 2013
Prettier Puppet with Pocco
Back in October Nan Liu announced
"pocco - a puppet manifest documentation experiment" as a way of
generating much nicer looking documentation for puppet classes (you can see
an example) and reducing the amount of boilerplate needed
to document your classes.
After some issues with the ruby libraries it depends on, I ran it over a couple of my smaller manifests and I have to say the output is very readable and quite presentable. If you write manifests for other people's use then this is well worth a look.
Posted: 2013/01/27 11:10 | /tools/puppet
Tue, 05 Jul 2011
Introduction To DSAC
A while ago @ripienaar and I had a chat
in a pub about monitoring, event systems and lots of related subjects. As
we all know he's way more productive than is fair and so while he's been
doing a BUNDLE of work on subjects like monitoring frameworks and event correlation
I've been doing some thinking (and no actual coding) about event
auditing, continuous compliance and security event management.
Now I've finished the $TIMESINK_PROJECT I'm soon going to actually need some of this stuff so I've started putting together a prototype framework that I'm calling DSAC - Dump Send and Correlate. The code is in a very early stage at the moment but is dealing with a small number of agents on a test network of a couple of hundred nodes. I'm going to start documenting the sections as it becomes ready for more public consumption but I thought I'd show my architectural plans for version 0.1.
The architecture is quite simple at the moment. Every node runs the "consumer and dispatch" stack which generates events, currently all events are made from cron invoked agents. A separate process, also cron invoked (for now) then runs through the spool and invokes all the dispatchers that have registered an interest in the output of that agent. Simple dispatcher examples are an AMQ pusher or a MySQL loader.
At the other end of the process, and quite symmetrically, we have the consumer stack. This reads from the nice big fuzzy cloud of transient data loss and spools files for later processing. We then have another process pick the files up and run them through a number of processors.
I've got working prototypes of a simple bulk archiver and some debugging aids but I can also envision some more useful real time dashboards. The last stage at the moment is the simple reports. I'm currently focusing on the easier reports that will help me show changes to an auditor (package updates, service status changes and user logins) but this step will hopefully expand to encompass a lot of our rote compliance needs.
Once I've tidied up the code (and picked up some more ruby!) I'll start putting the bits I work on in my spare time on github.
Posted: 2011/07/05 17:45 | /tools
Mon, 20 Jun 2011
Simple Puppet module grepper (prototype)
<tl;dr> Search for puppet resource values using puppet, not just
plain text</tl;dr>
One of the ideas that has been sitting on my todo list is having a command that lets me grep a puppet manifest for certain properties, values or even just resources in a smarter way than just running a raw grep over files. While a simple grep works in some cases it is annoyingly fragile when you're trying to ignore literal strings in resource types that you're not interested in or narrow your search down to resources that have a property that can also appear in other types.
# Show all file resources with a mode of 644
$ pm-grep -t file -p mode -v 644 files.pp
# Show all host resources with an alias of any value
$ pm-grep -t host -p host_aliases hosts.pp
# Check a number of pp files at once
$ find /etc/puppet/modules/ -name "*.pp" | xargs -n 1 pm-grep -t file -p mode
pm-grep (puppet manifest grep) isn't anywhere near finished but it does work on simple manifests. It doesn't yet handle corner cases, global parameter defaults and a number of other more advanced techniques but it does fulfil some of my needs and has given me some more to mull over for version 2.
Posted: 2011/06/20 23:36 | /tools/puppet
Thu, 16 Jun 2011
Smarter Service Status in Puppet
While most people know you can use puppet to
ensure a service is running
the mechanism it uses to determine if a service is actually running is often unexplored.
By default (at least up to Puppet 2.6) puppet assumes that a service doesn't supply a working status option and so will look up the service's name in the process table to check if it's running. If your service does support the status argument you can set 'hasstatus => true' and the platform's service provider will be used to interrogate the service's current status.
While most services only report a simple status of running or not running, once you've specified 'hasstatus => true' puppet will also consult a second property, status, if it's present, which is where things get a little more interesting and extendable.
# puppet manifest
service { "httpd":
ensure => "running",
hasstatus => true,
status => "/usr/local/bin/puppet-status-http-check",
}
# puppet-status-http-check - example check
#!/usr/bin/perl
use strict;
use warnings;
my @checks = (
"/usr/lib/nagios/plugins/check_procs -C httpd",
"/usr/lib/nagios/plugins/check_http -I 127.0.0.1",
"/usr/lib/nagios/plugins/check_http -I 127.0.0.1 -u /about",
"/usr/lib/nagios/plugins/check_http -I 127.0.0.1 -u / -s udlab",
);
for my $check ( @checks ) {
$check .= " > /dev/null 2>&1"; # suppress stdout and stderr
system( $check ) == 0 or exit 1;
}
# when running under debug you'll see a line like:
debug: Service[httpd](provider=redhat): Executing '/usr/local/bin/puppet-status-http-check'
By specifying our own command in the status property we can do more complex, and domain specific, status checks. For example we don't so much care that apache is running as that it's serving our chosen vhosts correctly. You can use any command as the right hand side of status and puppet will treat a return code of 0 as confirmation that the service is running and anything else as a failure; which will trigger an attempt to restart the service in our example.
One possibility is to tie this in to nrpe-runner with a carefully chosen command name pattern to reap all the benefits of your already defined nagios checks.
Posted: 2011/06/16 16:22 | /tools/puppet
Sat, 14 May 2011
Wrapping MCollective with Nagios
I've been doing a little tinkering with pre/post release checklists and
compliance reporting using cucumber and some Nagios wrapping (among
other things) in my test lab and recently needed to do some higher level
entire environment checks before moving on to the next step. While it's
possible to wrap something like nmap's ping check and then Nagios each
target it does feel like stepping back a few years in the tool
chain.
Luckily I'm running MCollective, so all this synchronous discovery and polling is in my past. After a little bit of delving in to the existing package and service clients I've come up with a prototype environment wide MCollective backed service check and an MCollective backed package check.
I'm not sure if I'd be willing to replace existing low level checks (for things like cron and ssh processes) with this just yet but it does show how easy it is to wrap MCollective with third party code in order to reap its benefits from further down the tool chain. With a little scaffolding hopefully it'll be useful in validating individual items in security policies and guidelines. But more about that later.
Phase two is probably to pull the scripts together (and just use another parameter to select the resource to check) and to be green or red based on percentage. As an example, requiring 40% of the web servers to be returning 200 before starting the next batch of host upgrades.
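The percentage idea boils down to a small piece of arithmetic. The sketch below is not taken from the existing checks: the method name, the host names and the status values are all hypothetical, and gathering the results from MCollective is left out.

```ruby
# Return true when at least `required` (a fraction between 0 and 1) of
# the hosts report the wanted status.
def enough_healthy?(statuses, wanted, required)
  return false if statuses.empty?
  healthy = statuses.values.count { |status| status == wanted }
  healthy.to_f / statuses.size >= required
end

# Hypothetical results: host name => HTTP status code.
web_servers = {
  'web01' => 200, 'web02' => 200, 'web03' => 503,
  'web04' => 500, 'web05' => 200
}

# Nagios-style exit codes: 0 is OK (green), 2 is CRITICAL (red).
code = enough_healthy?(web_servers, 200, 0.4) ? 0 : 2
puts "exit code would be #{code}"
```

Treating an empty result set as a failure is a deliberate choice here: if discovery finds no hosts at all, going green would hide the problem.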
Posted: 2011/05/14 16:55 | /tools
Tue, 22 Mar 2011
Listing Puppet Managed Files
Sometimes it's the little niggles that annoy people the most. As my team
progress in to puppet they have an annoying habit of asking very good
questions; which can sometimes be a struggle to answer. Today's best
question was - "How do I tell if this file is under puppet's
control?"
While there are a couple of different ways to check (grepping through your git checkout or modifying the file and running puppet were the immediate winners) the best way is probably to look inside the catalog and check against the title of the File resources it contains. While this gets you most of the way the problem is a little harder than it looks because of an edge case. If puppet is managing an entire directory then the files in that directory are not explicitly listed in the catalog.
So we need to look in two places, the catalog and state.yaml. Remembering the greps (and the line transformations needed) requires more mental space than I'm willing to invest so I've written puppet-ls to do all the work for me.
$ puppet-ls /etc/mcollective
/etc/mcollective/facts.yaml
/etc/mcollective/server.cfg
Run the command, specify the directory to check and any shown files are puppet managed. It's not a ground breaking script but it can help people migrating to puppet as they bring more of their systems under its control.
Posted: 2011/03/22 22:54 | /tools/puppet
Mon, 21 Mar 2011
Nagios Wrapped Puppet Runs
<tl;dr>Log nrpe-runner state changes when puppet runs to see what
broke or was fixed.</tl;dr>
While people most often use puppet to configure and repair their infrastructures sometimes they also inadvertently use it to damage and cripple them. As part of my attempt to reduce the mean time to spot a mistake across my systems I've come up with a handful of small scripts that let me wrap a puppet run in a Nagios NRPE powered safety net.
Two of the lesser known features introduced in Puppet 0.25.4 (and still valid in 2.6) were the prerun_command and postrun_command hooks. These two config settings allow you to specify a command to run at the beginning (which can stop the puppet run from happening) and at the end of a puppet run. While they were originally devised to make integration with etckeeper simpler we can also use them to add some additional monitoring to our runs.
We've already covered my nrpe-runner, which lets you run Nagios checks locally for immediate feedback, but now let's expand the idea a little for puppet integration. Our plan is simple: invoke nrpe-runner and gather the output, run puppet, re-run the nrpe-runner and see which checks puppet has fixed or broken.
First of all we deploy nrpe-runner, our nrperunner json differ and the (below) wrapper script we use for when puppet's finished running.
$ cat nrpe-wrapper
#!/bin/bash
/home/deanw/puppet-wrapper/nrpe-runner -j > /tmp/post_puppetrun
logger -t "puppet-nrpe" `/home/deanw/puppet-wrapper/nrperunner-json-differ /tmp/pre_puppetrun /tmp/post_puppetrun`
We then add the config to puppet.conf's main section. While it's possible to insert longer lines for each command and skip the wrapper script, puppet is a little fiddly about these settings and a separate script is easier to use.
$ cat /etc/puppet/puppet.conf
[main]
... snip ...
prerun_command = /home/deanw/puppet-wrapper/nrpe-runner -j > /tmp/pre_puppetrun
postrun_command = /home/deanw/puppet-wrapper/nrpe-wrapper
Now we've done all the prep (and if needed restarted puppet) let's break something and see if we get both a fix and confirmation:
# stop something we know puppet will fix.
$ /etc/init.d/mcollective stop
$ puppetd -vt
info: Retrieving plugin
.. snip ...
notice: //mcollective::server/Service[mcollective]/ensure: ensure changed 'stopped' to 'running'
notice: Finished catalog run in 5.51 seconds
# see if we logged the fix... we did!
$ tail -n 1 /var/log/messages
Mar 21 22:07:21 lb03-dynm puppet-nrpe: mcollective_procs changed from 2 to 0
While our simple wrapper just sends the output directly to syslog hopefully you've got an idea how powerful this integrated immediate feedback can be. While it's always been possible for us to dig back through the logs and spot something breaking after a puppet run, by explicitly wrapping the run we can cut down the investigation time while also providing information for later review and discussion.
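The comparison step itself is small. The sketch below is not the actual nrperunner-json-differ; it assumes each snapshot is a flat JSON object mapping check name to Nagios exit status, which may not match the real nrpe-runner -j output, and the method name is invented for the example.

```ruby
require 'json'

# Compare two snapshots of check results and describe each check whose
# status changed. ASSUMPTION: a snapshot is a flat JSON object of
# check name => exit status; the real nrpe-runner output may differ.
def status_changes(before_json, after_json)
  before = JSON.parse(before_json)
  after  = JSON.parse(after_json)

  (before.keys | after.keys).sort.each_with_object([]) do |name, changes|
    old_status = before.fetch(name, 'missing')
    new_status = after.fetch(name, 'missing')
    next if old_status == new_status
    changes << "#{name} changed from #{old_status} to #{new_status}"
  end
end

pre  = '{"mcollective_procs": 2, "ntp_process": 0}'
post = '{"mcollective_procs": 0, "ntp_process": 0}'
puts status_changes(pre, post)
```

Checks that appear in only one snapshot are reported as 'missing' on the other side, so a check that puppet adds or removes during the run still shows up.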
Posted: 2011/03/21 22:56 | /tools/puppet
Thu, 17 Mar 2011
Puppet Cucumber Providers
At work we try, and sometimes even succeed, in using Test Driven
Deployment so as one of my background projects I've been wrapping certain tools in to
cucumber friendly forms. Over the last couple of days I've been grabbing
ten minutes here and there to incorporate Puppet 2.6 in to the pile.
Feature: Puppetwrappers
Puppet Provider Examples
Scenario: Confirming package installation
When a machine has been puppeted
Then the bash package should be installed
Scenario: Confirm doodoodoo package is absent
When a machine has been puppeted
Then the doodoodoo package should not be installed
Scenario: Confirm cron service is running
When a machine has been puppeted
Then the cron service should be running
Scenario: Confirm tomcat6 service is not running
When a machine has been puppeted
Then the tomcat6 service should not be running
Scenario: Confirm dwilson is in libvirtd group
When a machine has been puppeted
Then dwilson should be a member of libvirtd
Scenario: Confirm dwilson has a uid of 1000
When a machine has been puppeted
Then dwilson should have a uid of 1000
Scenario: Confirm dwilson has a given shell
When a machine has been puppeted
Then dwilson should have the /bin/bash shell
I really like using the puppet providers for this because of the abstraction benefits they provide. I can write steps to test packages, services or aspects of a user and not have to worry if a developer runs it on Fedora or Debian.
This is only a first draft, and the cucumber wording needs changing, but I thought I'd put it online to show how expressive cucumber can be for system tasks and how easy, and concise, it is to reuse the puppet providers. You can grab the puppet step code and the Puppet providers features to drop in to your own test harnesses and have a play with. The implementation is pretty simple, for example the code below is everything you need for the service scenarios:
Then /^the (.+) service should be running$/ do | service |
service_status = Puppet::Type.type(:service).new(:name => service, :hasstatus => true).provider.status
service_status.should == :running
end
Then /^the (.+) service should not be running$/ do | service |
service_status = Puppet::Type.type(:service).new(:name =>service).provider.status
service_status.should == :stopped
end
It's worth mentioning that all the above will only work in 2.6 and above as the internal details returned by the providers are different to those in 0.25.
Posted: 2011/03/17 19:16 | /tools/puppet
Mon, 14 Mar 2011
Find Unpuppeted SSH Keys
It all started with one of those annoying little items on the todo list
- find all the unpuppeted ssh authorized_keys files on a machine and
alert on them. On first impressions it was going to be quite manual
(always a bad sign), involve digging in to legacy installs and would be
something we'd need to re-verify occasionally. It couldn't be that bad
though could it? After all how many places can an unmanaged-by-puppet
sshkey live?
Essentially the task can be broken in to three main parts. The first, quite easy part, is to grab a list of all the users (hello /etc/passwd) and look for known key file names in their home directories. The second part, which was a little harder, is to build a list of all the authorized_keys files that puppet knows it's managing for this host. Lastly, once you have the two collections, find the differences. Instead of doing static analysis on the puppetmaster's classes and modules we're going to focus on how to do it using the compiled desired state of what the local machine should look like, according to the puppet catalog.
The catalog (which lives at /var/lib/puppet/client_yaml/catalog/$fqdn.yaml in modern puppet) is a yaml-based representation of what puppet knows about how the local system should be configured. It contains details of all the resources to be managed on the local machine and their desired end state; which makes it perfect for our needs. I'm not going to go into the catalog in depth in this post but hopefully this little example will whet your appetite and spark some ideas.
Our example, the audit-sshkey-files nagios check, was actually quite easy to write (after some digging in to puppet and borrowing some code from Puppet Catalog Diff by R.I.Pienaar) and should hopefully show how much you can gain from using the meta-data puppet provides.
While most of the audit-sshkey-files script is boilerplate the most important snippet is below:
if target.type == "File" and target.title.include? "/authorized_keys"
@puppet_keys.push target.title
return target.title
end
All we're doing is building a list of any resources that are of type file and include the string "/authorized_keys" in their name (resource title in puppet terms). While this may not seem like much it's potentially game changing, any resources or relationships that you've modelled in puppet can be later mined to add context to your other tools. You can (as we have here) audit security related files or find user ids puppet doesn't know about and so might be inconsistent over systems. By using the catalog and the relationships and meta-data it provides you can make much more of your investment in deploying systems with puppet, and hopefully this little example presents an easy way to get started.
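The other two parts of the task, finding candidate key files and taking the difference, can be sketched in plain ruby. This is not the audit-sshkey-files script: the method names and sample data are invented, only the two common key file names are checked, and the list of puppet managed files is passed in rather than read from the catalog.

```ruby
# Key file names to look for under each home directory. ASSUMPTION: only
# the two common defaults; sshd_config may name others.
KEY_FILES = ['.ssh/authorized_keys', '.ssh/authorized_keys2'].freeze

# Pull the home directory (sixth field) out of each passwd-format line.
def home_directories(passwd_text)
  passwd_text.each_line.map { |line| line.chomp.split(':')[5] }.compact.uniq
end

# Every path where a user could have a key file.
def candidate_key_files(homes)
  homes.flat_map { |home| KEY_FILES.map { |name| File.join(home, name) } }
end

# Key files present on disk that puppet does not list as managed. The
# existence check is injectable so the sketch can run without real files.
def unmanaged_keys(candidates, managed, exists = File.method(:exist?))
  candidates.select { |path| exists.call(path) } - managed
end

passwd  = "root:x:0:0:root:/root:/bin/bash\n" \
          "dwilson:x:1000:1000::/home/dwilson:/bin/bash\n"
on_disk = ['/root/.ssh/authorized_keys', '/home/dwilson/.ssh/authorized_keys']
managed = ['/home/dwilson/.ssh/authorized_keys']

found = unmanaged_keys(candidate_key_files(home_directories(passwd)),
                       managed, ->(path) { on_disk.include?(path) })
puts found
```

With the sample data only root's key file is reported, as it exists on disk but is missing from the managed list.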
Now I've gushed about what the puppet catalog can do for you there are two caveats, firstly about my example. It isn't a complete solution; for example, it doesn't look for other allowed "authorized_keys" filenames that are defined in the sshd_config file. But it does 80% of what I needed in our environment and by managing the sshd_config file in puppet (as you should be) it's easy for me to double check I'm looking for the correct files. Secondly about the Puppet catalog itself. Harnessing its contents doesn't exactly have a shallow learning curve and documentation is a little thin on the ground. The original author of puppet, Luke Kanies, is working on some alternative ways of accessing this kind of information (such as via his Puppet Interfaces project) and as more people build their puppet deployments you can expect to see more and more harnessing of this additional structure.
Posted: 2011/03/14 23:30 | /tools/puppet
Reusing Puppets Package providers
One of puppet's more under-appreciated features is its ability to abstract
and smooth the edges of certain operating system tasks and behaviours.
Even something as trivial as installing a package can actually become a
portability nightmare once you consider the number of different systems
in the wild - rpm, yum, dpkg, pkgsrc etc. - and the varied commands
needed to use them. You end up either hard coding commands, and sacrificing
portability, or writing your own detection, lookup and invocation
logic.
That sounds like dull scut work, so how does puppet deal with this? And how can we reuse this work to simplify our own code? In slightly simplified terms, Puppet has a package type, which is backed by a number of providers. Each of these providers actually implements the required functionality for a given package manager and contains all the code we need. So how do we harness this existing work? Quite easily. Luckily for us, puppet's providers are written in ruby and are simple to call in our own scripts:
# show package version
$ irb
irb(main):001:0> require 'puppet'
=> true
irb(main):002:0> Puppet::Type.type(:package).new(:name => "bash").provider.properties
=> { :provider=>:yum, :ensure=>"4.1.7-3.fc14", :release=>"3.fc14",
:arch=>"i686", :epoch=>"0", :name=>"bash", :version=>"4.1.7" }
# do the same thing with an explicitly specified provider.
irb(main):003:0> Puppet::Type.type(:package).new(:name => "bash", :provider => "rpm").provider.properties
=> { :provider=>:rpm, :ensure=>"4.1.7-3.fc14", :release=>"3.fc14",
:arch=>"i686", :epoch=>"0", :name=>"bash", :version=>"4.1.7" }
While that snippet will hopefully whet your appetite if you need a more worked example I've put a small Puppet Package Provider wrapper up on github. The script will enable you to do the basic install, update and delete without knowing or caring what the underlying package manager is. Hopefully these little code snippets will help you stop thinking of puppet as "just" a tool and show how parts of its code base can be used as a framework to improve other parts of your tool chain.
As an aside it's also worth mentioning that you can globally Change the Package provider in puppet if you're not happy with its auto-detection.
Posted: 2011/03/14 00:18 | /tools/puppet
Wed, 09 Mar 2011
Introducing NRPE Runner
It might be a sign that I spend too much time online but the quicker a
system gives me feedback the more useful I find it. While I love knowing
my Nagios safety net has me covered when making changes sometimes waiting
for that cgi to refresh can take too long, especially if I'm taking an
iterative / test driven approach to the changes I'm making. For those
use cases I wrote nrpe-runner.
The way I typically use Nagios is to have the Nagios server run the
checks on the remote host via the NRPE plugin. The checks to be run on the host are
normally stored in a config file with each entry looking like this:
command[local_mail]=/usr/local/libexec/nagios/check_local_mail
While this allows you to run each check to confirm that it's still OK I
wanted the ability to run all the commands in the file at once, which I can
now do with nrpe-runner. If everything's fine then
it exits silently; to confirm that it's actually run I can summarise and
even filter the checks to run:
# show everything as it's run whatever the return status
$ /usr/local/sbin/nrpe-runner -a
check_swap => SWAP OK - 100% free (16041 MB out of 16041 MB) |swap=16041MB;12031;9624;0;16041
... snipped ...
freemem => OK: 12% (1732M) free memory.
# show a summary
$ /usr/local/sbin/nrpe-runner -s
Ran 39 checks - OK 39. WARN 0, CRIT 0, UNKNOWN 0
# run any checks with ntp in the name (the part between [])
$ /usr/local/sbin/nrpe-runner -s -n ntp
Ran 3 checks - OK 3. WARN 0, CRIT 0, UNKNOWN 0
# run all process checks (checks the command after the '=')
$ /usr/local/sbin/nrpe-runner -s -c proc
Ran 17 checks - OK 17. WARN 0, CRIT 0, UNKNOWN 0
# show all checks named ntp
$ /usr/local/sbin/nrpe-runner -a -n ntp
ntp_skew_primary => NTP OK: Offset -0.003149271011 secs|offset=-0.003149s;5.000000;9.000000;
ntp_process => PROCS OK: 1 process with command name 'ntpd', args '-u ntp:ntp'
ntp_skew_secondary => NTP OK: Offset -0.002887368202 secs|offset=-0.002887s;5.000000;9.000000;
nrpe-runner also has the option to dump the results as json, which I'll be exploring a little further in my next couple of blog posts. While it's not exactly the same as having the checks run by nagios (the user and environment are often different) I've found that shortening the interval between running puppet or yum and seeing the nagios feedback has helped my work-flow quite a lot when making exploratory system changes - and even more when nothing should have changed but does...
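To give a feel for the moving parts, here's a sketch of how the command entries could be parsed, filtered by name and summarised. It is not the nrpe-runner source: the method names are invented, it doesn't run any checks, and the summary line simply mimics the output shown above.

```ruby
# Matches lines such as:
#   command[local_mail]=/usr/local/libexec/nagios/check_local_mail
COMMAND_LINE = /\Acommand\[([^\]]+)\]=(.+)\z/

# Build a hash of check name => command from NRPE config text, skipping
# comments, blank lines and any other settings.
def parse_checks(config_text)
  config_text.each_line.each_with_object({}) do |line, checks|
    match = COMMAND_LINE.match(line.strip)
    checks[match[1]] = match[2] if match
  end
end

# Keep the checks whose name (the part between []) contains the pattern.
def filter_by_name(checks, pattern)
  checks.select { |name, _command| name.include?(pattern) }
end

# Summarise a list of Nagios exit codes (0 OK, 1 WARN, 2 CRIT, 3 UNKNOWN).
def summarise(exit_codes)
  counts = Hash.new(0)
  exit_codes.each { |code| counts[code] += 1 }
  "Ran #{exit_codes.size} checks - OK #{counts[0]}. " \
    "WARN #{counts[1]}, CRIT #{counts[2]}, UNKNOWN #{counts[3]}"
end

config = <<~CONF
  # sample NRPE config
  command[local_mail]=/usr/local/libexec/nagios/check_local_mail
  command[ntp_process]=/usr/lib/nagios/plugins/check_procs -C ntpd
  allowed_hosts=127.0.0.1
CONF

puts filter_by_name(parse_checks(config), 'ntp').keys
puts summarise([0, 0, 2])
```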
Posted: 2011/03/09 19:05 | /tools/commandline
Tue, 04 Jan 2011
Puppet CookBook is live
Between Xmas and New Year I had some spare time to invest in a side
project I've been looking forward to working on for quite a while. I'm
pleased to announce the opening of the Puppet
CookBook.
I've introduced Puppet to quite a few companies, sysadmins and development teams over the years and a lot of the same issues, concepts and needs repeatedly crop up. By explaining how puppet works in terms of tasks and desired outcomes rather than in raw feature descriptions I hope to show some of its power and flexibility in easy to use examples in a different way to most of the existing documentation.
The site isn't exactly brimming over with content yet (and it's pretty ugly) but I'm adding a handful of posts each week and hope to cover some more advanced topics over the next couple of months. You can follow the Puppet CookBook Twitter account for update announcements or to send feedback or suggestions for future topics.
Posted: 2011/01/04 22:59 | /tools/puppet | Permanent link to this entry | This entry and same date
Sat, 11 Dec 2010
Clarifying With Facter
While adopting a configuration management tool like Chef or Puppet will
have a large, nearly immediate effect on your work flow, even after using the
tools for a while you'll still get a little smile at all the little niceties
you continuously discover.
One small win we had recently was bringing some Apache config files under Puppet control. When we started we had the following block of config:
RewriteCond %{REMOTE_ADDR} !10.23.143.33
RewriteCond %{REMOTE_ADDR} !10.23.143.2
RewriteCond %{REMOTE_ADDR} !10.23.143.3
It's not hard to read and roughly understand what it does, but you have no real context; magic numbers keep things terse but are rarely the most helpful when in the land of a strange system. After putting the configs into a module and abstracting them a little into a template we have the much nicer:
RewriteCond %{REMOTE_ADDR} !<%= primary_loadbalancer %>
RewriteCond %{REMOTE_ADDR} !<%= secondary_loadbalancer %>
RewriteCond %{REMOTE_ADDR} !<%= ipaddress_eth0_mgmt %>
As part of the tidy up we also renamed some of the (remarkably large number of) Ethernet interfaces to describe what they were for, rather than leaving them as eth12:34.
Posted: 2010/12/11 21:35 | /tools/puppet | Permanent link to this entry | This entry and same date
Mon, 15 Nov 2010
MCollective Plugin - FileMD5er
I've been watching the Marionette Collective for a
while, and even gave it a small trial in a couple of testing
environments, but this weekend was the first time I've experimented
with it at a slightly larger scale (just over a hundred small VM nodes -
you have to love EC2) and I'm still impressed.
I can see how it's going to make parts of my work flow easier, and in an
attempt to learn a little more about how the plugin system works under the
hood I decided to write a small agent, FileMD5er.
The agent itself is very simple and addresses a small annoyance I've
scripted around for a while. When you're bringing files under Puppet (or
Chef) management you need to dig through the hosts and locate any files
that differ from the most common ad hoc version. With a quick
mc-filemd5er /path/to/file I can easily spot any machines
that have a slightly different version of the file, and then fold them
in to centralised management.
Writing the plugin itself was quite easy. The two problems I encountered were finding the right generation of existing plugin to crib from (some of the official MCollective Plugins are of a newer format than others) and not giving the class and the .rb file the same name, which caused it to only half work.
I'll be putting more of my MCollective Plugins on Github as they become a little more generic and hopefully useful to someone else.
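The "spot the odd ones out" step is easy to sketch on its own. The snippet below is not the agent and does not parse its output; it assumes you already have one `hostname checksum` pair per line from whatever source, and the function name `odd_hosts_out` is made up for the example.

```bash
#!/bin/bash
# Sketch only: read "hostname checksum" pairs on stdin (an assumed
# format) and print the hosts whose checksum is not the most common one.

odd_hosts_out() {
  awk '
    { sum[$1] = $2; count[$2]++ }
    END {
      # find the most common checksum (ties are broken arbitrarily)
      for (s in count) if (count[s] > best) { best = count[s]; common = s }
      # report every host that differs from it
      for (h in sum) if (sum[h] != common) print h, sum[h]
    }
  ' | sort
}
```

Feeding it four lines where web1, web2 and web4 share a checksum and web3 does not would print only the web3 line.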
Posted: 2010/11/15 23:26 | /tools/puppet | Permanent link to this entry | This entry and same date
Mon, 23 Aug 2010
Adventures in Cronologger
Cronjobs are one of those necessary evils of any decent sized Unix setup.
They often provide essential pieces of a site's data flows but are often
treated as second class citizens. While I've already mentioned my Cron
commandments I'm always looking for improvements in the
rest of my cron tool set and, with Vladimir Vuksan's cronologger, I may have
found another piece of the puzzle.
The concept is simple: you add a command to the front of your crontab entries and it invokes your actual cron command. This wrapper script collects the stdout, stderr and some other details such as exit code and run time. The backend is a CouchDB data store and the simple reporting pages are written in PHP; they are easy to work through, crib from and base your own reports on. Having all this cron information also helps provide a talking point with development. It's easy to show progress and imbue a sense of actually getting somewhere when the number of cronjobs with errors drops each day, rather than the systems team mentioning that their email boxes are a little emptier since the last release.
While our initial tests seem positive there are a couple of reports and tweaks to the command line data injector that we want for our local usage. The biggest problem with the project may well be that the idea is so obviously correct that we end up re-implementing it in something a little more suitable for our environment. Maybe a Python command line client and Perl Template Toolkit driven reports to replace the PHP. But that's a possibility for later - for now cronologger is a great 80% solver.
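To make the wrapper idea concrete, here is a minimal sketch in bash. It is not cronologger's code: it appends to a local log file instead of posting to CouchDB, and the function name `cron_wrap` and the log file name are invented for the example.

```bash
#!/bin/bash
# Sketch only: run a command, capture stdout, stderr, exit code and
# run time, and append them to a log file. The log directory must
# already exist.

cron_wrap() {
  local logdir=$1; shift
  local out err start end status
  out=$(mktemp) || return 1
  err=$(mktemp) || return 1
  start=$(date +%s)

  # run the real cron command, keeping the two output streams apart
  "$@" >"$out" 2>"$err"
  status=$?

  end=$(date +%s)
  {
    echo "command: $*"
    echo "exit_code: $status"
    echo "run_time_seconds: $((end - start))"
    echo "--- stdout ---"
    cat "$out"
    echo "--- stderr ---"
    cat "$err"
  } >> "$logdir/cron-wrapper.log"

  rm -f "$out" "$err"
  # pass the original exit code back to the caller
  return "$status"
}
```

Returning the wrapped command's exit code means anything that already checks the job's status keeps working with the wrapper in front of it.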
Posted: 2010/08/23 21:49 | /tools | Permanent link to this entry | This entry and same date
Sun, 06 Jun 2010
Netbeans vs Commandline
The last time we interviewed for Java developers (a couple of jobs
ago) it came as quite a surprise how few of them could function
without their IDE of choice. A high percentage of the candidates
struggled to compile using javac, had problems navigating the docs and
made a large number of very simple syntax errors that they were obviously
used to their editor dealing with.
At the time the more Unix focused team, most of whom were very long term vim and emacs users, had a number of discussions about how this should impact our rating of the candidates. One school of thought was that people should use the tools that make them most productive. The other was that people should understand their tool chain. How can you diagnose issues on a production server if you can't even compile a class on the command line? You can tell which side I was on.
I've recently joined a small Java project and after some awkward fiddling around with ant, junit and half a dozen other jars decided to give Netbeans a chance. I was pleasantly surprised at how quickly and easily I got the same project up and running in the IDE. I don't yet have a clue how it stores the files on disk, how it constructs the build or test targets, or about a dozen other little details, but at this stage in my basic use of Java it doesn't seem to matter.
It's strange how quickly seductive all the optional extras can be and how easy it is to lose track of what you don't know while adapting to the features they offer. I'm not sure how much of it is better tooling, benefits of a strongly typed static language or just having a dedicated team behind producing a consistent development environment but it felt very easy to take baby steps with. And I'm hoping the tool continues to show me more power as my needs when using it grow.
While I'm at no risk of giving up vim for my day to day work I think I'll be investing some time in learning one of the big three Java editors (Eclipse, Netbeans or IntelliJ) while I'm away in this strange world.
Posted: 2010/06/06 12:11 | /tools | Permanent link to this entry | This entry and same date
Wed, 07 Apr 2010
Pigz - Shortening backup times with parallel gzip
While searching for a completely different piece of software I stumbled
on to the pigz application, a
parallel implementation of gzip for modern multi-processor, multi-core
machines. As some of our backups have a gzip step to conserve
some space I decided to see if pigz could be useful in speeding them up.
Using remarkably unscientific means (I just wanted to know if it's worth further investigation) I ran a couple of sample compression runs. The machine is a quad core Dell server, the files are three copies of the same 899M SQL dump and the machine is lightly loaded (and mostly in disk IO).
#######################################
# Timings for two normal gzip runs
dwilson@pigztester:~/pgzip/pigz-2.1.6$ time gzip 1 2 3
real 2m43.429s
user 2m39.446s
sys 0m3.988s
real 2m43.403s
user 2m39.582s
sys 0m3.808s
#######################################
# Timings for three pigz runs
dwilson@pigztester:~/pgzip/pigz-2.1.6$ time ./pigz 1 2 3
real 0m46.504s
user 2m56.015s
sys 0m4.116s
real 0m46.976s
user 2m55.983s
sys 0m4.292s
real 0m47.402s
user 2m55.695s
sys 0m4.256s
Quite an impressive speed up considering all I did was run a slightly different command. The post compression sizes are pretty much the same (258M when compressed by gzip and 257M with pigz) and you can gunzip a pigz'd file, and get back a file with the same md5sum.
# before compression
-rw-r--r-- 1 dwilson dwilson 899M 2010-04-06 22:12 1
# post gzip compress
-rw-r--r-- 1 dwilson dwilson 258M 2010-04-06 22:12 1.gz
# post pigz compress
-rw-r--r-- 1 dwilson dwilson 257M 2010-04-06 22:12 1.gzs
I'll need to do some more testing, and compare the system's performance to a normal run while the compression is happening, before I trust it in production. The speed ups look appealing though and, as it's Mark Adler code, it looks like it might be an easy win in some of our scripts.
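For the "can gunzip read it back" part of that testing, a small round trip check is enough. The sketch below is an assumption about how you might script it, not something from the runs above: the function name `roundtrip_ok` is made up, and it relies on the compressor replacing its input with a `.gz` file the way gzip does by default.

```bash
#!/bin/bash
# Sketch only: compress a copy of a file with the given compressor,
# gunzip it again and confirm the md5sum matches the original.

roundtrip_ok() {
  local compressor=$1 file=$2
  local workdir before after status=1
  workdir=$(mktemp -d) || return 1

  before=$(md5sum < "$file")

  # work on a copy so the original file is never touched;
  # the compressor is expected to leave sample.gz behind
  if cp "$file" "$workdir/sample" &&
     "$compressor" "$workdir/sample" &&
     gunzip "$workdir/sample.gz"; then
    after=$(md5sum < "$workdir/sample")
    [ "$before" = "$after" ] && status=0
  fi

  rm -rf "$workdir"
  return "$status"
}
```

Running `time roundtrip_ok gzip dump.sql` and then `time roundtrip_ok pigz dump.sql` (with `dump.sql` standing in for a real file) gives a crude side by side comparison, although the timings include the copy and the gunzip.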
Posted: 2010/04/07 08:00 | /tools/commandline | Permanent link to this entry | This entry and same date

