Sed Sickness -- Whitespace Reduction

Leafing through the live source code should be a pleasant, calming experience; instead it often becomes a game of cringe and seek. While digging through some custom bandwidth monitoring scripts I came across this gem.

cat /proc/net/dev | grep eth0 | sed -e 's/:/ /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g;'

Working left to right, we start with the useless use of cat. The grep command can take a file as an argument; it doesn’t need to read from standard input. That takes away one command and a | (pipe). We then move on to the bastard stepchild that is this abuse of sed. The person who wrote this is no longer available to beat^H^H^H^H^H ask for clarification, but after some head scratching it seems the author had never heard of quantifiers such as + and *.

Instead it takes every instance of two spaces and makes it one space, globally. A single pass only halves each run of spaces, so it then does it again and again until every run is reduced to a single space. This is a great example for a number of reasons: wasteful repetition of code, long ugly lines, and a lack of knowledge of the tool. Compare the above with the following, rewritten version.

grep eth0 /proc/net/dev | sed 's/ \+/ /g'

We’ve killed the cat and shrunk the sed. The + is a quantifier: it changes the behaviour of the element before it, in this case turning ‘match one space’ into ‘match one or more consecutive spaces’. This whole matched run is then substituted with a single space. The code is shorter, faster and easier to maintain. And it doesn’t make me lose another few (precious) hairs.
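For anyone who wants to poke at the difference, here is a small sketch. The sample line is made up in the style of /proc/net/dev and the variable names are mine, not from the original script. It also shows two spellings of the same idea that don't rely on \+, which is a GNU sed extension in basic regular expressions, plus a variant that keeps the original's colon-to-space step, which the short rewrite quietly drops.

```shell
# Made-up sample in the style of a /proc/net/dev line (values are illustrative only).
line='  eth0: 1234567    8901    0    0'

# One pass of "two spaces become one" only halves each run of spaces,
# which is why the original had to repeat it so many times.
one_pass=$(printf '%s\n' "$line" | sed 's/  / /g')

# Portable basic-regex form: one space followed by zero or more spaces.
posix=$(printf '%s\n' "$line" | sed 's/  */ /g')
# -> " eth0: 1234567 8901 0 0"

# Extended-regex form, where + needs no backslash.
ere=$(printf '%s\n' "$line" | sed -E 's/ +/ /g')

# Keep the original's colon-to-space substitution, then collapse the runs.
with_colon=$(printf '%s\n' "$line" | sed 's/:/ /g; s/  */ /g')
# -> " eth0 1234567 8901 0 0"

printf '%s\n' "$one_pass" "$posix" "$ere" "$with_colon"
```

Whether the colon step matters depends on what reads the output afterwards, so treat that last variant as an option rather than a correction.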