The del.icio.us de.dup.er

I like del.icio.us and I’ve been using it for a long while now, but what used to be one of the more handy features, the ability to subscribe to a tag, like ‘ruby’ or ‘linux’, has gradually become less useful as more and more people find old links or repost the same link. Again. And again. And, well, you get the idea.

So I wrote the del.icio.us de.dup.er script, a small Perl CGI that sits between you and del.icio.us and weeds out any duplicate links. I don’t know how useful it’ll be for other people, but since installing it and comparing the number of posts it returns to those in the unfiltered tag, I’m already seeing a lot less traffic. This is only the first draft (it needs a little love and a chunk of re-writing) but it works, so I thought I’d post it. To run it you’ll need a webserver capable of running Perl CGI scripts, a couple of non-core Perl modules and an area on disk where it can write its state; it maintains a single state file for each tag. I considered making it run as a hosted service to remove these prerequisites, but that was more than I need right now.
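For anyone curious before downloading it, here’s roughly the shape of the thing. This is a minimal sketch of the idea rather than the script itself: it assumes LWP::Simple and XML::RSS as the non-core modules, guesses at a one-URL-per-line state file format, and uses a made-up /var/tmp/deduper path for the state area.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use CGI qw(:standard);
    use LWP::Simple qw(get);
    use XML::RSS;

    my $tag = param('tag') || 'ruby';
    $tag =~ s/[^\w.-]//g;                      # keep the filename sane
    my $state_dir = '/var/tmp/deduper';        # any writable directory will do
    my $state     = "$state_dir/$tag.state";   # one state file per tag

    # Load the URLs we've already seen for this tag.
    my %seen;
    if (open my $in, '<', $state) {
        while (<$in>) {
            chomp;
            my ($url, $when) = split /\t/;
            $seen{$url} = $when;
        }
        close $in;
    }

    # Fetch and parse the unfiltered tag feed.
    my $feed = get("http://del.icio.us/rss/tag/$tag")
        or die "couldn't fetch the feed for '$tag'";
    my $rss = XML::RSS->new;
    $rss->parse($feed);

    # Keep only the items we haven't seen before, and log the new ones.
    open my $log, '>>', $state or die "can't append to $state: $!";
    my @fresh;
    for my $item (@{ $rss->{items} }) {
        next if $seen{ $item->{link} };
        push @fresh, $item;
        print {$log} join("\t", $item->{link}, time), "\n";
    }
    close $log;

    # Re-emit a feed containing just the fresh items.
    my $out = XML::RSS->new(version => '2.0');
    $out->channel(
        title => "del.icio.us/tag/$tag (de.dup.ed)",
        link  => "http://del.icio.us/tag/$tag",
    );
    for my $item (@fresh) {
        $out->add_item(
            title       => $item->{title}       || '',
            link        => $item->{link},
            description => $item->{description} || '',
        );
    }
    print header('application/rss+xml'), $out->as_string;

Point your feed reader at the CGI with ?tag=whatever instead of the raw del.icio.us tag feed and it should only pass through links it hasn’t already logged.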

Notes: anyone who hits the CGI can force it to update and potentially stop you seeing certain links; I get around this by putting it in a secure (HTTP Auth protected) part of my site. It also has a timeout built in: a defined number of days after it first logs a link (30 days by default) it’ll let it through again, and then store it for another 30 days.
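Again, just a guess at how that expiry pass might look rather than the real code, assuming the same tab-separated URL-and-timestamp state file as the sketch above: anything logged more than 30 days ago gets dropped, so the next time that link turns up it passes through and is stored afresh.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $state  = '/var/tmp/deduper/ruby.state';   # hypothetical path from above
    my $ttl    = 30;                              # days; the default mentioned above
    my $cutoff = time - $ttl * 24 * 60 * 60;

    # Read the state file, keeping only entries newer than the cutoff.
    my %keep;
    if (open my $in, '<', $state) {
        while (<$in>) {
            chomp;
            my ($url, $when) = split /\t/;
            $keep{$url} = $when if $when >= $cutoff;
        }
        close $in;
    }

    # Rewrite the state file with just the unexpired entries.
    open my $out, '>', $state or die "can't rewrite $state: $!";
    print {$out} join("\t", $_, $keep{$_}), "\n" for sort keys %keep;
    close $out;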