Linux FireWire Clustering: Brain Dump

I’ve had an interest in shared-storage FireWire clustering on Linux for a while. After spending a couple of evenings learning about it and having a little play, I ended up with a big text file of links and notes. Below is a slightly more rationalised version of those notes. If I ever need them again I’ll try to write them up properly; in the meantime they might serve as a useful pointer to some other traveller.

Thanks to a push by Apple, FireWire is now reaching commodity status. The hardware itself is getting cheaper and more widely available. Software and operating system support is getting better, and as both continue to improve more people will hack around with the technology. I’ve only become interested in FireWire for one main reason: as cheap, shared storage.

It’s been possible to share SCSI disks between two or more machines for quite a while. Unfortunately the hardware required has always been expensive, uncommon and (at the lower end) had a lot of problems with one machine rebooting and causing problems for all the others. So why do people use it? Because it’s one of the best ways to do multi-node shared storage.

If we look at pure performance then FireWire isn’t going to upset the SCSI or fibre channel users too much. While it’s quick enough for basic networking (Apple’s IP over FireWire), it’s not really suited to a heavily used production environment; if you have one of those then you should invest in a SCSI or fibre channel based option instead. So why would you use FireWire for shared storage?

There are two main reasons. Firstly (and these articles whetted my interest in shared FireWire), Oracle have documented how to use FireWire with Real Application Clusters, and even their Director of Linux Engineering, Wim Coekaerts, has written about "Setting Up Linux with FireWire-Based Shared Storage for Oracle9i RAC". Secondly, it’s an economical way to develop and test RAC environments without requiring two hideously expensive storage deployments: one for live and one for dev/QA. It’s worth noting that while Oracle do mention this as an option they don’t in any way (at the time of writing) support it. It’s going to be a long while before you see FireWire shared storage holding Oracle databases in live!

So how do you do it? A number of external FireWire drives (the Oxford 911 chipset seems to work well) allow multiple simultaneous logins to the device. If you have two machines with decent FireWire cards (I don’t have a list of which ones do and don’t work) and (this is VERY IMPORTANT!) a decent clustered file system (OCFS is known to work and GFS may be usable) then you can configure hosts running Linux to share the storage. For full details have a look at the Oracle articles above.
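To show why the clustered file system matters so much, here is a toy Python model. It is only a sketch under loud assumptions: SharedDisk, NaiveHost and CoordinatedHost are made-up names, the "disk" is just a list in memory, and a threading.Lock stands in for whatever cross-node coordination a real clustered file system uses. It is not how OCFS or GFS are implemented.

```python
import threading


class SharedDisk:
    """Toy stand-in for a drive that two hosts are logged into at once."""

    def __init__(self, num_blocks):
        self.free = [True] * num_blocks   # "on-disk" allocation map
        self.data = [None] * num_blocks   # block contents


class NaiveHost:
    """Hypothetical non-cluster-aware file system.

    It copies the allocation map when it "mounts" the disk and never
    looks at the on-disk copy again.
    """

    def __init__(self, disk):
        self.disk = disk
        self.cached_free = list(disk.free)  # private, soon-to-be-stale copy

    def write_file(self, payload):
        block = self.cached_free.index(True)  # first block *this host* thinks is free
        self.cached_free[block] = False
        self.disk.free[block] = False
        self.disk.data[block] = payload
        return block


class CoordinatedHost:
    """Hypothetical cluster-aware host.

    It reads the shared allocation map under a lock; the lock is an
    assumption standing in for real cross-node coordination.
    """

    def __init__(self, disk, lock):
        self.disk = disk
        self.lock = lock

    def write_file(self, payload):
        with self.lock:
            block = self.disk.free.index(True)  # always consult shared state
            self.disk.free[block] = False
            self.disk.data[block] = payload
            return block


def run_naive():
    disk = SharedDisk(4)
    host_a, host_b = NaiveHost(disk), NaiveHost(disk)
    block_a = host_a.write_file("from host A")
    block_b = host_b.write_file("from host B")
    return block_a, block_b, disk.data


def run_coordinated():
    disk = SharedDisk(4)
    lock = threading.Lock()
    host_a, host_b = CoordinatedHost(disk, lock), CoordinatedHost(disk, lock)
    block_a = host_a.write_file("from host A")
    block_b = host_b.write_file("from host B")
    return block_a, block_b, disk.data
```

In run_naive() both hosts pick the same block and host A's data is silently overwritten, which is the sort of damage you can expect from mounting an ordinary file system on two machines at once. In run_coordinated() the two writes land in different blocks.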

As an aside, under Linux you can use software called DRBD for sharing storage across two machines. Here’s a summary taken from the DRBD site itself: "DRBD is a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1. DRBD takes over the data, writes it to the local disk and sends it to the other host. On the other host, it takes it to the disk there." DRBD isn’t the ideal solution for everyone; for some discussion on this have a look at the Linux Clustering thread.
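The "network raid-1" idea in that summary can be sketched in a few lines of Python. This is a toy illustration only: BlockDevice and MirroredDevice are hypothetical names, both "disks" live in memory, and a plain method call stands in for the network hop. It says nothing about how DRBD itself is built or configured.

```python
class BlockDevice:
    """In-memory stand-in for a disk made of fixed-size blocks."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * num_blocks  # zero-filled blocks

    def write(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = bytes(data)

    def read(self, index):
        return self.blocks[index]


class MirroredDevice:
    """Writes every block to the local device, then to the peer's device."""

    def __init__(self, local, peer):
        self.local = local
        self.peer = peer

    def write(self, index, data):
        self.local.write(index, data)  # write to the local disk
        self.peer.write(index, data)   # assumption: stands in for sending to the other host

    def read(self, index):
        return self.local.read(index)  # reads are served from the local copy


primary_disk = BlockDevice(8)
secondary_disk = BlockDevice(8)
mirror = MirroredDevice(primary_disk, secondary_disk)

payload = b"x" * 512
mirror.write(3, payload)
```

After the write, block 3 holds the same data on both devices, so either machine has a full copy if the other one dies. Note that this is a different trade-off from the FireWire setup above: each host has its own disk, rather than both hosts talking to one shared drive.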

Notes:

  • VMware lets you pretend to have shared SCSI disks. Virtual PC server might also let you.
  • Whenever I found a mention of FireWire Clustering on Windows it got shot down very quickly. I don't see it being a Windows feature for a fair while.
  • From IEEE 1394 and RFC 2734; a viable HSI for hypercubes: "Currently, you can get 100Mbps solid using the eth1394 driver and 120Mbps to 130Mbps using ip1394 for certain packet sizes"
  • Linux FireWire Support

PS While researching FireWire Clustering I discovered how much I HATE the Google Groups interface.