Large uptimes - a wonderful problem to have

On the list of problems sysadmins dread, ‘our uptimes are too high’ isn’t normally in the top five.

A lengthy uptime used to be a boasting point, but it can also hide technical issues: kernel upgrades you’ve installed but not yet booted into (unless you’re running something special like ksplice), confidence gaps in high availability systems (when was the last time you did a failover?) and a general worry that what’s running on a host now may not be what’s running when it comes back up.

The solution? Embrace the occasional controlled reboot and exercise those HA systems. After all, any machine that can’t be rebooted without customers noticing is a strong candidate for a single point of failure.
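
If you want a nudge towards those controlled reboots, a small check can flag hosts that have been up longer than you are comfortable with. The following is only a rough sketch, assuming a Linux host where /proc/uptime is readable; the function names and the 90 day threshold are made up for illustration and are not a recommendation.

```python
from pathlib import Path

SECONDS_PER_DAY = 86_400


def parse_uptime_seconds(proc_uptime_text: str) -> float:
    """Return seconds since boot from the contents of /proc/uptime.

    The file holds two numbers; the first is the uptime in seconds.
    """
    return float(proc_uptime_text.split()[0])


def reboot_overdue(uptime_seconds: float, max_days: int = 90) -> bool:
    """True when the host has been up longer than max_days.

    max_days is a hypothetical policy value; pick one that suits your estate.
    """
    return uptime_seconds > max_days * SECONDS_PER_DAY


def current_uptime_seconds() -> float:
    """Read the live uptime (Linux only, relies on /proc/uptime)."""
    return parse_uptime_seconds(Path("/proc/uptime").read_text())
```

Calling reboot_overdue(current_uptime_seconds()) from a cron job or monitoring check would then give you a simple yes or no per host. It says nothing about whether a new kernel is waiting or whether failover actually works; it only tells you which machines haven’t been tested by a reboot in a while.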