Deferring Defects - Autonomics

Autonomics refer to the ability of computer systems to be self-managing.autonomics.ca

Here’s one that has been bothering me. Suppose you have a recurring problem that your “autonomic solution” can handle every time it occurs without any one knowing. At what point does the fact there is a treatable issue propagate up to a real person?

While an automatic “fix and tell me later” approach helps change your work from fire fighting to planned tasks what classifies a temporary problem as being important enough to warrant you investigating it? It’s hard enough to justify preventive maintenance with the current systems, if it fixes itself then you may never get given the time to investigate further.

If a problem fixes itself before any one notices or a sysadmin can look at it is it a problem?