Enterprise systems must guarantee high availability and reliability to provide 24/7 service without interruptions or failures. Exception handling mechanisms and fault tolerance techniques can reduce the occurrence of failures and increase dependability. Most such mechanisms address major problems that lead to unexpected service termination or crashes, but do not deal with the many subtle, domain-dependent failures that cause neither termination nor crashes yet still produce incorrect results. In this paper, we propose a technique for developing self-protecting systems. The technique observes values at relevant program points. When it detects a software failure, it uses the collected information to identify the execution contexts that lead to the failure, and automatically enables mechanisms that prevent future occurrences of failures of the same type. Thus, once a failure of a given type has been detected for the first time, failures of that type do not occur again.
Software Reliability, 2007. ISSRE '07. The 18th IEEE International Symposium on