Optimizing Failure Prediction to Maximize Availability

Additional information

Authors

Kaitovic I., Malek M.

Type

Article in conference proceedings

Year

2016

Language

English

Abstract

Availability of autonomous systems can be enhanced with self-monitoring and fault-tolerance methods based on failure prediction. With each correct prediction, proactive actions may be taken to prevent or to mitigate a failure. On the other hand, incorrect predictions will introduce additional downtime, associated with the overhead of a proactive action, that may decrease availability. The total effect on availability will depend on the quality of prediction (measured with precision and recall), the overhead of proactive actions (penalty), and the benefit of proactive actions when prediction is correct (reward). In this paper, we quantify the impact of failure prediction and proactive actions on steady-state availability. Furthermore, we provide guidelines for optimizing failure prediction to maximize availability by selecting a proper precision and recall trade-off with respect to penalty and reward. A case study to demonstrate the approach is also presented.
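
To make the trade-off described in the abstract concrete, the following is a minimal illustrative sketch and not the model from the paper. It assumes a simplified linear accounting in which each correctly predicted failure saves `reward` units of downtime and each false alarm costs `penalty` units; the function names and all parameters are hypothetical. Under these assumptions, prediction only pays off when precision exceeds penalty / (penalty + reward).

```python
def downtime_change(failures, precision, recall, reward, penalty):
    """Net change in downtime under a toy linear model (assumption, not the paper's).

    Negative values mean prediction reduces downtime overall.
    """
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    if not 0 <= recall <= 1:
        raise ValueError("recall must be in [0, 1]")
    # Recall is the fraction of actual failures that were predicted.
    true_positives = recall * failures
    # Precision = TP / (TP + FP), so FP = TP * (1 - precision) / precision.
    false_positives = true_positives * (1 - precision) / precision
    # False alarms add downtime; correct predictions remove it.
    return false_positives * penalty - true_positives * reward


def break_even_precision(reward, penalty):
    """Precision at which the toy model's downtime change is exactly zero."""
    return penalty / (penalty + reward)


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    print(downtime_change(100, precision=0.5, recall=0.8, reward=10, penalty=2))
    print(break_even_precision(reward=10, penalty=2))
```

In this simplified view, recall scales the size of the effect while precision determines its sign. The paper itself works with steady-state availability and may weigh these quantities differently.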

Conference proceedings

13th IEEE International Conference on Autonomic Computing (ICAC)

Issue (Month)

July

Meeting place

Würzburg, Germany
