LiveSoft - Lightweight Verification of (Distributed Systems) Software
As illustrated by the advent of cloud computing, cyber-physical systems, and the Internet of Things, more and more applications are inherently distributed. At the same time, programming distributed systems is notoriously hard. Programmers have to deal with asynchrony and cater for partial failures, that is, the possibility that certain communications, processes, or hosts fail while others remain operational. These failures can have drastic consequences, such as failing to react to critical events or ending up in inconsistent states. Moreover, limitations of existing hardware infrastructure necessitate subtle assumptions on system and failure models in order to achieve efficient, yet complex, algorithmic solutions whose implementations are prone to delicate defects.
Some existing techniques for engineering reliable distributed systems software require much effort (e.g., program annotations in the form of invariants), which discourages many developers from using them. Others require developers to explicitly run specific tools (e.g., model checkers), a step that is easily left out and that still cannot achieve complete validation.
LiveSoft investigates static techniques to verify a subset of relevant and failure-prone aspects of distributed software, namely the interaction between components, in a way that is lightweight and can be integrated with compilation. Our techniques will be able to sieve out many important defects upfront by pushing software reliability into the software design process. To that end, LiveSoft proposes protocol types, which leverage experience with session types yet focus on fault-tolerant distributed systems by emphasizing asynchrony, failure handling and recovery, protocol composition, security, and parameterization. A main challenge is to support different system and failure models, including emerging hardware trends such as hardware transactional memory and non-volatile memory, rather than hardwiring specific notions of (a)synchrony and failures.