ParaBoost - Exploiting multi-variant execution
People
Abstract
At the beginning of the ParaBoost project we developed an approach to exploit the growing number of cores in modern processors for speeding up hard-to-parallelize computations. We developed a multi-variant virtual machine to allow the concurrent execution of multiple variants of a sequential computation, reducing the overall execution time to the time of the fastest variant. We aimed at extending the state of the art along several dimensions: allow multiple concurrent multi-variant competitions, each with its own set of competing variants; allow nested competitive computations; monitor the progress of variants to favor more promising variants; and learn and exploit a model of variant behavior to improve performance. Before we could implement our approach, two publications appeared that reported achieving these goals. We thus refocused our project on two related but different tracks. The first track moved to different kinds of applications needing a different type of multi-variant computation: computational kernels running on heterogeneous computing platforms including GPUs. Instead of enabling competitive concurrent variants, this track studies how to enable developers to express computational variants and to easily find the most efficient variant up-front. Based on a study of representative heterogeneous applications, and based on the manual optimization of the computational kernels in those applications, we will design a DSL that separates the specification of the computation from the specification of how to efficiently execute that computation. We will start with kernels representing dynamic programming and stencil computations, and with optimizations including tiling with different tile sizes and shapes, unrolling, and prefetching. We will investigate which GPU kernel optimizations are most effective for the applications we study, and how to enable developers to express the corresponding transformations in our high level DSL. The second track repurposed our VM-based multi-variant computation approach to achieve a different goal: use variants of code not for performance, but for correctness purposes, more specifically, for back-in-time debugging. We are developing an approach that allows running applications at almost full speed, and to use a variant of the application, generated from clones of the running application process, for back-in-time debugging in case the original application fails. In developing this on-failure back-in-time debugging approach, we will investigate how and when to clone the application process; when to kill clones to minimize the slowdown of the application; how to capture the interaction of the application with the environment, so that it can be replayed to steer clones towards the failure, and so it does not significantly slow down the application; how to precisely detect a failure in the original application; how to then rewrite the application in the clone to inject the instrumentation necessary for recording the execution history; how to encode the execution history to allow the efficient navigation backwards along the infection chain; and how to effectively present the execution history to the developer. Our two research tracks provide two different perspectives of multi-variant computation. They extend our work from the first phase of the project, and they exploit parallelism for improving both performance and correctness.