Even in scientific computing, code is often developed without a basic understanding of performance bottlenecks and the relevant optimization opportunities. Textbook code transformations are applied blindly, without a clear goal in mind. This course teaches a structured, model-based performance engineering approach at the compute-node level. It aims at a deep understanding of how code performance comes about, which hardware bottlenecks apply, and how to work around them. The pivotal ingredient of this process is a model that links software requirements to hardware capabilities. Such models are often simple enough to be worked out with pencil and paper (the well-known Roofline model is one example), yet they lead to deep insights and strikingly accurate runtime predictions. The lecture starts with simple benchmark kernels and advances to various algorithms from computational science.
The course is not offered in the academic year 2017/18.