Multicore Performance Engineering of Sparse Triangular Solves Using a Modified Roofline Model
Additional information
Authors
Wittmann M.,
Hager G.,
Janalík R.,
Lanser M.,
Klawonn A.,
Rheinbach O.,
Schenk O.,
Wellein G.
Type
Article in conference proceedings
Year
2018
Language
English
Abstract
The Roofline model is widely used to visualize the performance
of executed code together with the upper performance bounds given by
the memory bandwidth and the processor peak performance. The model
can thereby provide an insightful visualization of bottlenecks. In this
paper, the Roofline model is applied, in a modified form, to the sparse
triangular solve step of PARDISO, a leading sparse direct solver package,
which is also part of the Intel MKL library. The performance of the
forward and backward substitution process is analyzed and benchmarked
for a representative set of sparse matrices on seven modern x86-type
multicore architectures and the Knights Landing manycore architecture.
It is shown how to accurately measure the necessary quantities for
threaded code as well, and the measurement approach, its validation, and
its limitations are discussed. Furthermore, a modified version of the Roofline
model is introduced that covers the serial and parallel execution phases,
allowing for in-socket performance predictions.
Conference proceedings
Proceedings of the 30th IEEE International Symposium on Computer Architecture and High Performance Computing
Publisher
IEEE Xplore Proceedings
Meeting name
SBAC-PAD 2018
Meeting place
École Normale Supérieure, Lyon, France
Meeting date
September 24-27, 2018