Towards dynamic SQL compilation in Apache Spark

Additional information

Authors

Schiavio F., Bonetta D., Binder W.

Type

Article in conference proceedings

Year

2020

Language

English

Abstract

Big-data systems have gained significant momentum, and Apache Spark is becoming a de facto standard for modern data analytics. Spark relies on code generation to optimize the execution performance of SQL queries on a variety of data sources. Despite its already efficient runtime, Spark’s code generation suffers from significant runtime overheads related to data de-serialization during query execution. This performance penalty can be substantial, especially when applications operate on human-readable data formats such as CSV or JSON.

Keywords

Apache Spark SQL, SQL Compilation, Dynamic Compilation

Conference proceedings

Companion Proceedings of the 4th International Conference on the Art, Science, and Engineering of Programming (Companion), March 23–26, 2020, Porto, Portugal

Pages (or article number)

4 p.

DOI

10.1145/3397537.3397566

Diffusion

License

License undefined

Visibility

Public

Status open access

Green

