Providing A Compiler Technology-Based Alternative For Big Data Application Infrastructures

03/02/2022
by K. F. D. Rietveld, et al.

The unprecedented growth of data volumes has prompted a re-evaluation of traditional approaches to computing. This has started a transition towards very large-scale clusters of commodity hardware and has given rise to many new languages and paradigms for data processing and analysis. In this paper, we propose a compiler technology-based alternative to developing many separate Big Data application infrastructures. Key to this approach is a single intermediate representation that enables the integration of compiler optimization and query optimization, and the re-use of many traditional compiler techniques for parallelization, such as data distribution and loop scheduling. We show how this intermediate representation can serve as a generic target for Big Data languages by mapping SQL and MapReduce onto it.
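
To make the idea of a shared intermediate concrete, here is a minimal sketch in Python. It is not the representation defined in the paper: the function `loop_ir_aggregate`, its parameters, and the sample data are all hypothetical, chosen only to illustrate how a SQL aggregation and a MapReduce job might both lower to the same loop over tuples, which a compiler could then distribute or schedule.

```python
from collections import defaultdict


def loop_ir_aggregate(tuples, key_of, value_of, combine, initial):
    """Hypothetical loop-level form: one pass over tuples, accumulating per key.

    This stands in for a shared intermediate; the name and signature are
    assumptions made for illustration, not taken from the paper.
    """
    acc = defaultdict(lambda: initial)
    for t in tuples:
        k = key_of(t)
        acc[k] = combine(acc[k], value_of(t))
    return dict(acc)


# SQL front end: SELECT dept, SUM(salary) FROM employees GROUP BY dept
employees = [("eng", 100), ("ops", 70), ("eng", 120)]
sql_result = loop_ir_aggregate(
    employees,
    key_of=lambda t: t[0],
    value_of=lambda t: t[1],
    combine=lambda a, b: a + b,
    initial=0,
)

# MapReduce front end: word count. The map phase emits (word, 1) pairs;
# the reduce phase sums the values for each word.
documents = ["big data", "big compilers"]
emitted = [(word, 1) for doc in documents for word in doc.split()]
mr_result = loop_ir_aggregate(
    emitted,
    key_of=lambda t: t[0],
    value_of=lambda t: t[1],
    combine=lambda a, b: a + b,
    initial=0,
)

print(sql_result)
print(mr_result)
```

Because both front ends reduce to the same loop, any transformation applied to that loop (for example, partitioning the input tuples across nodes) would in principle benefit both without language-specific infrastructure.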
