Project R-8196

Title

A declarative approach to optimizing massively parallel data processing (Research)

Abstract

Database research has witnessed a renewed interest in parallel data processing. While distributed and parallel data management systems have been around for quite some time, the rise of cloud computing and the advent of big data present new challenges. Nowadays, parallelism is not restricted to a handful of servers but is massive, ranging from hundreds to tens of thousands of computing nodes. Queries are not limited to simple keyword search but involve complex joins over multiple database tables in support of large-scale data analytics. Furthermore, performance is no longer dominated by the number of I/O requests to external memory, as in traditional systems, but by the communication cost of reshuffling data over the network during query execution. This shift calls for novel techniques for analyzing and optimizing complex queries in the massively parallel setting. Unfortunately, the rise of many different systems, each with its own characteristics, has led to a proliferation of ad hoc, specialized techniques that are difficult to transfer between systems. In this work, I want to develop a uniform approach to query optimization in massively parallel systems. In particular, my research proposal has the following objectives: (1) develop a declarative framework for massively parallel data processing; (2) study decision problems in support of static analysis of queries in this framework; (3) develop general techniques for multi-query optimization.

Period of project

01 October 2017 - 30 September 2019
