Project R-9422

Title

Optimizing advanced analytic tasks over distributed data (Research)

Abstract

In the era of big data, companies and scientific institutions face data in varieties and volumes never encountered before. At the same time, there are new needs and expectations regarding the insight and intelligence that can be derived from these datasets through predictive analytics based on statistical and machine-learning models and algorithms. While sampling has been a commonly used technique to bridge the gap between large datasets and deep analytics via expert tools, cheap storage and processing capacity now drive a strong desire to use the entire dataset and extract value in the most refined and holistic way possible. In this proposal, we focus on supporting advanced big data analytics with a new generation of distributed query engines. Here, the term big data analytics is used as an umbrella term for complex tasks that combine traditional query operations, like table joins, with operations from linear algebra, like matrix multiplication. In particular, we aim to support big data analytics from a database perspective, where a distributed query engine provides a solid environment for the effective computation and optimization of typical advanced analytic tasks. The overall goal of this project is to contribute to a better fundamental understanding of how complex data analytic workflows can be executed in a big data setting, where distribution and parallelization are key.

Period of project

01 January 2019 - 31 December 2022
