OLAP Query Evaluation in a Database Cluster: a Performance Study on Intra-Query Parallelism.
|Title||OLAP Query Evaluation in a Database Cluster: a Performance Study on Intra-Query Parallelism.|
|Author(s)||F. Akal, K. Böhm, and H.-J. Schek|
|Booktitle||Proceedings of the 6th East-European Conference on Advances in Databases and Information Systems (ADBIS'2002), Bratislava, Slovakia|
While cluster computing is well established, it is not clear how to coordinate clusters consisting of many database components in order to process high workloads. In this paper, we focus on Online Analytical Processing (OLAP) queries, i.e., relatively complex queries whose evaluation tends to be time-consuming, and we report on some observations and preliminary results of our PowerDB project in this context. We investigate how many cluster nodes should be used to evaluate an OLAP query in parallel. Moreover, we provide a classification of OLAP queries, which is used to decide whether and how a query should be parallelized. We run extensive experiments to evaluate these query classes in quantitative terms. Our results are an important step towards a two-phase query optimizer. In the first phase, the coordination infrastructure decomposes a query into subqueries and ships them to appropriate cluster nodes. In the second phase, each cluster node optimizes and evaluates its subquery locally.