03-09-2009, 05:04 PM
On-line Analytical Processing (OLAP) has become a fundamental component of contemporary decision support systems and represents a means by which knowledge workers can efficiently analyze vast amounts of organizational data. Within the OLAP context, one of the more interesting recent themes has been the computation and manipulation of the data cube, a relational model that can be used to represent summarized multi-dimensional views of massive data warehousing archives.
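To make the data cube idea concrete, here is a minimal sequential sketch of a complete cube: one aggregate for every subset of the dimensions. It is only an illustration, not any of the algorithms from the talk. The function name `data_cube`, the `ALL` placeholder, and the toy `sales` rows are hypothetical, and the measure is assumed to be a simple sum.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"  # placeholder for a dimension that has been aggregated away


def data_cube(rows, dims, measure):
    """Naively compute all 2^d group-bys (the complete cube) with SUM."""
    cube = defaultdict(int)
    # Enumerate every subset of dimensions, from the grand total (k = 0)
    # up to the finest-grained view (k = len(dims)).
    for k in range(len(dims) + 1):
        for kept in combinations(dims, k):
            for row in rows:
                # Dimensions outside this view are collapsed to ALL.
                key = tuple(row[d] if d in kept else ALL for d in dims)
                cube[key] += row[measure]
    return dict(cube)


# Hypothetical toy fact table with two dimensions and one measure.
sales = [
    {"region": "East", "product": "pen", "units": 3},
    {"region": "East", "product": "ink", "units": 5},
    {"region": "West", "product": "pen", "units": 2},
]

cube = data_cube(sales, ["region", "product"], "units")
```

Each key in `cube` names one cell of one view, for example `("East", "ALL")` is the total for the East region across all products. This naive version rescans the input once per view, which is the kind of cost that more efficient sequential and parallel cube algorithms aim to avoid.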
Over the past five or six years, a number of efficient sequential algorithms for data cube construction have been presented. Given the size of the underlying data sets, however, it is perhaps surprising that relatively little effort has been expended on the design of load-balanced, communication-efficient algorithms for the parallelization of the data cube. Our current research investigates opportunities for high performance data cube computation, with a particular emphasis upon contemporary parallel architectures and relational database environments. In this talk, new parallel algorithms for the computation of both the complete data cube and the partial data cube will be presented. In addition, a model for distributed multi-dimensional indexing is proposed. The associated parallel query engine supports not only efficient range queries but also query resolution on non-materialized views and on views containing hierarchical attributes. Key design features of the physical architecture will also be discussed.