Over the past generation, data warehousing and online analytical processing (OLAP) applications have become the cornerstone of contemporary decision support environments. Typically, OLAP servers are implemented on top of either proprietary array-based storage engines (MOLAP) or as extensions to conventional relational DBMSs (ROLAP). While MOLAP systems do indeed provide impressive performance on common analytics queries, they tend to have limited scalability. Conversely, ROLAP’s table oriented model scales quite nicely, but offers mediocre performance at best relative to the MOLAP systems. In this paper, we describe a storage and indexing framework that aims to provide both MOLAP like performance and ROLAP like scalability by essentially combining some of the best features from both. Based upon a combination of R-trees and bitmap indexes, the storage engine has been integrated with a robust OLAP query engine prototype that is able to fully exploit the efficiency of the proposed storage model. Specifically, it utilizes an OLAP algebra coupled with a domain specific query optimizer, to map user queries directly to the storage and indexing framework. Experimental results demonstrate that not only does the design improve upon more naive approaches, but that it does indeed offer the potential to optimize both query performance and scalability.
Data warehousing and online analytical processing (OLAP) have been popular targets for researchers over the past 10？15 years, with papers published on a wide variety of related topics. In the OLAP domain, early work often focused on the development of algorithms for the efficient computation of the data cube. Later, the cube methods were expanded to include mechanisms for the computation or representation of hierarchies derived from the cube’s dimensions. In general, academics built upon table-based models, as the associated relational systems were well understood. On the positive side, scalability for relational OLAP (ROLAP) was very impressive and was generally only limited by the underlying hardware. Unfortunately, such systems often provided poor query performance as they were ill suited to OLAP’s complex, multi-dimensional data model.
For this reason, commercial vendors often developed proprietary array-based server products that were meant to resemble the hyper-cubic nature of the data cube more closely. Performance on these multi-dimensional OLAP (MOLAP) servers was/is indeed impressive as the direct indexing provided by arrays often leads to a much improved query response time. Of course, everything comes at a price and, in the case of MOLAP, scalability remains a concern. Specifically, the sparsity of high cardinality OLAP spaces significantly limits the size of the cube structures in enterprise environments.
In this paper, we discuss the storage architecture for an OLAP-specific server designed from the ground up as a high performance analytics engine. The system is capable of efficiently generating full or partial cubes and subsequently provides complex processing on common cube queries (slice/dice, drill down/roll up, etc.). Until recently, the database management system (DBMS) essentially relied on the file system for storage services. Some indexing was available but was quite limited in nature. Our recent work, as presented in the current paper, has significantly extended the original model to include both R-tree and bitmap indexing facilities. Specifically, we have integrated the open source Berkeley Database (DB) libraries into the server so as to encapsulate both indexes and cube data within a single data store. Cube dimensions are also efficiently stored as Berkeley DB and are, as expected, hierarchy aware. Non-hierarchical attributes, in turn, are stored as a set of FastBit bitmap indexes. Ultimately, the integrated architecture represents a very efficient OLAP storage engine that provides the kind of query performance that one would expect from a MOLAP system, with the scalability typically associated with table-oriented relational servers.
We note that the current paper significantly extends upon an earlier publication . In particular, we have expanded upon our discussion of the storage engine itself, providing a much more thorough presentation of the individual elements of the system. We also extend the presentation of the associated algebra and its relationship to the storage structures. As a result, the motivation for the underlying research should be clearer. Additional experimental results have been included as well. Finally, supplemental material has been added as a series of appendices.
This paper is organized as follows. In Section II, we discuss a number of related research projects. In Section III, we introduce the conceptual model upon which the storage engine is based. A detailed look at how data is encoded is provided in Section IV, including both its abstract and physical representation. The integration with the Berkeley DB and FastBit bitmap libraries is discussed in Section V. Then, in Section VI, we present an overview of the query processing logic that actually utilizes the relevant storage components. Finally, we round out the paper with some conclusions in Section VIII.
Subsequent to the initial definition of the data cube operator , a number of researchers proposed techniques for the compact representation of the cube. Both the Dwarf cube  and QC-trees , for example, define compact non-relational tree-based structures that provide efficient data access. However, their complex models were never integrated into practical systems, academic or otherwise. Conversely, the CURE cube  supports the representation of cubes and dimension hierarchies and does so with relatively compact table storage. Still, the CURE model lacks the native multi-dimensional indexing schemes that are essential for high performance query functionality.
In terms of OLAP indexing, a number of researchers have proposed methods that would improve warehouse access. In the simplest case, clusters of B-trees have been proposed, though such an approach is neither scalable nor efficient in higher dimensions . A more interesting proposal was perhaps the CUBE tree, a warehouse-specific mechanism based upon the R-tree . The CUBE tree not only demonstrated that the R-tree was well suited to OLAP access patterns, but also provided an efficient update mechanism.
Recently, column store databases have been investigated as a means to minimize IO costs on aggregation queries . It is important to note, however, that column stores are best suited to general purpose warehouses that perform ad hoc, real-time querying involving massive amounts of raw data. True OLAP servers, often working in conjunction with a supporting warehouse (possibly a column store), typically use a combination of pre-aggregation, specialized indexing, and query optimization to target the most common query forms. In practice, OLAP severs and column store DBMSs can be seen as complementary rather than competitive.
A second recent theme has been the exploitation of the increasingly popular MapReduce framework ？and its open source implementation Hadoop？as a kind of parallel DBMS subsystem. Integration of Hadoop and traditional relational DBMSs (with their storage and indexing architectures) has also been suggested . While this work is indeed interesting, there remains considerable doubt as to whether such systems can compete directly with the performance offered by purpose-built DW/OLAP servers . At present, MapReduce applications may be best suited to extract-transform-load (ETL) processes or those with large but limited analysis requirements.
Finally, we note that
Before examining the storage framework itself, we briefly discuss the model the DBMS is meant to represent. We consider analytical environments to consist of one or more
and one or more
In practice, the cube is typically modeled as a
We now examine how our DBMS actually encodes the contents of the database. For illustrative purposes, we will continue to build on the schema depicted in Fig. 1b. We will assume as well that the actual data associated with the warehouse is consistent with that illustrated in Fig. 2. Note that the ProductNumber, CustomerNumber, and EmployeeNumber values in the Fact table are taken from the primary key columns in the associated Dimension tables. Also, note as well that the Fact table would in practice be hundreds or even thousands of times larger than the Dimension tables.
We begin with an overview of Dimension encoding (i.e., Customers, Employees, and Products). We note that Dimension attributes can be described as either
any hierarchy？is relatively straight-forward and is accomplished with a linear pass through the native data set that simply assigns an incremental
Encoding of hierarchical values is more involved and is based upon the notion of
where the array index notation [ ] indicates a specific value within a given hierarchy level. Informally, we can say that if a hierarchy is linear, there is a contiguous range of values
The DBMS exploits hierarchy linearity by using
For each hierarchical attribute level
Once the Dimension tables have been processed, a preliminary
encoding of the Fact data can be undertaken. As with any Fact table, its schema essentially consists of one or more measure attributes, and a set of feature attributes. The feature attributes are, of course, nothing more than the dimensional surrogate keys previously defined. We note at this point that data in the Fact table is represented at the most granular or detailed level of each dimension. As we will see, the DBMS exploits this feature to allow efficient run-time transformations to arbitrarily defined aggregation levels. As well, one must bear in mind that this is still an abstract depiction of the Fact table. Its concrete representation will be discussed below.
A simple example of a Fact structure is shown in Fig. 4a, assuming a single “total sales” measure. For the sake of clarity, the records are listed in sorted order, using a Product-Employee-Customer scheme. In practice, of course, partial aggregates or group-by scan also be computed so as to minimize processing time on common user queries. Fig. 4b depicts the associated Product-Employee cuboid, illustrating how the Customer information is integrated into the more compact summary view.
We must now consider how these Dimension tables are
physically stored on a disk. We address hierarchical attributes first. Once the mapping tables have been defined, the encoded values are stored to a disk using the open source Berkeley DB libraries . Specifically, for each subattribute in the hierarchy, we create a simple Berkeley database using the Recno access method. The Recno access method is backed with a flat-text file in (
At the DBMS startup, the Berkeley Recno databases are used to initialize a data structure called a mapGraph . Fig. 5b presents a mapGraph sub-structure called a
NON-HIERARCHICAL ATTRIBUTES: While the transparent mapping of hierarchical attribute values is crucial for optimal query performance, non-hierarchical attribute processing must also be efficiently supported. In particular, if a non-hierarchical attribute is used in the restriction of an OLAP query or displayed in an OLAP report (e.g., “All customers older than 40”), then joins between the appropriate group-bys and Dimension tables are required. This process can be very expensive thus its costs must be minimized.
Our DBMS utilizes bitmap indexes for this purpose. For each indexed non-hierarchical attribute we provide a one bit string for each distinct value on the dimension. For
While the Dimension tables, and their indexes in particular, are involved in the resolution of virtually every query, the bulk of both the raw IO and post-processing is associated with the enormous Fact Structures. Earlier, we saw how the DBMS associates each Dimension table record with a surrogate key. It is these integer values, along with the associated measures (e.g., a total sales summation), that are housed within the Fact Structure. Physically, the DBMS stores and indexes data using what is
known as a
space (for side length
origin (note that point 226 would reference
With the first point serving as the
It should be clear that the physical representation of the Fact data, while conceptually encoded as a table of records (i.e., ROLAP), bears little resemblance to a traditional table. In essence, it is a block-based collection of compacted bit strings, each representing a specific point in the multi-dimensional data space. Moreover, its supporting Hilbert R-tree index provides rapid retrieval of points in contiguous ranges. In short, it blurs the line between ROLAP and MOLAP by providing some of the best features of both.
In practice, the DBMS allows the administrator to either fully or partially materialize the
For this reason, we have chosen to embed the Berkeley DB libraries within the larger DBMS framework. While the Berkeley API offers a number of indexing methods (B-tree, Hash, Recno, Queue), it provides no direct support for R-trees. As such, we have extended the Berkeley C++ interface to allow for the creation and access of Hilbert packed R-trees using standard Berkeley protocols. We note that Berkeley supports the storage of multiple
Fig. 7b, on the other hand, provides a more complete illustration of the core components for the storage engine, albeit for a very simple example with just two dimensions. One can see how the Dimension tables are accompanied by a (memory-resident)
While the table storage and indexing components are the focus of the current paper, we note that the DBMS as a whole provides a relatively comprehensive processing stack. Fig. 8a illustrates the full architecture, including the native components as well as the Berkeley extensions. Note that the View Manager is responsible for the identification of the most cost effective group-by？and is initialized by scanning the primary Master B-tree database that contains references to all indexed group-bys？ while the Hierarchy Manager builds and maintains the inmemory mapGraph structures. Furthermore, note that OLAP Caching has nothing to do with the Berkeley caching component that stores recently accessed disk blocks, but refers instead to a native, multi-dimensional OLAP query cache.
Finally, we note that the DBMS is in fact a fully parallelized architecture. Specifically, the server has been constructed from the ground up as a “shared nothing” cluster data management system. The processing stack depicted in Fig. 8a, in fact, depicts the software components residing on each node of the cluster. Referred to as
The DBMS provides full query processing functionality. In other words, it must not only parse incoming queries, but it must also decompose those queries into their constituent elements and optimize these operations relative to the supporting indexes and data structures. While the full details of the optimization process are beyond the scope of this paper (and are more fully discussed in ), it is nonetheless important in the current context to understand how the storage and indexing facilities are integrating into the query engine. In this section, we examine the logic of the query resolution process as it relates to the work presented above.
To begin, we note that the data cube depicted in Fig. 1a can, in fact, be interpreted as a conceptual data model for the OLAP domain. In other words, both its structure (cells, dimensions, and hierarchies) and the operations associated with them (slicing and dicing dimensions, as well as rolling up and drilling down on hierarchies) are directly representative of the intuitive query environment envisioned by end users. Due to the consistency of this model across the OLAP domain, a number of researchers have identified the core operations of a supporting OLAP algebra, including formal analysis showing the algebra to be both closed and complete  (note that these operations represent a read-only
SELECTION (σp cube): the identification of one or more cells from within the full d-dimensional search space, providing basic slice and dice functionality.
PROJECTION (πattribute1,...,attributen cube): the identification of presentation attributes, including both measure attributes and feature attributes.
DRILL ACROSS (cube1 ∞ cube2): the integration of two independent cubes, where each cube possesses common dimensional axes (effectively a cube “join”).
the modification of the granularity of aggregation. This process is typically referred to as “drill down” and “roll up”.
CHANGE BASE (χbase1→base2): addition or deletion of one or more dimensions from the current result set. Aggregated cell values must be recalculated accordingly.
PIVOT (？base): rotation of the cube axes to provide an alternate perspective of the cube. No recalculation of cell values is required.
UNION (cube1 ∪ cube2): union of two cubes sharing common dimensional axes.
INTERSECTION (cube1 ∩ cube2): intersection of two cubes sharing common dimensional axes.
DIFFERENCE (cube1 ？ cube2): difference of two cubes sharing common dimensional axes.
The Sidera DBMS takes full advantage of the OLAPspecific algebra. Specifically, its query optimizer is designed to accept and optimize native OLAP operations (as described in the next section). This implies, of course, that queries are delivered in a form that is amenable to algebraic representation. The “obvious” way accomplish this would be to translate SQL/MDX to the core operations of the algebra, as is done by conventional servers (i.e., from SQL to the operations of the relational algebra). Sidera, however, uses an entirely different approach, one that is based upon the notion of native language queries. In other words, queries are defined in the application language (e.g., Java) rather than an embedded query language such as SQL. Ultimately, such a mechanism allows for 1) compile time syntactic and semantic checking, 2) rich refactoring opportunities, 3) the use of Object Oriented features such as query inheritance, and 4) the utilization of a single programming API for both the application and the database backend.
The details of the native language model？known as native language OLAP query eXecution (NOX)？have been described in a previous publication . However, it is important in the current context to at least understand the relationship between the query representation, the optimizer, and the storage engine. We therefore begin with a (very) concise depiction of the NOX query specification and resolution process. Programmers define OLAP queries via a series of Query Object classes provided as part of the NOX API. The
Listing 1 provides a partial representation of atypical OLAP query？the quantity of items ordered by customers over the age of 40, between May and October of 2007, with the result grouped by Product, Type, and Province？ along with a small
Whether the query is generated using the NOX API or is constructed as a translation of a traditional SQL or MDX query, it will eventually be sent to the query optimizer as a sequence of algebraic operations. Listing 2 in Appendix A depicts the translated query (in XML format) associated with Listing 1. Once the query arrives in this form, it is ready to be optimized relative to the structures provided by the storage engine. Algorithm 1 is a somewhat simplified representation of the core logic
implemented by the query engine. Queries are transmitted to/from the end user and verified syntactically against the grammar (i.e., XML Schema). Valid queries are then evaluated for semantic correctness to ensure that they comply with the database schema. If so, an algebraic plan is generated, optimized via a set of
We now turn to the logic implemented by the second FOR loop; that is, the actual data access methods. While each algebraic operator is associated with a distinct set of implementation functions, we will focus here on SELECTION as it is arguably the most important and expensive of the core operations. Given the underlying indexes and storage structures, it is the SELECTION algorithms job to map the user’s query constraints to the Dimension and Fact structures. This happens in two stages. First, hierarchical
and non-hierarchical query attributes are converted as required into the base level attributes found in the Fact table. Algorithm 2 describes this process. Using either the mapGraph Hierarchy Manager or the FastBit bitmap indexes, ranges of contiguous base level IDs are extracted. Logical AND or OR operations are applied as required. The end result is an ordered list of Dimension record IDs (i.e., surrogate keys) that may be passed as input to the Fact Structure search algorithm.
Once the DimensionID lists have been generated, they are passed to the cube storage engine to be matched against the Hilbert ordinals of the Fact Structure (note that the View Manager transparently selects the most cost effective group within the Berkeley DB). Given an ordered list of
a breadth first search is able to make a single pass through the table, incrementally adding relevant block IDs to the result list (while it is not obvious in the illustration, the levels of the R-tree index are physically ordered on a disk in this same root-to-leaf fashion). Moreover, because of the explicit Hilbert ordering of data, target records tend to be clustered into a small number of disk blocks. In fact, even when selectivity is very high, the combination of Hilbert ordering and breadth first search implies that, in the worst case, Fact Structure access can be no worse than a sequential scan (and is typically much better).
We now turn to the cost of the SELECTION operation, in terms of its exploitation of the underlying data and storage structures.
THEOREM 1. The cost of the SELECTION operator is bounded as the cost of sequentially scanning B(V) and D(V), where V is the appropriate packed R-tree index to answer the SELECTION, B(V) is the number of index blocks, and D(V) is the number of disk blocks. Cost = B(V) + D(V) I/O.
Of course, processor time should also be considered.
THEOREM 2. The worst case processor running time of the SELECTION operator has a bound of
The full cost of the SELECTION algorithm can be represented as the sum of (a) the disk I/O and (b) the processor running time. We observe, however, that for most queries the number of disk I/O dominates the processor time.
We now turn to the effectiveness of the integrated storage engine. To begin, we note that all evaluations, unless otherwise indicated, are conducted on a Linux-based workstation running a standard copy of the 2.6.x kernel, with 8 GB of main memory and a 3.2 GHz CPU. Disks are 160 GB SATA drives operating at 7200 RPM. The Berkeley DB components are taken from version db4.7.25. Data sets are generated using a custom data generator developed specifically for this environment. We first generate a multi-dimensional Fact table (the dimension count varies with the particular test), with cardinalities arbitrarily chosen in the range of 2？10000. Depending on the test involved, row counts typically vary from 100,000 to 10 million records. The primary Fact tables are then used to compute fully materialized data cubes containing hundreds of additional views or cuboids. For example, a 10- dimensional input set of 1,000,000 records produced a data cube of 1024 views and approximately 120 million total records. Once the cubes are materialized, we index the data using the R-tree and bitmap mechanisms.
Since individual millisecond-scale queries cannot be accurately timed, we use the standard approach of timing queries in the batch mode. In the succeeding tests, five batches of queries are generated and the average run-time is computed for each plotted point. Since query benchmarks are not well standardized for OLAP (the OLAP APB benchmark is effectively dead and TPC-H is better suited to long running, ad hoc warehouse queries), we define our own query classes (Appendix A？C). The queries themselves are typically written in SQL and then translated to a comparable XML representation as required. Finally, we note that when evaluating query performance, we use the “drop caches” option available in the newer Linux kernels to delete the OS page cache between runs.
We begin with a comparison of the FastBit indexing subsystem for non-hierarchical attributes versus clusters of standard B-trees (implemented by Berkeley DB). We create a dimension (called Customer) with five nonhierarchical attributes (Age, FirstName, LastName, Balance, and Nationality) and 1,000,000 records (i.e., the cardinality of the primary key CustomerID). The cardinalities of the non-hierarchical attributes were arbitrarily chosen in the range of 100？1000.
We constructed 3 sets of queries against the Customer dimension, with each set containing five queries. The SQL format of two sample queries from each category is given in Fig. 10a. We can see, for example, that
only look-up queries on a single non-hierarchical attribute.
Fig. 10b shows a comparison of the running time using the two indexing implementations. For the first set (simple look-up on one attribute), we can see that the running times are actually quite similar. However, when we move to the more complex queries in
As previously noted, one of the advantages regarding the use of the Berkeley libraries is that its
In the first test, the full cube (2
running time for index cube construction before and after Berkeley DB integration. On average, the integration of Berkeley into our server reduces the index cube construction time by 40% to 60%. The primary reason for this reduction in time is that because the new method uses a single, integrated DB repository, its contiguous block layout allows for very efficient IO, even on larger R-trees. Conversely, use of multiple OS files leads to considerable disk thrashing.
An increase in Dimension count has a similar impact in that each additional dimension effectively doubles the number of views to be computed and stored. In Fig. 11b, we see the results for a dataset of 1 million records and dimension counts of 5, 7, and 9 (common dimension counts in many OLAP environments). Again, we observe that the running time when using Berkeley DB drops by 40% to 60% due to the fact that we are storing the indexed cube in one contiguous physical file.
In this case, we create a cube from an input set of 1 million records, 9 non-hierarchical dimensions, and mixed cardinalities in the range of 100？10000, with the full cube representing over 200 million records and 12 Gigabytes of total data. We generate the Hilbert R-tree indexed cube in the following two ways: 1) the Berkeley supported Sidera DBMS in one physical file, and 2) the “standard” Sidera server in 1,024 files (2 files per view). We then use our query generator to generate batches of non-hierarchical queries. By non-hierarchical queries, we mean those queries whose ranges have been restricted to the base attribute.
Fig. 12 demonstrates the total response time for nonhierarchical queries in the Berkeley supported Sidera query engine versus the original Sidera query engine. Results are shown for batches of 100, 500, and 1000 nonhierarchical OLAP queries. The graph shows the improvement
from the Berkeley DB integration. Specifically, in all three cases, the integration of the Berkeley code into our Sidera DBMS reduces the OLAP query resolution time by 15% to 20%.
As previously noted, the mapGraph module supports various hierarchical dimension patterns (e.g., symmetric and ragged hierarchies) . It is therefore important to compare the performance of the Berkeley supported Sidera DBMS against the current Sidera engine in resolving hierarchical OLAP queries. In this case, we create 9 dimension hierarchies made up of a mixture of symmetric strict, ragged strict and non-strict forms (note that the primary feature of ragged and non-strict hierarchies is that they include optional nodes within the hierarchy graph). We employ batches of 1000 OLAP queries, this time in hierarchical form only. Fig. 13 shows the running time for data sets ranging in size from 10,000 to 1,000,000 records. There are two points that must be made with respect to a direct interpretation of the results. First, answering OLAP hierarchical queries in the Berkeley supported Sidera query engine is faster than the old Sidera query engine by an average of 15%. Second, the total overhead is less than 25% relative to the non-hierarchical case described in the previous section. In short, the integration of the mapGraph manager into Berkeley DB allows it to answer hierarchical OLAP queries with a modest overhead compared to the non-hierarchical case. We note that this overhead is more than acceptable given the power and flexibility that the graph manager provides.
While indexes are generally quite beneficial when selectivity is low (i.e., a small percentages of records is accessed),
performance typically deteriorates quickly as selectivity rises due to the effects of disk thrashing. In fact, selectivity levels beyond 5% of the base dataset typically result in a simple scan out-performing conventional indexes. In practice, this is quite important in OLAP settings as query selectivity can often be quite high for aggregate-based requests. As noted in Section VI-E, however, Sidera attempts to minimize the effects of high selectivity with the combination of a linear breadth first search and a corresponding disk layout strategy. In this test, we therefore look at the running time of queries running against the Rtree indexed data sets versus those resolved via a direct sequential scan. Specifically, we use 12 OLAP queries (equivalent to those queries in Appendix A) to compare the two resolution strategies in the case of both 4-dimensional and 6-dimensional cubes generated from a Fact table of 10 million records. We can see in Fig. 14 that the sequential scanning approach takes roughly four times longer than the R-tree variation. We note, of course, that there is a point beyond which no index can improve upon a sequential scan. In our server this happens when the result of the query exceeds 20% to 25% of the records in the view that is used to answer the query. However the penalty associated with pathologically large queries is small？less the 5%？because of the indexing/search mechanism employed by the DBMS. In practice, this is important not only from a pure performance perspective but also because it significantly minimizes the complexity of query execution planning (i.e., the indexes can be used in
The query optimization process builds on a series of algebraic transformation rules that have been optimized for the data and storage structures discussed in this paper. Specifically, the optimizer module manipulates the original query parse tree so as to minimize the cost of query resolution. While the details of the process are discussed in an earlier paper , we note that the primary SELECTION
transformations include pushing selection operations down the parse tree (i.e., towards the source data sets) and the integration of independent selection predicates. To evaluate the benefit of the optimization process, we execute our query test suite against both the optimized and un optimized versions of the DBMS, for input data sets of one million and 10 million records. Fig. 15a provides the test results. Here, the elapsed time represents the cumulative time for the full test set. In short, execution time for the un-optimized backend, which can be interpreted as a DBMS that is not able to explo it the physical structures presented in the paper, is roughly an order of magnitude longer than the case for the fully optimized DBMS.
As noted earlier in the paper, the OLAP algebra consists of a series of core operators, each of which can be
optimized independently. For the sake of completeness, we present optimization results for the Projection operator as well, arguably the next most significant OLAP operation. Again, transformation rules (as per ) are used to minimize the processing cost of the physical projection operation. Fig. 15b illustrates the performance of the query engine as the number of records in the Fact table varies from one million to 10 million records. The results are in fact quite similar to those obtained when evaluating the Selection operator. Specifically, for un-optimized processing, we see a performance penalty that approaches a factor of ten relative to the fully optimized model.
Of course, the ultimate purpose of a DBMS server is to provide impressive query performance. While it would be possible to only test the system against an artificially defined baseline, a more meaningful comparison can be
made against existing DBMS servers. Therefore, we have also evaluated the DBMS relative to systems often used in industrial database environments, namely the open source MySQL server (the “lower end”) and Microsoft Analysis Services (the “higher end”). In this case, we generate a 6-dimensional, 10-million record database (using the dimension characteristics described previously) and load it into both DBMS platforms in the standard Star Schema format (the Microsoft server was installed on a Windows partition on the same workstation). Queries were prepared in both SQL format and the native XML form of our own OLAP DBMS. Fig. 16 shows comparative results for both platforms and demonstrates that the MySQL server takes approximately 10？15 times as long to resolve the same queries, while Microsoft Analysis Services？running in the ROLAP mode？is three to six times slower. Note that the term “Sibling Server” refers to a single node of our parallel DBMS.
One can argue, of course, that MOLAP offers superior performance to ROLAP configurations, at least for data sets of this size. So we loaded the same Star Schema data using the MOLAP mode of Microsoft Analysis Services. Fig. 17a shows that MOLAP does indeed outperform our OLAP DBMS by a factor of approximately 5 to 1. However, we note that in this test, our DBMS was not permitted to materialize any additional data; it was essentially just an efficient Star Schema. In Fig. 17b, we see the result once aggregate materialization is added to the Fact Structure (note that production systems would typically use partial cube materialization consisting of the Fact data and a set of low-dimensional group-bys. In practice, this produces a compressed cube database that is not much bigger than the original Fact and Dimension tables). While Microsoft MOLAP server still has a slight advantage, we note that 1) the Microsoft DBMS benefits from years of optimization, and 2) MOLAP is ideally suited to the scale of the current test (i.e., 1？10 million records). Given that our DBMS framework is not constrained by the limits of array-based storage , these preliminary results suggest that the current DBMS？and the architecture it represents？does indeed have the potential to provide MOLAP-style performance with ROLAP-style scalability (we note that a number of legacy components in the codebase currently prevent true Terabyte scale testing. However, an ongoing software re-engineering effort is expected to remove these limitations in the coming year).
In this paper, we described the storage and indexing architecture of a high performance OLAP DBMS. Current OLAP DBMS platforms generally take an “either/or” (MOLAP/ROLAP) approach to data representation and query processing, with the result being a very clear trade-off between the scalability of relational systems and the performance of array-based platforms. The DBMS described in this paper attempts to build on the best features of both. Specifically, it uses a Fact Structure storage model that is constrained primarily by the disk space available, rather than the sparsity of the cube space. At the same time, the use of compressed Hilbert ordered Rtree indexes, mapGraph mapping tables for hierarchical attributes, and bitmap indexes on non-hierarchical attributes, coupled with a linearized Fact Structure search strategy, produces query performance beyond what one would expect with relational systems. In addition, the DBMS includes a robust query optimization engine that integrates the core storage and indexing facilities with an algebra designed specifically for OLAP environments. Furthermore, a series of experiments confirmed that not only are the storage structures compact (and easily administered), but that query performance is actually comparable to commercial, and far less scalable, MOLAP servers. Given the enormous size of both existing and projected warehouses, we believe that the principles presented in the current paper offer great potential for the OLAP servers of the future.
A. XML Query Format
The DBMS prototype has been integrated with a native language query interface. Queries written in the client side language (e.g., Java) are translated at compile time into an OLAP-specific grammar and passed to the DBMS at run-time. The grammar is encoded in XML. Listing 2 shows a simple XML query corresponding to the client side query of Listing 1.
B. Evaluation Schema
The test environment utilizes the familiar Star Schema format, in which multiple dimensions represent the primary entities of the organization. The specification for six common dimensions is listed below.
Size: 1,000,000 records
Hierarchy: C_ID → Region → Country
Size: 200,000 records
Hierarchy: P_ID → Type → Category
Size: 3,650 records
Hierarchy: DayName → Month → Quarter → Year
Size: 655 records
Hierarchy: StoreName → StoreCity → StoreState → StoreCountry
Size: 416 records
Hierarchy: VendorNumber → City → StateName → CountryName
Size: 300 records
Hierarchy: No hierarchy
C. Sample Queries
Evaluation queries have been designed so as to represent common OLAP query forms. Specifically, database access emphasizes grouping operations, slice and dice analysis, and hierarchical navigation. A series of test samples are included below.
1. SELECT c.Region, SUM(s.Tolal_Sales)
FROM customer as c, sales as s
WHERE s.C_ID = c.C_ID and c.Age = 50 and c.Region = ‘Quebec’
GROUP BY c.Region
2. SELECT t.Month, SUM(s.Tolal_Sales)
FROM time as t, sales as s
WHERE s.T_ID = t.t_ID and t.Year = 2005 and DayName = ‘Monday’ and t.Quarter = ‘Q1’
GROUP BY t.Month
3. SELECT p.Type, SUM(s.Tolal Sales)
FROM product as p, sales as s
WHERE s.P_ID = p.P_ID and p.ProdDesc = ‘Urna’ and p.Category = ‘Automotive’ and p.Quantity = 200
GROUP BY p.Type
4. SELECT s.StoreCity, t.Month, SUM(ss.Tolal Sales)
FROM time as t, store as s, sales as ss
WHERE t.T_ID = ss.T_ID and ss.S_ID = s.S_ID and ((t.year = 2005 and t.DayName = ‘Monday’) and (t.Quarter = ‘Q1’ or t.Quarter = ‘Q2’)) and s.StoreState = ‘Ontario’
GROUP BY s.StoreCity, t.Month
5. SELECT c.Region, p.Category, SUM(ss.Tolal Sales)
FROM customer as c, product as p, sales as ss
WHERE ss.C_ID = c.C_ID and ss.P_ID = p.P_ID and c.Age = 40 and c.Country = ‘Canada’ and p.Quantity = 200 and p.Category = ‘Automotive’
GROUP BY c.Region, p.Category
6. SELECT c.country, t.Month, SUm(ss.Tolal Sales)
FROM customer as c, time as t, sales as ss
WHERE ss.C ID = c.C ID and ss.T ID = t.T ID and (((t.year = 2005 and t.DayName = ‘Monday’) and (t.Month = ‘May’ or t.Month = ‘June’)) and (c.Age = 40 and c.Region = ‘Quebec’))
GROUP BY t.Month,c.country
7. SELECT c.Region, p.Type, s.StoreCity, SUM(ss.Tolal Sales)
FROM customer as c, product as p, store as s, sales as ss
WHERE ss.C_ID = c.C_ID and ss.S_ID = s.S_ID and ss.P_ID = p.P_ID and c.Region = ‘Ontario’ and s.StoreState = ‘Ontario’ and p.Category = ‘Household’
GROUP BY c.Region, p.Type, s.StoreCity
8. SELECT s.StoreCity, t.Month, SUM(ss.Tolal Sales)
FROM time as t,customer as c, store as s, sales as ss
WHERE t.T_ID = ss.T_ID and ss.C_ID = c.C_ID and ss.S_ID = s.S_ID and (t.year = 2005 and t.DayName = ‘Monday’) and (c.Age = 40 and c.Region = ‘Quebec’) and s.StoreState = ‘Ontario’
GROUP BY s.StoreCity, t.Month
9. SELECT t.Quarter, p.Type, s.StoreCity, SUM(ss.Tolal Sales)
FROM time as t, product as p, store as s, sales as ss
WHERE t.T_ID = ss.T_ID and ss.S_ID = s.S_ID and ss.P_ID = p.P_ID and t.Year = 2005 and t.Quarter = ‘Q1’ and s.StoreState = ‘On-tario’ and p.Category = ‘Household’
GROUP BY t.Quarter, p.Type, s.StoreCity
10. SELECT t.Quarter, c.Region, p.Type, s.StoreCity, SUM(ss.Tolal Sales)
FROM time as t, customer as c, product as p, store as s, sales as ss
WHERE t.T_ID = ss.T_ID and ss.C_ID = c.C_ID and ss.S_ID = s.S_ID and ss.P_ID = p.P_ID and c.Region = ‘Ontario’ and s.StoreState = ‘Ontario’ and p.Category = ‘Household’
GROUP BY c.Region, p.Type, s.StoreCity, t.Quarter
11. SELECT p.Type, s.StoreState, SUM(ss.Tolal Sales)
FROM time as t, customer as c, product as p, store as s, sales as ss
WHERE t.T ID = ss.T_ID and ss.C_ID = c.C_ID and ss.S_ID = s.S_ID and ss.P_ID = p.P_ID and c.Region = ‘Ontario’ and t.Quarter = ‘Q1’ and p.Category = ‘Household’
GROUP BY p.Type, s.StoreState
12. SELECT t.Quarter, SUM(ss.Tolal Sales)
FROM time as t,customer as c, product as p, store as s, sales as ss
WHERE t.T_ID = ss.T_ID and ss.C_ID = c.C_ID and ss.S_ID = s.S_ID and ss.P_ID = p.P_ID and c.Age = 40 and t.year = 2005 and s.StoreState = ‘Ontario’ and p.Type = ‘Engine’
GROUP BY t.Quarter