Skip to main content

Aggregates Overview

AtScale incorporates the data-warehousing concept of aggregate tables. Such tables most often contain measures from one or more fact datasets and include aggregated values for these measures. (There are dimension-only aggregate tables.) The aggregation of the data is at the level of one or more dimensional attributes or, if no dimensional attributes are included, the aggregated data is a total of the values for the included measures.

With aggregate tables, query engines can improve the performance of queries that request aggregated data such as Sales Amounts by Month or Number of Orders by Country.

Without aggregate tables, query engines need to scan regular tables to return aggregated data, and those tables could have very high row counts. The engines then have to perform the calculations required to aggregate data. Both scanning and calculating are actions that are resource-intensive and time-consuming. Query engines that have no aggregate tables to use must perform these actions for each query on a measure.

Challenges with implementing and using aggregate tables

Two significant drawbacks to aggregate tables, however, are the time and effort involved in designing, testing, deploying, periodically refreshing the data in them (called rebuilding), and monitoring them. It can take data modelers weeks to design aggregate tables based on requirements gathered from stakeholders or based on query analysis performed on existing OLAP models. The tables then need to be deployed and loaded on test systems where modelers can iteratively test and revise them. After all of that comes the meticulous task of deploying the tables on production systems.

Deployed aggregate tables need constant monitoring of their utilization rates, and the response times for the queries that use them also need to be monitored. Monitoring can lead to the redesign of aggregate tables and the creation of new ones. These tasks can involve more than one team (e.g. a team of data modelers and a team extract, transform, and load the data for the test aggregate tables), and therefore require significant time and effort as tables are iteratively designed, tested, and deployed.

Finally, aggregate tables need careful rebuilding at intervals, so that they include the newest data and the most recent updates to an OLAP model. Rebuilding incorrectly can lead to query results that are inconsistent from one build to the next.

How AtScale eliminates these drawbacks

AtScale lets you take advantage of the power of aggregate tables without spending all of that time and effort normally required.

AtScale's adaptive cache technology is able to analyze the queries against your models and model metadata, determine the optimal aggregate tables for improving performance of those queries and possible future queries, and define and create aggregate tables automatically on the fly. Moreover, rather than simply building an aggregate table once and then rebuilding it over time as new data arrives or is updated in your cluster, AtScale continuously optimizes its collection of aggregate tables for changing query workloads.

Because the AtScale engine does this, you don't have to ask yourself questions such as "How do I know when I have 'enough' aggregate tables?" or "How do I know that I have the right aggregate tables?". The engine is always at work creating a sufficient number of aggregate tables that are optimized for your query workloads. All of this optimization happens after a model is published, when the model is in production and being queried by business intelligence applications.

This means that you don't have to make an up-front expenditure of time and effort to design, test, and deploy aggregate tables before your models start providing business value. After a data modeler designs and publishes a model in the AtScale Design Center, client applications can start querying it right away. AtScale's adaptive cache continuously learns about the query workloads on that model and defines and redefines aggregate tables to satisfy them.