Enabling In-Memory Dimensional Aggregates

In-memory dimensional aggregates are dimension-only aggregate tables that are loaded and queried in memory. These can be used to improve response times for filtered dimensional lookup queries.

Note

To enable in-memory dimensional aggregates, you must have a special AtScale license. For more information, reach out to AtScale Support.

At a high level, the process for enabling in-memory dimensional aggregates is as follows:

  1. Enable the functionality in your settings.
  2. Update your UDAs to pin them to the local in-memory cache. This ensures the system attempts to load dimension-only UDAs into memory.
  3. If you plan on using this functionality with row security objects, update your UDAs and row security objects to enable them to join.

The following sections describe these steps in detail.

Configure AtScale to use in-memory dimensional aggregates

To configure AtScale to use in-memory dimensional aggregates, enable the following settings. For information on changing settings, see Configuring Global Settings and Configuring Model Settings.

  • aggregates.tableCache.enabled (global level): Configures the in-memory dimension-only cache functionality. When enabled, the system automatically loads dimension-only aggregates that fit within the in-memory aggregate pool.
  • aggregates.dimensional.allowJoinsToSecondaries.enabled (global and model levels): Optional. Enables joining to secondary in-memory dimension aggregate tables. This must be set to true to join to in-memory aggregates for row security objects.
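As a reference sketch only, the two settings above expressed as key-value pairs would read as follows (the actual entry mechanism depends on your deployment; see Configuring Global Settings and Configuring Model Settings):

```
aggregates.tableCache.enabled = true
aggregates.dimensional.allowJoinsToSecondaries.enabled = true
```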

Pin UDAs to the local in-memory cache

To pin your UDAs to the in-memory cache, add the caching: engine-memory property to their SML definitions. For example:

```yaml
unique_name: internet_sales
object_type: model
label: Internet Sales
aggregates:
  - unique_name: Color
    label: Color
    attributes:
      - name: Color
        dimension: Color Dimension
    caching: engine-memory
```

This property tells the cache to include the aggregate regardless of its size and the cache's current contents. Once the UDA is built, each AtScale container loads it into memory, and the system automatically reloads the aggregate table whenever it changes.

For more information, refer to the AtScale SML Object Documentation on GitHub.

Configure joins with row security objects

To enable in-memory dimension-only aggregate tables to join with aggregate tables for row security objects:

  1. If you haven't already, enable the aggregates.dimensional.allowJoinsToSecondaries.enabled global setting. For more information, see Configuring Global Settings.
  2. On the row security object, set Lookup Rules to Use Filter Key. For more information, see Creating a Row Security Object.
  3. Create a UDA that contains the row security object and its filter attribute key. Be sure to include the caching: engine-memory property in its SML definition. For instructions on creating UDAs, see Defining Aggregates Yourself.
  4. Create another UDA that contains only the row security object's attribute filter key. Be sure to add the caching: engine-memory property to its SML.
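The caching placement for steps 3 and 4 can be sketched as below. This is a hypothetical illustration only: the aggregate names and the attribute names are placeholders, and the exact way a row security object and its filter attribute key are referenced in an aggregate definition is omitted here. Consult the AtScale SML Object Documentation for the precise schema.

```yaml
aggregates:
  # Step 3: UDA containing the row security object and its filter attribute key
  # (placeholder names; replace with your model's objects)
  - unique_name: secured_lookup
    label: Secured Lookup
    attributes:
      - name: Filter Attribute Key
        dimension: Secured Dimension
    caching: engine-memory    # required to pin this UDA in memory
  # Step 4: UDA containing only the row security object's filter attribute key
  - unique_name: filter_key_only
    label: Filter Key Only
    attributes:
      - name: Filter Attribute Key
        dimension: Secured Dimension
    caching: engine-memory    # required to pin this UDA in memory
```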

Tuning the cache

You can use the following global settings to configure the in-memory dimensional aggregate cache. For more information on configuring global settings, see Configuring Global Settings.

  • aggregates.tableCache.nioMemFS.enabled: Determines whether to use the nioMemFS storage system, which provides off-heap storage. The default value is true. If set to false, the on-heap storage is used instead.
  • aggregates.tableCache.addTableToCache.concurrency: The maximum number of concurrent in-memory add-to-cache querying processes. The default value is 1. If you observe high latency between engine startup and cache bootstrapping, try increasing this setting.
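For reference, the defaults of the two tuning settings above expressed as key-value pairs (again, the actual entry mechanism depends on your deployment):

```
aggregates.tableCache.nioMemFS.enabled = true
aggregates.tableCache.addTableToCache.concurrency = 1
```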

Limitations and known issues

When using in-memory dimensional aggregates, be aware of the following limitations and issues:

  • Attributes of type DECIMAL or NUMERIC can't be used with in-memory dimensional aggregates. AtScale won't load a UDA into memory if it contains an attribute of type DECIMAL or NUMERIC.
  • UDAs that contain metrics are not loaded into memory.
  • This functionality only supports equal (=) operators. Queries that use the LIKE function are not routed to the in-memory datastore.
  • UDAs that use hierarchy levels or secondary attributes that have custom empty members defined are not loaded into memory.
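To illustrate the operator limitation, consider two filters against a Color attribute like the one in the earlier SML example (the table and column names here are hypothetical). Only the equality filter is eligible for routing to the in-memory datastore:

```sql
-- Equality filter: can be served from the in-memory datastore
SELECT color FROM internet_sales WHERE color = 'Red';

-- LIKE filter: not routed to the in-memory datastore;
-- this query is answered by the underlying data warehouse instead
SELECT color FROM internet_sales WHERE color LIKE 'R%';
```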

Troubleshooting

Memory issues

If you observe memory issues with the off-heap feature, the cause is likely the -XX:MaxDirectMemorySize JVM property. The machine must have at least the amount of physical memory this property defines available to the engine. If it doesn't, and enough table rows are loaded into memory, the process may exhaust memory and crash the machine. For more information, refer to the Oracle Java documentation.
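As a hypothetical illustration, the direct-memory ceiling is a standard JVM option of the following form (the 8g value is an example only; apply the flag wherever your deployment sets the engine's JVM options):

```
-XX:MaxDirectMemorySize=8g
```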

Query failures

If a query fails and the target aggregate was a UDA pinned to the cache, remove the caching: engine-memory property from its SML definition and redeploy the model. If failures continue, disable the cache altogether.

Frequently asked questions

Will the system still interoperate with the query result cache?

Yes, queries that would be routed to the in-memory table cache can still be serviced by the in-memory query cache.

How do I know if a query hit the table cache?

The query shows H2-Dialect as the dialect used for an outgoing subquery.

Is there a way to control how long data stays in the cache?

There is no TTL for this cache. As long as an aggregate remains valid, it may stay in the cache. When an aggregate is removed or updated, the corresponding cached copy is removed as well.

What if the aggregates are updated (like after an aggregate rebuild)?

The cache respects the presence of new aggregates in the same way that the planner does: Invalidated/updated aggregates will never be hit if they have stale data.

How big can the cache be made?

If using off-heap storage (the default), the cache size is governed by the -XX:MaxDirectMemorySize JVM property. If using on-heap storage instead, keep the cache much smaller than the JVM heap allocation.

Can the cache be set to persist to disk?

Not supported. Persisting this data to disk on engine nodes violates AtScale’s current security policies.

Is the caching mechanism similar to the RDD concept in Spark?

It is similar to pinning Spark SQL tables in memory, but RDDs are a different concept.

What determines when the cache will be flushed or reloaded?

When aggregates are removed, added, or updated, their corresponding cached entries are updated accordingly.