Aggregate Settings

The following settings configure aggregates at the model level.

aggregates.batch.buildFromExisting.enabled

  • Type: boolean
  • Default: true

Build aggregates from existing aggregates during aggregate batch rebuilds.

aggregates.batch.buildFromExisting.threshold

  • Type: integer
  • Default: 100

Use the optimized algorithm only when the number of aggregates in a batch is less than this threshold.

aggregates.batch.cube.attributes.threshold

  • Type: integer
  • Default: 250

Use the optimized algorithm only when the number of attributes in a model is less than this threshold.

aggregates.batch.cubeDataRequests.parallelism

  • Type: integer
  • Default: 2

Number of model data requests that can be made in parallel.

aggregate.batch.cube.gracePeriodOverrides.enabled

  • Type: boolean
  • Default: false

Allow specifying grace period overrides when building an aggregate batch for a model. For more information, see Rebuilding Aggregates Using the REST API.

Important

Enabling this functionality can strain your system with expensive aggregate rebuilds.

aggregates.batch.max.failures

  • Type: integer
  • Default: 0

The maximum number of failures allowed in a batch build before the whole batch fails.

aggregates.batch.retry.maxAttemptsPerAggregate

  • Type: integer
  • Default: 3

The maximum number of reattempts to build a single aggregate during a single batch build. This number cannot exceed the value of aggregates.batch.retry.maxAttemptsPerBatch.

aggregates.batch.retry.maxAttemptsPerBatch

  • Type: integer
  • Default: 5

The maximum number of reattempts to build aggregates during a single batch build.
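
The constraint between the two retry limits can be checked before new values are applied. The following is a minimal sketch, not part of AtScale: the function name is hypothetical, and the defaults simply mirror the documented defaults.

```python
def retry_limits_consistent(max_attempts_per_aggregate: int = 3,
                            max_attempts_per_batch: int = 5) -> bool:
    """Return True when the per-aggregate limit does not exceed the per-batch limit.

    Hypothetical helper for illustration; it only encodes the documented rule
    that maxAttemptsPerAggregate cannot exceed maxAttemptsPerBatch.
    """
    if max_attempts_per_aggregate < 0 or max_attempts_per_batch < 0:
        raise ValueError("retry limits must be non-negative")
    return max_attempts_per_aggregate <= max_attempts_per_batch
```

With the defaults (3 and 5) the check passes; a per-aggregate value of 6 against a per-batch value of 5 does not.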

aggregates.batch.reuseOrderingGraph.enabled

  • Type: boolean
  • Default: true

Use the existing ordering graph, if available, for rebuilding aggregates.

aggregates.build.scheduled.maintenance.strategy

  • Type: string
  • Default:

Maintenance strategy for handling build of pending aggregates if the retention limit is exceeded.

Supported values:

  • BuildTopPredicted: If the number of currently existing active aggregates is equal to the retention limit + extra allowance, the active aggregates will be pruned to the retention limit and the top predicted pending instances will be built. If the pending instances + the active instances exceed the retention limit, the top predicted pending aggregates will be built up to the retention limit. Otherwise all pending instances will be built.
  • PredictionsAsUsage: Pending and Active instances will be deactivated by the Maintainer. The number of predictions will be counted as usage for Pending instances.

aggregates.create.aggressiveDimensionalCopyPromotion.enabled

  • Type: boolean
  • Default: true

Enables aggressive promotion of dimensional hierarchy copies into preferred storage.

aggregates.create.allowDistinctSumMeasures.enabled

  • Type: boolean
  • Default: false

When set to True, distinct sum metrics may be included in system-defined aggregates.

aggregates.create.allowExactDistinctCountMeasures.enabled

  • Type: boolean
  • Default: false

When set to True, exact distinct count metrics may be included in system-defined aggregates.

aggregates.create.compression.threshold

  • Type: double
  • Default: 3.0

Specify the compression factor that aggregates proposed by the engine must meet or exceed. This factor is a metric of the quality of a proposed aggregate. It is calculated as the number of rows in the fact table divided by the estimated number of rows in a proposed aggregate.
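
The calculation described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the row counts are made-up inputs, and the engine's own estimation logic is not shown.

```python
def compression_factor(fact_rows: int, estimated_aggregate_rows: int) -> float:
    """Rows in the fact table divided by the estimated rows in the proposed aggregate."""
    if estimated_aggregate_rows <= 0:
        raise ValueError("estimated_aggregate_rows must be positive")
    return fact_rows / estimated_aggregate_rows


def meets_compression_threshold(fact_rows: int,
                                estimated_aggregate_rows: int,
                                threshold: float = 3.0) -> bool:
    """True when the factor meets or exceeds the threshold (default mirrors 3.0)."""
    return compression_factor(fact_rows, estimated_aggregate_rows) >= threshold
```

For example, a fact table of 3,000,000 rows and a proposed aggregate of 500,000 estimated rows gives a factor of 6.0, which meets the default threshold of 3.0.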

aggregates.create.demandDefined.enabled

  • Type: boolean
  • Default: true

Enables the creation of demand-defined system aggregates.

aggregates.create.dimensionalModifications.complexityLimit

  • Type: long
  • Default: 1500000

The estimated row-count limit for dimensionally modified aggregates (DMAs). It prevents the execution of excessively complex queries that may be too expensive to run or may create excessively large tables.

aggregates.create.dimensionalModifications.enabled

  • Type: boolean
  • Default: true

When set to True, aggregates containing dimensional modifications can be created.

aggregates.create.higherOrder.dimensionalAttributes.size

  • Type: integer
  • Default: 10

The limit on the number of dimensional attributes in a query for the build of a higher order aggregate to be considered.

aggregates.create.higherOrder.enabled

  • Type: boolean
  • Default: true

If set to True, allows the aggregate system to build aggregates at a higher level if the compression score at the lowest level is not met.

aggregates.create.includeHigherLevels.enabled

  • Type: boolean
  • Default: true

Set this to True to enable the addition of higher levels to system aggregates (without causing additional joins). For example, if an aggregate has the Day level, Month and Year are added automatically.

aggregates.create.includeHigherLevels.maxHierarchies

  • Type: integer
  • Default: 2

Specify the maximum number of hierarchies that a dimension in a system aggregate can be part of for higher level expansion to apply. Requires aggregates.create.includeHigherLevels.enabled to be enabled.

aggregate.create.joins.allowPreventIncremental.enabled

  • Type: boolean
  • Default: true

Whether or not to consider joining to a dataset that is not safe for incremental update if it would prevent this aggregate from otherwise being an incremental aggregate.

aggregates.create.joins.compression

  • Type: double
  • Default: 100.0

Specify the minimum compression ratio for any proposed join. This ratio is calculated as the cardinality of the join key in the fact table (or in the dimension table if that is not available) divided by the cardinality of the grouped dimension values, i.e., #(Key Cardinality) / #(Dim Table grouped by Dim Value). Joins for which the compression ratio is below this minimum will not be used.

aggregates.create.joins.enabled

  • Type: boolean
  • Default: true

Set to True to allow the AtScale engine to use joins when defining aggregates. This setting must be set to True for the other aggregates.create.joins.* settings to have an effect.

aggregates.create.joins.maximumDepth

  • Type: integer
  • Default: 3

Specify the maximum number of dimensions that can be traversed in a join path.

aggregates.create.joins.maximumKeyCardinality

  • Type: integer
  • Default: 10000000

Specify the maximum cardinality that the AtScale engine will allow in join keys when the engine is determining whether to use a join in the definition of an aggregate. Higher cardinalities will cause the engine not to use a join.
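
The join compression ratio and the key cardinality limit can be read together as two checks on a proposed join. The sketch below is a simplification under stated assumptions: the function name and inputs are hypothetical, only these two settings are modeled, and other factors the engine considers (such as join depth) are left out.

```python
def join_candidate_allowed(key_cardinality: int,
                           grouped_dim_values: int,
                           min_compression: float = 100.0,
                           max_key_cardinality: int = 10_000_000) -> bool:
    """Apply the two documented checks to a proposed join.

    key_cardinality:    cardinality of the join key
    grouped_dim_values: cardinality of the grouped dimension values
    Defaults mirror aggregates.create.joins.compression and
    aggregates.create.joins.maximumKeyCardinality.
    """
    if grouped_dim_values <= 0:
        raise ValueError("grouped_dim_values must be positive")
    if key_cardinality > max_key_cardinality:
        return False  # key cardinality is higher than the allowed maximum
    ratio = key_cardinality / grouped_dim_values
    return ratio >= min_compression  # ratios below the minimum are rejected
```

For example, a key cardinality of 1,000,000 grouped into 5,000 dimension values gives a ratio of 200 and passes with the defaults, while the same key grouped into 50,000 values gives a ratio of 20 and does not.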

aggregates.create.joins.nonInner.enabled

  • Type: boolean
  • Default: false

Whether to allow non-inner joins in system aggregates.

Warning

Enabling this option is unsafe if a model has any role-played dimensions.

aggregates.create.joins.prime.compression

  • Type: double
  • Default: 0.99

Specify the minimum compression ratio for a proposed join from a prime query part (where aggregates cannot be stored anywhere except in preferred storage).

aggregates.create.joins.smallJoins.enabled

  • Type: boolean
  • Default: false

Whether to allow small joins to be added to system aggregates.

aggregates.create.joins.smallJoins.maximumCompressionRatio

  • Type: double
  • Default: 10.0

The maximum compression ratio for valid small joins.

aggregates.create.joins.smallJoins.maximumKeyCardinality

  • Type: integer
  • Default: 100

The maximum key cardinality allowed for a small join.

aggregates.create.partition.hintedAggregate.enabled

  • Type: boolean
  • Default: true

Whether to partition hinted aggregates from query datasets using the model's partition key list. For this setting to have an effect, the setting TABLES.CREATE.PARTITIONS.ENABLED must be set to True.

aggregates.create.partition.systemDefinedAggregate.enabled

  • Type: boolean
  • Default: true

Set to True to enable the AtScale engine to partition system-defined aggregates. For this setting to have an effect, the setting TABLES.CREATE.PARTITIONS.ENABLED must be set to True.

aggregates.create.partition.systemDefinedAggregate.threshold

  • Type: double
  • Default: 50000.0

The estimated number of rows in the proposed aggregate table divided by the number of partitions determines the estimated number of rows per partition.

aggregates.create.partition.userDefinedAggregate.enabled

  • Type: boolean
  • Default: true

Set to True to enable the AtScale engine to partition user-defined aggregates. For this setting to have an effect, the setting TABLES.CREATE.PARTITIONS.ENABLED must be set to True.

aggregates.create.narrowing.dimensional.enabled

  • Type: boolean
  • Default: true

Allow aggregate narrowing when building dimension-only aggregates.

aggregate.create.securityDimensions.enabled

  • Type: boolean
  • Default: false

When set to True, aggregates containing attributes from row security objects can be created.

aggregates.create.threshold.enabled

  • Type: boolean
  • Default: true

Set to True to apply the aggregates.create.compression.threshold setting.

aggregates.create.useIncidentalData.enabled

  • Type: boolean
  • Default: true

Set to False to disable the creation of dimensional aggregates from dimensions that are not strictly related, using incidental data.

aggregates.create.widening.enabled

  • Type: boolean
  • Default: true

Set to True to allow the engine to define new aggregates as wider versions of existing aggregates. Wider aggregates contain more metrics than their predecessors.

aggregates.create.widening.measure.limit

  • Type: integer
  • Default: 20

Specify the maximum number of metrics that can be added when widening. This setting requires aggregates.create.widening.enabled to be set to True.

aggregates.create.withDistinctCounts.queryLevel.enabled

  • Type: boolean
  • Default: true

Allows the creation of distinct count aggregates without the query part join keys. When true, distinct count aggregates are created at the hierarchy level specified in the query. When disabled, distinct count aggregates are created at the leaf level of the hierarchy. Note that this setting requires the aggregates.create.allowExactDistinctCountMeasures.enabled global setting to also be enabled.

aggregates.create.withoutCompressionEstimate.enabled

  • Type: boolean
  • Default: false

Allow new aggregates to be created without estimated compression ratios (i.e. when statistics are not available).

aggregates.dataWarehouseCacheTableRequests.enabled

  • Type: boolean
  • Default: true

Toggles whether to cache data warehouse table requests.

aggregates.dataWarehouseCacheTableRequests.maximumRowCount

  • Type: integer
  • Default: 50000

The maximum number of rows for a data warehouse table request to be cacheable, if caching is enabled. A negative or zero value is interpreted as allowing unlimited rows.

aggregate.definitions.import.size.limit

  • Type: integer
  • Default:

The size limit, in megabytes (1MB = 1024 * 1024 bytes), for aggregate definition files provided to an import request. No definitions from files exceeding this limit will be imported.
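
Because the limit is expressed in megabytes of 1024 * 1024 bytes, a file size in bytes has to be compared against the converted value. The following is a minimal sketch of that conversion; the function name is hypothetical, and the limit is passed explicitly because no default is listed.

```python
BYTES_PER_MB = 1024 * 1024  # as defined in the setting description


def within_import_size_limit(file_size_bytes: int, limit_mb: int) -> bool:
    """True when a definitions file does not exceed the configured limit."""
    if file_size_bytes < 0 or limit_mb < 0:
        raise ValueError("sizes must be non-negative")
    return file_size_bytes <= limit_mb * BYTES_PER_MB
```

With a limit of 1, a file of exactly 1,048,576 bytes is accepted and a file one byte larger is not.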

aggregates.demandDefined.disabled.deactivate

  • Type: boolean
  • Default:

Whether to deactivate aggregate active instances if demand-defined aggregates are disabled. This setting will take effect only if aggregates.create.demandDefined.enabled is set to false.

aggregates.dimensional.allowJoinsToSecondaries.enabled

  • Type: boolean
  • Default: true

Whether or not to allow joins to secondary attributes to be filtered out for completeness testing.

aggregates.dimensional.build

  • Type: boolean
  • Default: true

Set to True to allow the engine to create aggregates that contain dimensional attributes only. Such aggregates can be useful in Tableau for queries against fact tables that contain degenerate dimensions.

aggregates.dimensionalModifications.ignorefromAggregate

  • Type: boolean
  • Default:

Whether to ignore the from-aggregate attribute in DMAs when comparing plans. Such plans can result in the creation of two identical aggregates.

aggregates.dimensionalModifications.parallelPeriodRanges.enabled

  • Type: boolean
  • Default:

Whether the creation of DMAs built from ParallelPeriod used inside of Range is enabled. Note that such aggregates may result in huge queries against the data warehouse.

aggregates.dimensionalModifications.retentionLimit

  • Type: integer
  • Default: 30

The number of active instances of dimensionally modified aggregates retained per model.

aggregate.incrementalUpdate.allFragmentMaterializations.duration

  • Type: duration
  • Default:

Specify the maximum length of time to allow for an incremental build of an aggregate table.

aggregate.incrementalUpdate.enabled

  • Type: boolean
  • Default: true

Set to True to use incremental builds for all of the aggregates for a model when the fact dataset uses an incremental indicator. Full builds are still done for user-defined aggregates that are joins or unions of two or more tables.

aggregate.incrementalUpdate.indicatorLookup.addPreviousMaxConstraint

  • Type: boolean
  • Default: true

Whether or not to use the previous MAX value of the indicator as a constraint to improve performance.

aggregate.incrementalUpdate.indicatorLookup.duration

  • Type: duration
  • Default: 30 minutes

Maximum time allowed to look up the incremental indicator.

aggregate.incrementalUpdate.indicatorLookup.reusePerBatch

  • Type: boolean
  • Default: true

Whether or not to use the same max indicator lookup per batch per indicator to improve performance.

aggregate.incrementalUpdates.immutable.enabled

  • Type: boolean
  • Default: true

Set to True to enable incremental builds of aggregates that use joins on rarely changing dimensions.

aggregate.incrementalUpdate.maxConsecutiveStaticFragments

  • Type: integer
  • Default:

Specify the maximum number of fragments to allow for each incrementally built aggregate. When this threshold is exceeded, the fragments are consolidated. Lower values relative to the default result in slower consolidations and faster queries. Higher values result in faster consolidation and slower queries.

aggregate.incrementalUpdates.semiAdditive.enabled

  • Type: boolean
  • Default: false

Set to True to enable support for incremental updates on semi-additive metrics.

aggregates.largeTableOptimization.distributionKeyColumn.minimumCardinality

  • Type: long
  • Default: 30

The minimum cardinality required for the highest cardinality dimensional attribute to be used as a table distribution or clustering key.

aggregates.largeTableOptimization.enabled

  • Type: boolean
  • Default: false

Enables consideration of optimizations for large aggregate tables, such as column-based clustering or distribution.

aggregates.largeTableOptimization.minimumEstimatedRows

  • Type: long
  • Default: 100000

The minimum estimated row count required for the aggregate system to consider applying optimizations such as clustering or distribution.

aggregates.maintenance.deactivateUnused.enabled

  • Type: boolean
  • Default:

Whether to deactivate system-defined aggregates that have been unused for the required time to live (as indicated by aggregates.maintenance.zeroUtilizationTTL).

aggregate.maintenance.job.cleanup-invalid.parallelism

  • Type: integer
  • Default:

Number of cleanup requests per model.

aggregates.maintenance.zeroUtilizationTTL

  • Type: duration
  • Default:

The minimum time with no usages that must pass before an aggregate will be deactivated.

aggregates.new.build.scheduled

  • Type: boolean
  • Default: false

When true, new aggregate instance builds are postponed until the next scheduled batch build for the model. If false, new aggregate instances may be queued for building at any time.

aggregate.partition.bigquery.range.end

  • Type: long
  • Default: 10000

This value is used as the upper bound for integer value partitioning on Google BigQuery.

aggregate.partition.bigquery.range.interval

  • Type: long
  • Default: 10

This value is used as the interval for integer value partitioning on Google BigQuery.

aggregate.partition.bigquery.range.start

  • Type: long
  • Default: 0

This value is used as the lower bound for integer value partitioning on Google BigQuery.
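
The three range settings can be read together. The sketch below assumes the values are passed unchanged to BigQuery integer range partitioning, where the start is inclusive and the end is exclusive; that pass-through is an assumption, and the function names are hypothetical.

```python
import math
from typing import Optional


def partition_count(start: int = 0, end: int = 10_000, interval: int = 10) -> int:
    """Number of range partitions covering [start, end) in steps of interval."""
    if interval <= 0 or end <= start:
        raise ValueError("interval must be positive and end must exceed start")
    return math.ceil((end - start) / interval)


def partition_lower_bound(value: int, start: int = 0, end: int = 10_000,
                          interval: int = 10) -> Optional[int]:
    """Lower bound of the partition holding value, or None if outside [start, end)."""
    if value < start or value >= end:
        return None  # outside the configured range
    return start + ((value - start) // interval) * interval
```

With the defaults (0, 10000, 10) this yields 1000 partitions, and the value 1234 falls in the partition that starts at 1230.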

aggregates.prediction.checkForNonAdditiveMeasuresAndConstraints.enabled

  • Type: boolean
  • Default:

When set to True, aggregate candidates traced to queries selecting distinct count metrics and with a non-equals WHERE constraint are rejected because they attempt to re-aggregate the data.

aggregates.predictionDefined.build.scheduled

  • Type: boolean
  • Default: false

When false, new prediction-defined aggregate build instances are queued for building following the catalog deploy event. If true, PDA builds are postponed until the next scheduled batch build for the model.

aggregates.prediction.reports.skipAggregateCreation

  • Type: boolean
  • Default:

Whether to skip aggregate creation after prediction. This setting is intended for the case when a prediction report is required without the need of an actual aggregate.

aggregates.prediction.reports.storePredictorData

  • Type: boolean
  • Default:

Whether to store reports generated during predictions in PostgreSQL.

aggregate.pruning.history.enabled

  • Type: boolean
  • Default:

Pruning history will be stored if this setting is enabled.

aggregate.pruning.history.verbose

  • Type: boolean
  • Default:

Pruning history will contain details on pruned items if this setting is enabled.

aggregates.slowBuild.cutoff

  • Type: duration
  • Default: 4 seconds

The duration cutoff for a completed aggregate build query to emit a SlowAggEvent.

aggregates.smallTableReplication.enabled

  • Type: boolean
  • Default: false

Whether to consider applying data replication to system-defined aggregate tables.

aggregates.smallTableReplication.factBasedAggs.enabled

  • Type: boolean
  • Default: false

Whether to allow aggregate table replication on fact-based aggregates.

aggregates.smallTableReplication.maximumEstimatedRows

  • Type: long
  • Default: 1000

The maximum estimated number of rows that an aggregate table can have to be considered for replication.

aggregate.speculative.allmember.enabled

  • Type: boolean
  • Default: true

Set to True to enable the AtScale engine to create, for each fact table, a speculative aggregate that contains only the metrics in the corresponding fact table.

aggregate.speculative.allmember.rerun.schemaUpdates.enabled

  • Type: boolean
  • Default: true

Whether or not to rerun all-member speculative aggregates on schema updates.

aggregate.speculative.dimensional.enabled

  • Type: boolean
  • Default: true

Set to True to enable the AtScale engine to define dimension-only speculative aggregates, which are used to populate filters in BI client software.

aggregate.speculative.dimensional.rerun.schemaUpdates.enabled

  • Type: boolean
  • Default: true

Whether or not to rerun dimensional speculative aggregates on schema updates.

aggregate.speculative.enabled

  • Type: boolean
  • Default: true

Set to True to activate the other aggregate.speculative.* settings.

aggregate.speculative.dimensional.minCompressionRatio

  • Type: integer
  • Default: 10

Specify the ratio as the number of rows in the full dimension dataset divided by the number of rows in the proposed aggregate for a level in the dimensional hierarchy.

aggregate.speculative.superAggregate.compression

  • Type: double
  • Default: 2.0

Compression of the super aggregate relative to the dataset on which it is created.

aggregate.speculative.superAggregate.enabled

  • Type: boolean
  • Default: false

Set to True to enable building a super aggregate consisting of dimensions and metrics as long as the row count is below a threshold.

aggregate.speculative.superAggregate.rerun.schemaUpdates.enabled

  • Type: boolean
  • Default: true

Whether or not to rerun super aggregates on schema updates.

aggregate.speculative.updatedStatsTrigger.delay

  • Type: duration
  • Default: 10 seconds

Time to allow stats updates to queue before re-running the speculative aggregator.

aggregates.systemGenerated.activeInstance.extraAllowance

  • Type: integer
  • Default: 2

The maximum number of additional system-defined aggregates temporarily permitted per model when the retention limit is reached.

Note

Setting this value too high will cause long aggregate batch build times and may impact data warehouse workloads.

aggregates.systemGenerated.activeInstance.retentionLimit

  • Type: integer
  • Default: 20

The maximum number of system-defined aggregates retained per model.

Note

Setting this value too high will cause long aggregate batch build times and may impact data warehouse workloads.

aggregates.systemGenerated.higherOrder.retentionPercentage

  • Type: integer
  • Default: 30

The maximum number of active higher order aggregates per model is calculated as this percentage of the value of aggregates.systemGenerated.activeInstance.retentionLimit for each model. This pool of system-defined aggregate tables is sized and counted separately from the regular system-defined aggregate table pool.

aggregates.systemGenerated.withDistinctCountMeasure.retentionPercentage

  • Type: integer
  • Default: 40

The maximum number of active aggregates with distinct count metrics per model is calculated as this percentage of the value of aggregates.systemGenerated.activeInstance.retentionLimit for each model. This pool of system-defined aggregate tables is sized and counted separately from the regular system-defined aggregate table pool.

aggregates.systemGenerated.withDistinctSumMeasure.retentionPercentage

  • Type: integer
  • Default: 40

The maximum number of active aggregates with distinct sum metrics per model is calculated as this percentage of the value of aggregates.systemGenerated.activeInstance.retentionLimit for each model. This pool of system-defined aggregate tables is sized and counted separately from the regular system-defined aggregate table pool.

To prevent distinct sum aggregates from dominating the aggregate retention limit, AtScale defines a separate pool of distinct sum aggregates as a percentage of the overall model-scoped system-defined aggregate retention limit. The percentage is controlled by this setting, at either the engine or model level. It accepts values between 0 and 100. The size of the distinct sum aggregate pool is calculated as a percentage of the effective model value of the "retentionLimit" setting (see above).

For example, if the model's effective retention limit is 100, and aggregates.systemGenerated.withDistinctSumMeasure.retentionPercentage is 40, then the system will allow the creation of up to 40 system-defined aggregates that use distinct sum measures.
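
The same arithmetic applies to each of the retentionPercentage settings above. The sketch below is illustrative: the function name is hypothetical, and rounding down when the result is not a whole number is an assumption, since the rounding behavior is not stated here.

```python
def retention_pool_size(retention_limit: int, retention_percentage: int) -> int:
    """Size of a separate aggregate pool as a percentage of the retention limit.

    Assumption: fractional results are rounded down.
    """
    if not 0 <= retention_percentage <= 100:
        raise ValueError("retention_percentage must be between 0 and 100")
    if retention_limit < 0:
        raise ValueError("retention_limit must be non-negative")
    return retention_limit * retention_percentage // 100
```

With a retention limit of 100 and a percentage of 40 the pool size is 40, matching the example above. With the default retention limit of 20, a percentage of 40 gives 8 and a percentage of 30 gives 6.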

aggregates.uda.build.scheduled

  • Type: boolean
  • Default: false

When false, new user-defined aggregate build instances are queued for building following the catalog deploy event. If true, UDA builds are postponed until the next scheduled batch build for the model.

aggregates.withDistinctCounts.widening.enabled

  • Type: boolean
  • Default:

Allows widening of distinct count aggregates with other metrics (including distinct counts). This setting requires aggregates.create.widening.enabled to be set to True.

aggregates.withDistinctSums.widening.enabled

  • Type: boolean
  • Default:

Allows widening of distinct sum aggregates with other metrics (including distinct sums). This setting requires aggregates.create.widening.enabled to be set to True.