Class PlannerSettings
-
Nested Class Summary
Nested Classes -
Field Summary
Fields
static final Setting<ByteSizeValue> BLOCK_LOADER_SIZE_ORDINALS: Circuit breaker space reserved for each ordinals BlockLoader.Reader.
static final Setting<ByteSizeValue> BLOCK_LOADER_SIZE_SCRIPT: Circuit breaker space reserved for each script BlockLoader.Reader.
BYTES_REF_RAM_OVERESTIMATE_FACTOR: Multiplier applied to the RAM estimate of a BytesRefArrayVector whose backing values exceed BYTES_REF_RAM_OVERESTIMATE_THRESHOLD.
static final Setting<ByteSizeValue> BYTES_REF_RAM_OVERESTIMATE_THRESHOLD: When a BytesRefArrayVector's average value length exceeds this size, the RAM estimate is multiplied by BYTES_REF_RAM_OVERESTIMATE_FACTOR to account for untracked overhead in large byte arrays.
static final Setting<DataPartitioning> DEFAULT_DATA_PARTITIONING
static final PlannerSettings DEFAULTS: Defaults.
DOC_SEQUENCE_BYTES_REF_FIELD_THRESHOLD: When loading from a multi-leaf doc vector that maps to a single shard and segment, the reader switches to a doc-sequential iteration order if the number of BYTES_REF fields exceeds this threshold.
DOC_THRESHOLD_AUTO_PARTITIONING: The minimum number of documents in a shard before we select the DataPartitioning.AutoStrategy for DataPartitioning.AUTO, the default of DEFAULT_DATA_PARTITIONING.
static final Setting<ByteSizeValue> INTERMEDIATE_LOCAL_RELATION_MAX_SIZE
LUCENE_TOPN_LIMIT
MAX_KEYWORD_SORT_FIELDS: Maximum number of keyword sort fields allowed when pushing TopN to Lucene.
PARTIAL_AGGREGATION_EMIT_KEYS_THRESHOLD: The threshold number of grouping keys for a partial aggregation to start emitting intermediate results early.
PARTIAL_AGGREGATION_EMIT_UNIQUENESS_THRESHOLD: The uniqueness threshold of grouping keys for partial aggregation to start emitting keys early.
REDUCTION_LATE_MATERIALIZATION
REUSE_COLUMN_LOADERS_THRESHOLD: If we're loading more than this many fields at a time we discard column loaders after each page regardless of whether we can reuse them.
SOURCE_RESERVATION_FACTOR: Multiplier applied to lastKnownSourceSize to pre-reserve memory on the circuit breaker before loading _source.
static final Setting<ByteSizeValue> VALUES_LOADING_JUMBO_SIZE -
Constructor Summary
Constructors
PlannerSettings(DataPartitioning defaultDataPartitioning, int docsThresholdForAutoPartitioning, ByteSizeValue valuesLoadingJumboSize, int luceneTopNLimit, ByteSizeValue intermediateLocalRelationMaxSize, int partialEmitKeysThreshold, double partialEmitUniquenessThreshold, int reuseColumnLoadersThreshold, ByteSizeValue blockLoaderSizeOrdinals, ByteSizeValue blockLoaderSizeScript, int maxKeywordSortFields, double sourceReservationFactor, ByteSizeValue bytesRefRamOverestimateThreshold, double bytesRefRamOverestimateFactor, int docSequenceBytesRefFieldThreshold): Create. -
Method Summary
Methods
blockLoaderSizeOrdinals(): Circuit breaker space reserved for each ordinals BlockLoader.Reader.
blockLoaderSizeOrdinals(ByteSizeValue blockLoaderSizeOrdinals)
blockLoaderSizeScript(): Circuit breaker space reserved for each script BlockLoader.Reader.
blockLoaderSizeScript(ByteSizeValue blockLoaderSizeScript)
double bytesRefRamOverestimateFactor()
bytesRefRamOverestimateFactor(double bytesRefRamOverestimateFactor)
bytesRefRamOverestimateThreshold(ByteSizeValue bytesRefRamOverestimateThreshold)
defaultDataPartitioning(DataPartitioning defaultDataPartitioning)
int docSequenceBytesRefFieldThreshold()
docSequenceBytesRefFieldThreshold(int docSequenceBytesRefFieldThreshold)
int docsThresholdForAutoPartitioning()
docsThresholdForAutoPartitioning(int docsThresholdForAutoPartitioning)
intermediateLocalRelationMaxSize(ByteSizeValue intermediateLocalRelationMaxSize)
int luceneTopNLimit(): Maximum LIMIT that we're willing to push to Lucene's topn.
luceneTopNLimit(int luceneTopNLimit)
int maxKeywordSortFields()
maxKeywordSortFields(int maxKeywordSortFields)
int partialEmitKeysThreshold()
partialEmitKeysThreshold(int partialEmitKeysThreshold)
double partialEmitUniquenessThreshold()
partialEmitUniquenessThreshold(double partialEmitUniquenessThreshold)
int reuseColumnLoadersThreshold(): If we're loading more than this many fields at a time we discard column loaders after each page regardless of whether we can reuse them.
reuseColumnLoadersThreshold(int reuseColumnLoadersThreshold)
settings()
double sourceReservationFactor()
sourceReservationFactor(double sourceReservationFactor)
valuesLoadingJumboSize(ByteSizeValue valuesLoadingJumboSize)
-
Field Details
-
DEFAULT_DATA_PARTITIONING
-
DOC_THRESHOLD_AUTO_PARTITIONING
The minimum number of documents in a shard before we select the DataPartitioning.AutoStrategy for DataPartitioning.AUTO, the default of DEFAULT_DATA_PARTITIONING. For shards with fewer documents than the threshold, DataPartitioning.SHARD will be used for DataPartitioning.AUTO. -
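The threshold behavior described above can be sketched as follows. This is an illustrative sketch only; the class, method, and enum names are hypothetical stand-ins, not the actual Elasticsearch implementation.

```java
// Sketch: how DataPartitioning.AUTO could resolve against the document
// threshold. Names here are illustrative, not the real ESQL classes.
public class AutoPartitioningSketch {
    enum Partitioning { SHARD, AUTO_STRATEGY }

    // Below the threshold the shard is small enough that splitting it into
    // slices is not worth the per-slice overhead, so the whole shard is
    // treated as one partition.
    static Partitioning resolveAuto(long docsInShard, long threshold) {
        return docsInShard >= threshold ? Partitioning.AUTO_STRATEGY : Partitioning.SHARD;
    }

    public static void main(String[] args) {
        System.out.println(resolveAuto(100, 1000));   // SHARD
        System.out.println(resolveAuto(5000, 1000));  // AUTO_STRATEGY
    }
}
```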
VALUES_LOADING_JUMBO_SIZE
-
LUCENE_TOPN_LIMIT
-
INTERMEDIATE_LOCAL_RELATION_MAX_SIZE
-
REDUCTION_LATE_MATERIALIZATION
-
BLOCK_LOADER_SIZE_SCRIPT
Circuit breaker space reserved for each script BlockLoader.Reader. The default is a pretty poor estimate of the script's overhead, but it'll do for now. We're estimating 100kb for loading ordinals from doc values and 2kb for loading numbers from doc values. This 300kb is sort of a shrug because we don't know what the script will do, we don't know how many doc values it'll load, and we're not sure how much memory the script itself will actually use. -
BLOCK_LOADER_SIZE_ORDINALS
Circuit breaker space reserved for each ordinals BlockLoader.Reader. Measured in heap dumps from 3.5kb to 65kb. This is an intentional overestimate. -
PARTIAL_AGGREGATION_EMIT_KEYS_THRESHOLD
The threshold number of grouping keys for a partial aggregation to start emitting intermediate results early. While emitting partial results can reduce memory pressure and allow for incremental downstream processing, it might emit the same keys multiple times, incurring serialization and network overhead. This setting, in conjunction with PARTIAL_AGGREGATION_EMIT_UNIQUENESS_THRESHOLD, helps mitigate these costs by only triggering early emission when a significant number of keys have been collected and most are unique, thus lowering the probability of re-emitting the same keys. NOTE that the defaults are chosen somewhat arbitrarily but are partially based on other systems. Other systems sometimes default to a lower threshold (e.g., 10,000) without a uniqueness threshold. We may lower these defaults after benchmarking more use cases.
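An illustrative sketch of how the two thresholds could combine to gate early emission: enough keys must be collected, and a high enough fraction of incoming rows must have produced new keys. The method and parameter names are hypothetical, not the actual ESQL aggregation code.

```java
// Sketch: gate early emission on both the key-count threshold and the
// uniqueness ratio. Illustrative names only.
public class EarlyEmitSketch {
    static boolean shouldEmitEarly(long uniqueKeys, long rowsSeen,
                                   long keysThreshold, double uniquenessThreshold) {
        if (uniqueKeys < keysThreshold) {
            return false; // not enough keys collected yet
        }
        // Fraction of input rows that produced a distinct key. When it's
        // high, keys rarely repeat, so re-emitting the same key is unlikely.
        double uniqueness = (double) uniqueKeys / rowsSeen;
        return uniqueness >= uniquenessThreshold;
    }

    public static void main(String[] args) {
        // 100k keys out of 110k rows: highly unique, emit early.
        System.out.println(shouldEmitEarly(100_000, 110_000, 100_000, 0.8)); // true
        // 100k keys out of 1M rows: heavy repetition, keep buffering.
        System.out.println(shouldEmitEarly(100_000, 1_000_000, 100_000, 0.8)); // false
    }
}
```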
-
PARTIAL_AGGREGATION_EMIT_UNIQUENESS_THRESHOLD
The uniqueness threshold of grouping keys for partial aggregation to start emitting keys early. This threshold controls the trade-off between the benefits of early emission and the costs of repeated serialization and network transfer of the same keys. A higher uniqueness ratio ensures early emission only if keys are not repeatedly seen in incoming data and are unlikely to appear again in future data. -
REUSE_COLUMN_LOADERS_THRESHOLD
If we're loading more than this many fields at a time we discard column loaders after each page regardless of whether we can reuse them. They have significant per-field memory overhead, so discarding them between pages allows some queries that would have OOMed to succeed. Usually the paths that need very high performance don't load more than a handful of fields at a time, so they do reuse loaders. -
MAX_KEYWORD_SORT_FIELDS
Maximum number of keyword sort fields allowed when pushing TopN to Lucene. Sorting on many keyword fields in Lucene can be expensive. When exceeded, the sort falls back to the compute engine. -
SOURCE_RESERVATION_FACTOR
Multiplier applied to lastKnownSourceSize to pre-reserve memory on the circuit breaker before loading _source. The source loading path creates large untracked allocations: scratch buffers, SourceFilter.filterBytes(), and JSON parsing. A heap dump has shown about 8x the actual source size in untracked memory at peak; 10x is an overestimate to prevent crashes. -
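The pre-reservation arithmetic is a straightforward multiply; a minimal sketch, assuming a last known source size and the factor setting. The method name is an illustrative stand-in, not a real circuit-breaker API.

```java
// Sketch: bytes to pre-reserve on the circuit breaker before loading
// _source, per the factor described above. Illustrative only.
public class SourceReservationSketch {
    static long reservationBytes(long lastKnownSourceSize, double sourceReservationFactor) {
        // Over-reserve: peak untracked memory was measured around 8x the
        // source size, so a 10x factor leaves headroom.
        return (long) (lastKnownSourceSize * sourceReservationFactor);
    }

    public static void main(String[] args) {
        // 64kb of source with the 10x factor reserves 640kb.
        System.out.println(reservationBytes(64 * 1024, 10.0)); // 655360
    }
}
```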
BYTES_REF_RAM_OVERESTIMATE_THRESHOLD
When a BytesRefArrayVector's average value length exceeds this size, the RAM estimate is multiplied by BYTES_REF_RAM_OVERESTIMATE_FACTOR to account for untracked overhead in large byte arrays. The untracked overhead may come from loading large text fields from _source. -
BYTES_REF_RAM_OVERESTIMATE_FACTOR
Multiplier applied to the RAM estimate of a BytesRefArrayVector whose backing values exceed BYTES_REF_RAM_OVERESTIMATE_THRESHOLD. -
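How the threshold and factor settings above could interact can be sketched as follows; the method and its parameters are hypothetical, not the actual vector accounting code.

```java
// Sketch: inflate the tracked RAM estimate of a bytes-ref vector when the
// average value is large, per the two settings above. Illustrative only.
public class BytesRefRamSketch {
    static long adjustedRamBytes(long trackedBytes, long valueCount,
                                 long thresholdBytes, double factor) {
        long avgValueLength = valueCount == 0 ? 0 : trackedBytes / valueCount;
        if (avgValueLength > thresholdBytes) {
            // Large values carry untracked overhead (e.g. from _source
            // loading), so the tracked figure underestimates real usage.
            return (long) (trackedBytes * factor);
        }
        return trackedBytes;
    }

    public static void main(String[] args) {
        // Avg value ~100kb exceeds a 16kb threshold: estimate is tripled.
        System.out.println(adjustedRamBytes(10_000_000, 100, 16_384, 3.0)); // 30000000
        // Avg value 100 bytes: estimate unchanged.
        System.out.println(adjustedRamBytes(10_000, 100, 16_384, 3.0));     // 10000
    }
}
```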
DOC_SEQUENCE_BYTES_REF_FIELD_THRESHOLD
When loading from a multi-leaf doc vector that maps to a single shard and segment, the reader switches to a doc-sequential iteration order if the number of BYTES_REF fields exceeds this threshold. The doc-sequential path avoids the expensive backwards reorder and supports partial-page splitting bounded by jumboBytes. -
DEFAULTS
Defaults.
-
-
Constructor Details
-
PlannerSettings
public PlannerSettings(DataPartitioning defaultDataPartitioning, int docsThresholdForAutoPartitioning, ByteSizeValue valuesLoadingJumboSize, int luceneTopNLimit, ByteSizeValue intermediateLocalRelationMaxSize, int partialEmitKeysThreshold, double partialEmitUniquenessThreshold, int reuseColumnLoadersThreshold, ByteSizeValue blockLoaderSizeOrdinals, ByteSizeValue blockLoaderSizeScript, int maxKeywordSortFields, double sourceReservationFactor, ByteSizeValue bytesRefRamOverestimateThreshold, double bytesRefRamOverestimateFactor, int docSequenceBytesRefFieldThreshold) Create.
-
-
Method Details
-
settings
-
defaultDataPartitioning
-
defaultDataPartitioning
-
valuesLoadingJumboSize
-
valuesLoadingJumboSize
-
luceneTopNLimit
-
luceneTopNLimit
public int luceneTopNLimit()
Maximum LIMIT that we're willing to push to Lucene's topn. Lucene's topn code was designed for search, which typically fetches 10 or 30 or 50 or 100 or 1000 documents. That's as many as you want on a page, and that's what it's designed for. But if you go to, say, page 10, Lucene implements this as a search for page_size * page_number docs and then materializes only the last page_size documents. Traditionally, Elasticsearch limits that page_size * page_number, which it calls the "result window". So! ESQL defaults to the same default: 10,000. -
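The deep-pagination arithmetic above, as a small worked example; the numbers and method name are illustrative.

```java
// Sketch: the "result window" math behind the topn limit described above.
public class ResultWindowSketch {
    // Lucene collects page_size * page_number docs to serve one deep page,
    // then materializes only the last page_size of them.
    static int docsCollectedForPage(int pageSize, int pageNumber) {
        return pageSize * pageNumber;
    }

    public static void main(String[] args) {
        int window = docsCollectedForPage(100, 10); // page 10, 100 docs per page
        System.out.println(window);                 // 1000 docs collected, 100 materialized
        // Within ESQL's default 10,000 cap on what it pushes to Lucene's topn:
        System.out.println(window <= 10_000);       // true
    }
}
```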
intermediateLocalRelationMaxSize
public PlannerSettings intermediateLocalRelationMaxSize(ByteSizeValue intermediateLocalRelationMaxSize) -
intermediateLocalRelationMaxSize
-
partialEmitKeysThreshold
-
partialEmitKeysThreshold
public int partialEmitKeysThreshold() -
partialEmitUniquenessThreshold
-
partialEmitUniquenessThreshold
public double partialEmitUniquenessThreshold() -
reuseColumnLoadersThreshold
-
reuseColumnLoadersThreshold
public int reuseColumnLoadersThreshold()
If we're loading more than this many fields at a time we discard column loaders after each page regardless of whether we can reuse them. They have significant per-field memory overhead, so discarding them between pages allows some queries that would have OOMed to succeed. Usually the paths that need very high performance don't load more than a handful of fields at a time, so they do reuse loaders. -
blockLoaderSizeOrdinals
-
blockLoaderSizeOrdinals
Circuit breaker space reserved for each ordinals BlockLoader.Reader. -
blockLoaderSizeScript
-
blockLoaderSizeScript
Circuit breaker space reserved for each script BlockLoader.Reader. -
maxKeywordSortFields
-
maxKeywordSortFields
public int maxKeywordSortFields() -
sourceReservationFactor
-
sourceReservationFactor
public double sourceReservationFactor() -
bytesRefRamOverestimateThreshold
public PlannerSettings bytesRefRamOverestimateThreshold(ByteSizeValue bytesRefRamOverestimateThreshold) -
bytesRefRamOverestimateThreshold
-
bytesRefRamOverestimateFactor
-
bytesRefRamOverestimateFactor
public double bytesRefRamOverestimateFactor() -
docSequenceBytesRefFieldThreshold
-
docSequenceBytesRefFieldThreshold
public int docSequenceBytesRefFieldThreshold() -
docsThresholdForAutoPartitioning
-
docsThresholdForAutoPartitioning
public int docsThresholdForAutoPartitioning()
-