Package org.elasticsearch.index.codec.tsdb.pipeline
A pipeline is an ordered sequence of transform stages followed by a terminal payload stage. Transform stages (delta, offset, GCD) reduce value entropy; the payload stage (bit-packing) serializes the result.
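To make the entropy reduction concrete, here is a minimal, self-contained sketch of the three transforms applied in sequence, followed by a count of the bits a bit-packing payload would need per value. The class and method names (TransformChainSketch, delta, removeOffset, divideByGcd, bitsRequired) are hypothetical and are not part of this package. The sketch assumes non-negative inputs with no overflow, and it leaves the first value in place after the delta step rather than storing it as metadata.

```java
import java.util.Arrays;

// Illustrative only: not the package's implementation.
final class TransformChainSketch {

    // Delta: keep the first value, replace every later value with its
    // difference from the previous one. Iterating backwards lets us reuse
    // one array because out[i - 1] is still the original value.
    static long[] delta(long[] values) {
        long[] out = values.clone();
        for (int i = out.length - 1; i > 0; i--) {
            out[i] = out[i] - out[i - 1];
        }
        return out;
    }

    // Offset removal: subtract the minimum so the smallest entry becomes 0.
    static long[] removeOffset(long[] values) {
        long min = Arrays.stream(values).min().orElse(0L);
        return Arrays.stream(values).map(v -> v - min).toArray();
    }

    // GCD factoring: divide every entry by the greatest common divisor.
    static long[] divideByGcd(long[] values) {
        long g = 0;
        for (long v : values) {
            g = gcd(g, v);
        }
        if (g <= 1) {
            return values.clone(); // nothing to factor out
        }
        final long divisor = g;
        return Arrays.stream(values).map(v -> v / divisor).toArray();
    }

    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return Math.abs(a);
    }

    // Bits needed to represent the largest (non-negative) value.
    static int bitsRequired(long[] values) {
        long max = Arrays.stream(values).max().orElse(0L);
        return 64 - Long.numberOfLeadingZeros(max);
    }

    public static void main(String[] args) {
        long[] raw = { 1000, 1010, 1030, 1040, 1070 };
        long[] reduced = divideByGcd(removeOffset(delta(raw)));
        System.out.println(Arrays.toString(reduced));
        System.out.println(bitsRequired(raw) + " bits -> " + bitsRequired(reduced) + " bits");
    }
}
```

For this input the chain produces [99, 0, 1, 0, 2], so the widest value shrinks from 11 bits to 7. A real stage would also have to record the minimum and the divisor so the decoder can invert each step.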
Stage types
StageSpec is a sealed hierarchy with two markers: TransformSpec for chainable transform stages and PayloadSpec for terminal stages that serialize values to bytes. Each stage type has a persisted byte identifier (StageId) that serves as the wire-format contract.
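The shape described above can be sketched with Java 17 sealed interfaces. This is an assumption-laden illustration: the record names (Delta, Offset, Gcd, BitPack), the stageId() accessor, and the byte values are hypothetical, and the real StageId values are not shown in this document.

```java
// Illustrative only: names and byte values are hypothetical.
final class StageSpecSketch {

    // Persisted identifiers. Once written to disk these bytes must never
    // change meaning, which is why they act as the wire-format contract.
    enum StageId {
        DELTA((byte) 1), OFFSET((byte) 2), GCD((byte) 3), BIT_PACK((byte) 4);

        public final byte id;

        StageId(byte id) {
            this.id = id;
        }

        public static StageId fromId(byte id) {
            for (StageId s : values()) {
                if (s.id == id) {
                    return s;
                }
            }
            throw new IllegalArgumentException("Unknown stage id: " + id);
        }
    }

    // Root of the sealed hierarchy: every stage is either a transform or a payload.
    sealed interface StageSpec permits TransformSpec, PayloadSpec {
        StageId stageId();
    }

    // Marker for stages that can be chained.
    sealed interface TransformSpec extends StageSpec permits Delta, Offset, Gcd {}

    // Marker for the terminal stage that serializes values to bytes.
    sealed interface PayloadSpec extends StageSpec permits BitPack {}

    record Delta() implements TransformSpec {
        public StageId stageId() { return StageId.DELTA; }
    }

    record Offset() implements TransformSpec {
        public StageId stageId() { return StageId.OFFSET; }
    }

    record Gcd() implements TransformSpec {
        public StageId stageId() { return StageId.GCD; }
    }

    record BitPack() implements PayloadSpec {
        public StageId stageId() { return StageId.BIT_PACK; }
    }
}
```

Because the hierarchy is sealed, the compiler knows the full set of stage types, so code that dispatches on a StageSpec can be checked for exhaustiveness.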
Construction vs wire format
PipelineConfig provides a type-safe fluent builder for assembling pipelines at construction time. PipelineDescriptor is the compact wire-format representation (stage IDs, block size, data type) persisted in the metadata file, enabling per-field codec selection. The two are deliberately separate: PipelineConfig carries rich stage specifications (StageSpec) while PipelineDescriptor stores only the byte identifiers needed for decoding. FieldDescriptor wraps the descriptor with a format version for segment compatibility.
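As a rough illustration of why the descriptor can stay so small, the sketch below serializes a descriptor that holds only stage IDs, a block size, and a data type byte, wrapped in a versioned envelope. The byte layout, the field widths, and the toBytes/fromBytes helpers are assumptions made for this example; the actual on-disk encoding is not specified in this document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: the layout below is hypothetical.
final class DescriptorSketch {

    // Only what the decoder needs: which stages ran, on how many values, of what type.
    record PipelineDescriptor(byte[] stageIds, int blockSize, byte dataType) {
        void writeTo(DataOutput out) throws IOException {
            out.writeByte(stageIds.length); // stage count
            out.write(stageIds);            // one byte per stage
            out.writeInt(blockSize);
            out.writeByte(dataType);
        }

        static PipelineDescriptor readFrom(DataInput in) throws IOException {
            byte[] ids = new byte[in.readUnsignedByte()];
            in.readFully(ids);
            int blockSize = in.readInt();
            byte dataType = in.readByte();
            return new PipelineDescriptor(ids, blockSize, dataType);
        }
    }

    // Versioned envelope: the version is read first so a reader can reject
    // or adapt to layouts it does not understand.
    record FieldDescriptor(int formatVersion, PipelineDescriptor pipeline) {}

    static byte[] toBytes(FieldDescriptor fd) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(fd.formatVersion());
            fd.pipeline().writeTo(out);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static FieldDescriptor fromBytes(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int version = in.readInt();
            return new FieldDescriptor(version, PipelineDescriptor.readFrom(in));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With four stages this hypothetical layout takes 14 bytes per field, which is the point of keeping the rich StageSpec objects out of the persisted form.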
Encoding and decoding
BlockFormat is the block-level entry point for writing and reading encoded values. EncodingContext and DecodingContext carry per-block state (bitmap, value count) and provide metadata I/O via MetadataWriter / MetadataReader, decoupling individual stages from the block layout.
Block layout
[bitmap][payload][stage metadata (reverse stage order)]
Metadata is written after the payload so the decoder can read every section
in a single forward pass with no seeking or buffering. See
BlockFormat for details.
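The reverse ordering matters because the decoder must undo the stages last to first. The sketch below shows the idea with two stages (offset, then GCD) and a simplified layout: it omits the bitmap, stores raw 8-byte longs instead of a bit-packed payload, and writes the value count into the block itself. All of these are simplifications for illustration, and the class name BlockLayoutSketch is hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Illustrative only: simplified layout [count][payload][gcd metadata][offset metadata].
final class BlockLayoutSketch {

    static byte[] encode(long[] values) {
        long[] v = values.clone();

        // Stage 1: offset removal.
        long min = Arrays.stream(v).min().orElse(0L);
        for (int i = 0; i < v.length; i++) {
            v[i] -= min;
        }

        // Stage 2: GCD factoring (divisor 1 when every value is zero).
        long g = 0;
        for (long x : v) {
            g = gcd(g, x);
        }
        long divisor = g == 0 ? 1 : g;
        for (int i = 0; i < v.length; i++) {
            v[i] /= divisor;
        }

        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + v.length * Long.BYTES + 2 * Long.BYTES);
        buf.putInt(v.length);
        for (long x : v) {
            buf.putLong(x);      // payload
        }
        buf.putLong(divisor);    // metadata of the LAST stage comes first
        buf.putLong(min);        // metadata of the FIRST stage comes last
        return buf.array();
    }

    static long[] decode(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        long[] v = new long[buf.getInt()];
        for (int i = 0; i < v.length; i++) {
            v[i] = buf.getLong();
        }
        // Metadata arrives in exactly the order the stages are undone,
        // so the buffer is only ever read forwards.
        long divisor = buf.getLong();
        for (int i = 0; i < v.length; i++) {
            v[i] *= divisor;
        }
        long min = buf.getLong();
        for (int i = 0; i < v.length; i++) {
            v[i] += min;
        }
        return v;
    }

    private static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return Math.abs(a);
    }
}
```

If the metadata were written in stage order instead, the decoder would reach the offset before the divisor and would have to buffer it or seek past it.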
Usage
Building a pipeline configuration:
PipelineConfig config = PipelineConfig.forLongs(128)
    .delta()
    .offset()
    .gcd()
    .bitPack();
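One way a fluent builder can be made type-safe is to have transform methods return the builder while the payload method returns the finished configuration, so nothing can be chained after the terminal stage. The sketch below illustrates that pattern only; FluentBuilderSketch, LongBuilder, and Config are hypothetical names, stages are tracked as plain strings for brevity, and none of this is claimed to match how PipelineConfig is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a generic fluent-builder pattern, not the package's code.
final class FluentBuilderSketch {

    // Immutable result of the builder.
    record Config(int blockSize, List<String> stages) {}

    static final class LongBuilder {
        private final int blockSize;
        private final List<String> stages = new ArrayList<>();

        private LongBuilder(int blockSize) {
            this.blockSize = blockSize;
        }

        // Transform stages return the builder, so they can be chained freely.
        LongBuilder delta() { stages.add("delta"); return this; }
        LongBuilder offset() { stages.add("offset"); return this; }
        LongBuilder gcd() { stages.add("gcd"); return this; }

        // The payload stage returns Config rather than the builder, so a call
        // such as bitPack().delta() does not compile.
        Config bitPack() {
            stages.add("bitPack");
            return new Config(blockSize, List.copyOf(stages));
        }
    }

    static LongBuilder forLongs(int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive: " + blockSize);
        }
        return new LongBuilder(blockSize);
    }
}
```

Used the same way as the example above, FluentBuilderSketch.forLongs(128).delta().offset().gcd().bitPack() yields a Config with block size 128 and the four stages in call order.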
Class: Description
BlockFormat: Defines how each block of encoded values is written to the data file.
DecodingContext: Mutable per-block context for decoding, tracking the position bitmap and delegating metadata reads to the underlying DataInput.
EncodingContext: Mutable per-block context for encoding, tracking the position bitmap, metadata buffer, and position offsets.
FieldDescriptor: Versioned envelope for the PipelineDescriptor wire format.
MetadataReader: Reads stage metadata values from a buffer during decoding.
MetadataWriter: Writes stage metadata values to a buffer during encoding.
PipelineConfig: Immutable specification for a field's encoding pipeline.
Builder for long (integral) pipelines.
PipelineDescriptor: Persisted descriptor of a pipeline configuration, tracking stage IDs, block size, and data type.
The numeric data type stored in encoded blocks.
StageId: Unique byte identifiers for pipeline stages, persisted in the encoded data.
StageSpec: Sealed hierarchy capturing pipeline stage specifications.
Bit-packing payload: packs values using the minimum number of bits.
Delta encoding: stores differences between consecutive values.
GCD factoring: divides all values by their greatest common divisor.
Offset removal: subtracts the minimum value from all entries.
PayloadSpec: Marker for terminal payload stages that serialize values to bytes.
TransformSpec: Marker for transform stages that can be chained in the pipeline.