Package org.elasticsearch.index.codec.tsdb.pipeline


Composable encoding pipeline for TSDB numeric doc values.

A pipeline is an ordered sequence of transform stages followed by a terminal payload stage. Transform stages (delta, offset, GCD) reduce value entropy; the payload stage (bit-packing) serializes the result.
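
To make the effect of the transform stages concrete, the following is a minimal sketch and not the package's implementation. The class TransformSketch and all of its methods are hypothetical names introduced for illustration. It assumes that delta keeps the first value in place, that offset subtracts the block minimum, and that GCD divides by the greatest common divisor of non-negative values.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the Elasticsearch implementation.
final class TransformSketch {

    // Delta: keep the first value, replace every later value with the
    // difference from its predecessor.
    static long[] delta(long[] values) {
        long[] out = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = i == 0 ? values[0] : values[i] - values[i - 1];
        }
        return out;
    }

    // Offset: subtract the minimum so the smallest value becomes zero.
    static long[] offset(long[] values) {
        long min = Arrays.stream(values).min().orElse(0L);
        return Arrays.stream(values).map(v -> v - min).toArray();
    }

    // GCD: divide every value by the greatest common divisor.
    // Assumes non-negative input, as produced by offset().
    static long[] gcd(long[] values) {
        long g = 0;
        for (long v : values) {
            g = gcdOf(g, v);
        }
        final long divisor = g == 0 ? 1 : g; // all-zero block: nothing to divide
        return Arrays.stream(values).map(v -> v / divisor).toArray();
    }

    // Euclid's algorithm; gcdOf(0, x) == x.
    static long gcdOf(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Bits a bit-packer would need per value (non-negative values assumed).
    static int bitsRequired(long[] values) {
        long or = 0;
        for (long v : values) {
            or |= v;
        }
        return 64 - Long.numberOfLeadingZeros(or);
    }
}
```

Under these assumptions, the input {1000, 1010, 1020, 1040, 1050} becomes {99, 0, 0, 1, 0} after delta, offset, and GCD, and the width needed per value drops from 11 bits to 7.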

Stage types

StageSpec is a sealed hierarchy with two markers: TransformSpec for chainable transform stages and PayloadSpec for terminal stages that serialize values to bytes. Each stage type has a persisted byte identifier (StageId) that serves as the wire-format contract.
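
The shape of such a hierarchy can be sketched as follows. This is an illustration under assumptions, not the actual source: the nested type names mirror the ones mentioned above, but the record definitions, the stageId() accessor, the byte values, and the isWellFormed helper are all hypothetical. It requires Java 17 or later for sealed types.

```java
import java.util.List;

// Hypothetical sketch of a sealed stage hierarchy with two marker interfaces.
final class StageSketch {

    sealed interface StageSpec permits TransformSpec, PayloadSpec {
        byte stageId(); // persisted identifier; values below are placeholders
    }

    // Marker for chainable transform stages.
    sealed interface TransformSpec extends StageSpec permits Delta, Offset, Gcd {}

    // Marker for terminal stages that serialize values to bytes.
    sealed interface PayloadSpec extends StageSpec permits BitPack {}

    record Delta() implements TransformSpec {
        public byte stageId() { return 1; }
    }

    record Offset() implements TransformSpec {
        public byte stageId() { return 2; }
    }

    record Gcd() implements TransformSpec {
        public byte stageId() { return 3; }
    }

    record BitPack() implements PayloadSpec {
        public byte stageId() { return 10; }
    }

    // A pipeline is well formed when every stage but the last is a transform
    // and the last stage is a payload stage.
    static boolean isWellFormed(List<StageSpec> stages) {
        if (stages.isEmpty()) {
            return false;
        }
        for (int i = 0; i < stages.size() - 1; i++) {
            if (!(stages.get(i) instanceof TransformSpec)) {
                return false;
            }
        }
        return stages.get(stages.size() - 1) instanceof PayloadSpec;
    }
}
```

Because the hierarchy is sealed, the compiler knows every permitted stage type, which is what lets a fixed byte identifier per type act as a stable contract.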

Construction vs wire format

PipelineConfig provides a type-safe fluent builder for assembling pipelines at construction time. PipelineDescriptor is the compact wire-format representation (stage IDs, block size, data type) persisted in the metadata file, enabling per-field codec selection. The two are deliberately separate: PipelineConfig carries rich stage specifications (StageSpec) while PipelineDescriptor stores only the byte identifiers needed for decoding. FieldDescriptor wraps the descriptor with a format version for segment compatibility.
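
The split between a rich configuration and a compact descriptor can be illustrated with a small round trip. The sketch below is an assumption-laden example: DescriptorSketch, its field order, and its byte layout (data type, block size, stage count, stage IDs) are invented for illustration and are not the format PipelineDescriptor actually persists.

```java
import java.nio.ByteBuffer;

// Hypothetical compact descriptor: only what a decoder needs to rebuild the pipeline.
final class DescriptorSketch {
    final byte[] stageIds; // one persisted byte per stage, in pipeline order
    final int blockSize;   // values per block
    final byte dataType;   // placeholder code for the value type

    DescriptorSketch(byte[] stageIds, int blockSize, byte dataType) {
        this.stageIds = stageIds.clone();
        this.blockSize = blockSize;
        this.dataType = dataType;
    }

    // Assumed layout: [dataType:1][blockSize:4][stageCount:1][stageIds:n]
    byte[] toBytes() {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + 1 + stageIds.length);
        buf.put(dataType).putInt(blockSize).put((byte) stageIds.length).put(stageIds);
        return buf.array();
    }

    static DescriptorSketch fromBytes(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte dataType = buf.get();
        int blockSize = buf.getInt();
        byte[] ids = new byte[buf.get()];
        buf.get(ids);
        return new DescriptorSketch(ids, blockSize, dataType);
    }
}
```

In this sketch a four-stage pipeline with a block size of 128 serializes to 10 bytes, which suggests why storing a descriptor per field is cheap.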

Encoding and decoding

BlockFormat is the block-level entry point for writing and reading encoded values. EncodingContext and DecodingContext carry per-block state (bitmap, value count) and provide metadata I/O via MetadataWriter / MetadataReader, decoupling individual stages from the block layout.

Block layout

   [bitmap][payload][stage metadata (reverse stage order)]
 

Metadata is written after the payload so the decoder can read every section in a single forward pass with no seeking or buffering. See BlockFormat for details.
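
One way to read this layout is that the decoder undoes stages in the opposite order from which they were applied, so storing metadata in reverse stage order lets it consume each entry exactly when it is needed. The sketch below illustrates that reading with two stages (offset, then GCD). It is a simplified assumption, not BlockFormat itself: BlockLayoutSketch is a hypothetical class, the bitmap is assumed to hold one bit per applied stage, the payload is written as a count plus unpacked longs instead of bit-packed data, and input values are assumed small enough that subtraction does not overflow.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of [bitmap][payload][stage metadata (reverse stage order)].
final class BlockLayoutSketch {

    static byte[] encode(long[] values) {
        long[] work = values.clone();
        Deque<Long> metadata = new ArrayDeque<>(); // stack: last stage ends up on top

        // Stage 1 (offset): subtract the minimum, remember it as metadata.
        long min = work.length == 0 ? 0 : Long.MAX_VALUE;
        for (long v : work) min = Math.min(min, v);
        for (int i = 0; i < work.length; i++) work[i] -= min;
        metadata.push(min);

        // Stage 2 (GCD): divide by the common divisor, remember it as metadata.
        long g = 0;
        for (long v : work) g = gcd(g, v);
        if (g == 0) g = 1;
        for (int i = 0; i < work.length; i++) work[i] /= g;
        metadata.push(g);

        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + 8 * work.length + 8 * metadata.size());
        buf.put((byte) 0b11);                  // bitmap: bit 0 = offset, bit 1 = GCD (assumed)
        buf.putInt(work.length);               // payload: count, then the values
        for (long v : work) buf.putLong(v);
        while (!metadata.isEmpty()) buf.putLong(metadata.pop()); // GCD first, then offset
        return buf.array();
    }

    static long[] decode(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block); // read strictly front to back
        byte bitmap = buf.get();
        long[] values = new long[buf.getInt()];
        for (int i = 0; i < values.length; i++) values[i] = buf.getLong();
        if ((bitmap & 0b10) != 0) {            // undo GCD first
            long g = buf.getLong();
            for (int i = 0; i < values.length; i++) values[i] *= g;
        }
        if ((bitmap & 0b01) != 0) {            // then undo offset
            long min = buf.getLong();
            for (int i = 0; i < values.length; i++) values[i] += min;
        }
        return values;
    }

    private static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }
}
```

Note that decode never rewinds or skips: by the time it reaches the metadata section it already holds the payload, and the first metadata entry belongs to the first stage it has to undo.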

Usage

Building a pipeline configuration:


 PipelineConfig config = PipelineConfig.forLongs(128)
     .delta()
     .offset()
     .gcd()
     .bitPack();