All Superinterfaces:
org.apache.lucene.util.Accountable, AutoCloseable, BlockLoader.Block, Closeable, org.elasticsearch.core.RefCounted, org.elasticsearch.core.Releasable, Writeable
All Known Subinterfaces:
AggregateMetricDoubleBlock, BooleanBlock, BytesRefBlock, DoubleBlock, ExponentialHistogramBlock, FloatBlock, HistogramBlock, IntBlock, LongBlock, LongRangeBlock, TDigestBlock
All Known Implementing Classes:
AbstractArrowBufBlock, AbstractDelegatingCompoundBlock, AggregateMetricDoubleArrayBlock, BooleanArrayBlock, BooleanArrowBufBlock, BooleanBigArrayBlock, BooleanVectorBlock, BytesRefArrayBlock, BytesRefArrowBufBlock, BytesRefVectorBlock, CompositeBlock, ConstantNullBlock, DocBlock, DoubleArrayBlock, DoubleArrowBufBlock, DoubleBigArrayBlock, DoubleVectorBlock, Float16ArrowBufBlock, FloatArrayBlock, FloatArrowBufBlock, FloatBigArrayBlock, FloatVectorBlock, Int16ArrowBufBlock, Int8ArrowBufBlock, IntArrayBlock, IntArrowBufBlock, IntBigArrayBlock, IntVectorBlock, LongArrayBlock, LongArrowBufBlock, LongBigArrayBlock, LongMul1kArrowBufBlock, LongRangeArrayBlock, LongVectorBlock, OrdinalBytesRefBlock, TDigestArrayBlock, UInt16ArrowBufBlock, UInt32ArrowBufBlock, UInt8ArrowBufBlock

public interface Block extends org.apache.lucene.util.Accountable, BlockLoader.Block, Writeable, org.elasticsearch.core.RefCounted, org.elasticsearch.core.Releasable
A Block is a columnar representation of homogeneous data. It has a position (row) count, and various data retrieval methods for accessing the underlying data that is stored at a given position (IntBlock.getInt(int), LongBlock.getLong(int), BytesRefBlock.getBytesRef(int, org.apache.lucene.util.BytesRef)).

Reading

The usual way to read a block looks like:


 for (int p = 0; p < block.getPositionCount(); p++) {
     int count = block.getValueCount(p);
     switch (count) {
         case 0 -> {
             // Do stuff for nulls
         }
         case 1 -> {
             // Do stuff with single valued data
             int v = block.getInt(block.getFirstValueIndex(p));
             // ...
         }
         default -> {
             // Do stuff with multi-valued data
             int first = block.getFirstValueIndex(p);
             int end = first + count;
             for (int i = first; i < end; i++) {
                 int v = block.getInt(i);
             }
         }
     }
 }
 

But that's a ton of work! It's quite common that the Block itself represents dense data. In that case, asVector() returns a non-null Vector, which is much faster and easier to iterate. So generally you'll see code like:


 IntVector vector = block.asVector();
 if (vector == null) {
     // iterate the Block as above
 } else {
     // iterate the Vector as documented in Vector
 }
 

Reference counted

Blocks are reference counted. The JVM manages the objects themselves and garbage-collects a Block once nothing points to it, but we also maintain a reference count so we can decrement a CircuitBreaker when we no longer reference the Block.

When you build a Block its reference counter is set to 1. If you want to return a few copies of the Block in the same Page you should RefCounted.incRef() it.

When a Block is unused its refs must be decremented to 0. You do that with RefCounted.decRef() or Releasable.close(). Those two methods do the same thing, but folks generally use Block in try-with-resources like:


 try (
     IntBlock lhs = lhsEval.eval(page);
     IntBlock rhs = rhsEval.eval(page);
     IntBlock.Builder builder = blockFactory.newIntBlockBuilder(lhs.getPositionCount());
 ) {
     for (int p = 0; p < lhs.getPositionCount(); p++) {
         // do stuff
     }
     return builder.build();
 }
 

lhsEval.eval(page) returns a Block. If it builds the Block on the fly it'll have a reference count of 1 and the Releasable.close() called by the try-with-resources will discard it. If it is read from the Page then the read process will RefCounted.incRef() and the Releasable.close() will just decrement the counter.
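
The lifecycle described above can be modeled without any Elasticsearch classes. What follows is a minimal, single-threaded sketch and not the real implementation: `RefCountSketch`, `TrackedBlock`, and the `AtomicLong` that stands in for a CircuitBreaker are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a reference-counted Block; not the Elasticsearch implementation.
final class RefCountSketch {
    // Stands in for the bytes tracked by a CircuitBreaker (assumption for illustration).
    static final AtomicLong trackedBytes = new AtomicLong();

    static final class TrackedBlock implements AutoCloseable {
        private final long bytes;
        private int refs = 1; // a freshly built block starts with one reference

        TrackedBlock(long bytes) {
            this.bytes = bytes;
            trackedBytes.addAndGet(bytes);
        }

        void incRef() {
            if (refs <= 0) {
                throw new IllegalStateException("already released");
            }
            refs++;
        }

        // Returns true when this call released the last reference.
        boolean decRef() {
            if (refs <= 0) {
                throw new IllegalStateException("already released");
            }
            refs--;
            if (refs == 0) {
                trackedBytes.addAndGet(-bytes); // hand the memory back to the breaker
                return true;
            }
            return false;
        }

        boolean isReleased() {
            return refs == 0;
        }

        @Override
        public void close() {
            decRef();
        }
    }

    public static void main(String[] args) {
        TrackedBlock block = new TrackedBlock(64);
        block.incRef();                // a second holder of the same block
        try (TrackedBlock b = block) { // close() here only decrements the counter
            System.out.println("released inside try: " + b.isReleased());
        }
        block.close();                 // last reference: the tracked bytes are returned
        System.out.println("released: " + block.isReleased() + ", tracked bytes: " + trackedBytes.get());
    }
}
```

In this model the first `close()` leaves the block usable because another holder still has a reference; only the last decrement returns the tracked bytes.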

Thread safety

Blocks are immutable.

Blocks can be passed between threads as long as no two threads hold a reference to the Block at the same time. That's important because Driver can shift from thread to thread while it is running.

To pass a Block to another Driver, you must first call allowPassingToDifferentDriver().

  • Field Details

    • ESQL_AGGREGATE_METRIC_DOUBLE_BLOCK

      static final TransportVersion ESQL_AGGREGATE_METRIC_DOUBLE_BLOCK
    • MAX_LOOKUP

      static final long MAX_LOOKUP
      The maximum number of values that can be added to one position via lookup. TODO maybe make this everywhere?
    • PAGE_MEM_OVERHEAD_PER_BLOCK

      static final int PAGE_MEM_OVERHEAD_PER_BLOCK
      We do not track memory for pages directly (only for single blocks), but the page memory overhead can still be significant, especially for pages containing thousands of blocks. For now, we approximate this overhead, per block, using this value. The exact overhead per block would be (more correctly) RamUsageEstimator.NUM_BYTES_OBJECT_REF, but we approximate it with RamUsageEstimator.NUM_BYTES_OBJECT_ALIGNMENT to avoid further alignments to object size (at the end of the alignment, it would make no practical difference). We multiply it by 4 based on experiments with many small pages.
    • SERIALIZE_BLOCK_VALUES

      static final byte SERIALIZE_BLOCK_VALUES
      Serialization type for blocks: 0 and 1 replace false/true used in pre-8.14
    • SERIALIZE_BLOCK_VECTOR

      static final byte SERIALIZE_BLOCK_VECTOR
    • SERIALIZE_BLOCK_ARRAY

      static final byte SERIALIZE_BLOCK_ARRAY
    • SERIALIZE_BLOCK_BIG_ARRAY

      static final byte SERIALIZE_BLOCK_BIG_ARRAY
    • SERIALIZE_BLOCK_ORDINAL

      static final byte SERIALIZE_BLOCK_ORDINAL
  • Method Details

    • asVector

      Vector asVector()
      Returns an efficient dense single-value view of this block. Null, if the block is not dense single-valued. That is, if mayHaveNulls returns true, or getTotalValueCount is not equal to getPositionCount.
      Returns:
      an efficient dense single-value view of this block
    • getPositionCount

      int getPositionCount()
      Returns the number of positions (rows) in this block. See class javadoc for the usual way to iterate these positions.
      Returns:
      the number of positions (rows) in this block
    • getFirstValueIndex

      int getFirstValueIndex(int position)
      Returns the index of the first value for the given position. See class javadoc for the usual way to iterate these positions.

      For densely packed data this will return its parameter unchanged. For fields with null values or multivalued fields, this will shift. Here's an example:

      
           0   <---+
           1       | Values at first position
           2       |
           3   <---+
           5   <---- Value at second position
           6   <---+ Values at third position
           7   <---+
       

      This represents three rows. The first has the value [0, 1, 2, 3]. The second has the value 5. The third has the value [6, 7]. This method will return 0 for the first position, 4 for the second, and 5 for the third.

      Returns:
      the index of the first value for the given position
    • getValueCount

      int getValueCount(int position)
      Returns the number of values for the given position. See class javadoc for the usual way to iterate these positions.

      For densely packed data this will return 1. For nulls this will return 0. For multivalued fields, this will return the number of values. Here's an example:

      
           0   <---+
           1       | Values at first position
           2       |
           3   <---+
           5   <---- Value at second position
           6   <---+ Values at third position
           7   <---+
       

      This represents three rows. The first has the value [0, 1, 2, 3]. The second has the value 5. The third has the value [6, 7]. This method will return 4 for the first position, 1 for the second, and 2 for the third.

      Returns:
      the number of values for the given position
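
The relationship between getFirstValueIndex(int) and getValueCount(int) in the example above can be reproduced with two plain arrays. This is only a sketch of one plausible layout: `BlockLayoutSketch` and its field names are hypothetical, and the real implementing classes may store their offsets differently.

```java
// Hypothetical model of a multivalued int block; names and layout are illustrative only.
final class BlockLayoutSketch {
    // All values stored back to back, in the same order as the diagram.
    static final int[] VALUES = {0, 1, 2, 3, 5, 6, 7};
    // FIRST_VALUE_INDEXES[p] is where position p starts; the extra last entry marks the end.
    static final int[] FIRST_VALUE_INDEXES = {0, 4, 5, 7};

    static int getPositionCount() {
        return FIRST_VALUE_INDEXES.length - 1;
    }

    static int getFirstValueIndex(int position) {
        return FIRST_VALUE_INDEXES[position];
    }

    static int getValueCount(int position) {
        return FIRST_VALUE_INDEXES[position + 1] - FIRST_VALUE_INDEXES[position];
    }

    // Sums the values at one position using the same loop shape as the class-level example.
    static int sumAt(int position) {
        int first = getFirstValueIndex(position);
        int end = first + getValueCount(position);
        int sum = 0;
        for (int i = first; i < end; i++) {
            sum += VALUES[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int p = 0; p < getPositionCount(); p++) {
            System.out.println(
                "position " + p + ": first=" + getFirstValueIndex(p) + " count=" + getValueCount(p) + " sum=" + sumAt(p)
            );
        }
    }
}
```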
    • getTotalValueCount

      int getTotalValueCount()
      Returns the total number of values in this block not counting nulls. This powers the COUNT aggregation and is used to report the number of fields loaded by ESQL.
      Returns:
      the total number of values in this block not counting nulls
    • elementType

      ElementType elementType()
      Returns the element type of this block.
      Returns:
      the element type of this block
    • blockFactory

      BlockFactory blockFactory()
      The block factory associated with this block.
    • allowPassingToDifferentDriver

      void allowPassingToDifferentDriver()
      Before passing a Block to another Driver, it is necessary to switch the owning block factory to its parent, which is associated with the global circuit breaker. This ensures that when the new driver releases this Block, it returns memory directly to the parent block factory instead of the local block factory of this Block. This is important because the local block factory is not thread safe and doesn't support simultaneous access by more than one thread.
    • isReleased

      boolean isReleased()
      Tells if this block has been released. A block is released by calling its Releasable.close() or RefCounted.decRef() methods.
      Returns:
      true iff the block's reference count is zero.
    • isNull

      boolean isNull(int position)
      Parameters:
      position - the position
      Returns:
      true if the value stored at the given position is null, false otherwise
    • mayHaveNulls

      boolean mayHaveNulls()
      Returns:
      true if some values might be null. False, if all values are guaranteed to be not null.
    • areAllValuesNull

      boolean areAllValuesNull()
      Returns:
      true if all values in this block are guaranteed to be null.
    • mayHaveMultivaluedFields

      boolean mayHaveMultivaluedFields()
      Can this block have multivalued fields? Blocks that return false will never return more than one from getValueCount(int). This may return true for Blocks that do not have multivalued fields, but it will always answer quickly.
    • doesHaveMultivaluedFields

      boolean doesHaveMultivaluedFields()
      Does this block have multivalued fields? Unlike mayHaveMultivaluedFields() this will never return a false positive. In other words, if this returns true then there are positions for which getValueCount(int) will return more than 1. This will answer quickly if it can but may have to check all positions.
    • filter

      Block filter(boolean mayContainDuplicates, int... positions)
      Creates a new block that only exposes the positions provided.
      Parameters:
      mayContainDuplicates - may the positions array contain duplicate positions?
      positions - the positions to retain
      Returns:
      a filtered block
    • keepMask

      Block keepMask(BooleanVector mask)
      Build a Block with the same values as this Block, but replacing all values for which mask.getBoolean(position) returns false with null. The mask vector must have at least as many positions as this Block.
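
The masking behavior can be illustrated with plain collections. In this sketch, which is not the real implementation, a position is modeled as a list of values and a null list element stands for a null position; `KeepMaskSketch` is a hypothetical name and a `boolean[]` stands in for the BooleanVector.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of keepMask over list-based positions; not the Elasticsearch implementation.
final class KeepMaskSketch {
    static List<List<Integer>> keepMask(List<List<Integer>> block, boolean[] mask) {
        if (mask.length < block.size()) {
            throw new IllegalArgumentException("mask must cover every position of the block");
        }
        List<List<Integer>> result = new ArrayList<>(block.size());
        for (int p = 0; p < block.size(); p++) {
            // Keep the position when the mask is true, otherwise replace it with null.
            result.add(mask[p] ? block.get(p) : null);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> block = List.of(List.of(1), List.of(2, 3), List.of(4));
        System.out.println(keepMask(block, new boolean[] {true, false, true}));
    }
}
```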
    • lookup

      org.elasticsearch.core.ReleasableIterator<? extends Block> lookup(IntBlock positions, ByteSizeValue targetBlockSize)
      Builds an Iterator of new Blocks with the same elementType() as this Block whose values are copied from positions in this Block. It has the same number of positions as the positions parameter.

      For example, if this block contained [a, b, [b, c]] and were called with the block [0, 1, 1, [1, 2]] then the result would be [a, b, b, [b, b, c]].

      This process produces count(this) * count(positions) values per position which could be quite large. Instead of returning a single Block, this returns an Iterator of Blocks containing all of the promised values.

      The returned ReleasableIterator may retain a reference to the positions parameter. Close it to release those references.

      The returned Blocks are built using the same BlockFactory as was used to build the positions parameter.
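
The worked example for lookup can be checked with a small list-based model. This is a sketch under simplifying assumptions: `LookupSketch` is a hypothetical name, every position is a list of values, the whole result is returned at once instead of through an iterator, and targetBlockSize, reference counting, and out-of-range positions are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of lookup over list-based positions; not the Elasticsearch implementation.
final class LookupSketch {
    static List<List<String>> lookup(List<List<String>> block, List<List<Integer>> positions) {
        List<List<String>> result = new ArrayList<>(positions.size());
        for (List<Integer> requested : positions) {
            // One output position per input position, holding every value the input points at.
            List<String> copied = new ArrayList<>();
            for (int position : requested) {
                copied.addAll(block.get(position));
            }
            result.add(copied);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> block = List.of(List.of("a"), List.of("b"), List.of("b", "c"));
        List<List<Integer>> positions = List.of(List.of(0), List.of(1), List.of(1), List.of(1, 2));
        System.out.println(lookup(block, positions));
    }
}
```

The last output position shows why the result can grow quickly: a multivalued entry in positions that points at multivalued rows concatenates all of them.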

    • mvOrdering

      Block.MvOrdering mvOrdering()
      How are multivalued fields ordered?
    • mvDeduplicated

      default boolean mvDeduplicated()
      Are multivalued fields de-duplicated in each position?
    • mvSortedAscending

      default boolean mvSortedAscending()
      Are multivalued fields sorted ascending in each position?
    • slice

      Block slice(int beginInclusive, int endExclusive)
      Return a subset of this Block from position beginInclusive to position endExclusive. This may return the same instance if the range covers all positions, but if it does it will RefCounted.incRef() it.

      NOTE: Implementations will not try to optimize zero length slices as we expect them to be rare.

    • expand

      Block expand()
      Expand multivalued fields into one row per value. Returns the same block if there aren't any multivalued fields to expand. The returned block needs to be closed by the caller to release the block's resources.
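
A list-based sketch may help picture what expansion does. `ExpandSketch` is a hypothetical name, reference counting is left out, and keeping a position without values as a single empty row is an assumption made for this illustration rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of expand over list-based positions; not the Elasticsearch implementation.
final class ExpandSketch {
    static List<List<Integer>> expand(List<List<Integer>> block) {
        boolean multivalued = false;
        for (List<Integer> position : block) {
            if (position.size() > 1) {
                multivalued = true;
                break;
            }
        }
        if (!multivalued) {
            return block; // nothing to expand, so the same instance is returned
        }
        List<List<Integer>> expanded = new ArrayList<>();
        for (List<Integer> position : block) {
            if (position.isEmpty()) {
                expanded.add(position); // assumption: a position without values stays one row
            } else {
                for (Integer value : position) {
                    expanded.add(List.of(value)); // one row per value
                }
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        System.out.println(expand(List.of(List.of(1, 2), List.of(), List.of(3))));
    }
}
```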
    • insertNulls

      default Block insertNulls(IntVector before)
      Build a Block with a null inserted before each listed position.

      Note: before must be non-decreasing.

    • deepCopy

      Block deepCopy(BlockFactory blockFactory)
      Make a deep copy of this Block using the provided BlockFactory, likely copying all data.
    • writeTo

      void writeTo(StreamOutput out) throws IOException
      Writes only the data of the block to a stream output. This method should be used when the type of the block is known during reading.
      Specified by:
      writeTo in interface Writeable
      Throws:
      IOException
    • writeTypedBlock

      static void writeTypedBlock(Block block, StreamOutput out) throws IOException
      Writes the type of the block followed by the block data to a stream output. This should be paired with readTypedBlock(BlockStreamInput).
      Throws:
      IOException
    • readTypedBlock

      static Block readTypedBlock(BlockStreamInput in) throws IOException
      Reads the block type and then the block data from a stream input. This should be paired with writeTypedBlock(Block, StreamOutput).
      Throws:
      IOException
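
The pairing of writeTypedBlock and readTypedBlock follows a common pattern: write a type tag first, then the payload, so the reader can pick the right decoder. The sketch below shows that pattern with `java.io` streams only. `TypedWriteSketch`, its tag values, and its wire layout are hypothetical and do not reflect the actual Elasticsearch wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical type-tagged serialization; tags and layout are assumptions for illustration.
final class TypedWriteSketch {
    static final byte TYPE_INT = 0;
    static final byte TYPE_LONG = 1;

    // Writes the type tag, then the length, then the values.
    static void writeTyped(Object values, DataOutputStream out) throws IOException {
        if (values instanceof int[]) {
            int[] ints = (int[]) values;
            out.writeByte(TYPE_INT);
            out.writeInt(ints.length);
            for (int v : ints) {
                out.writeInt(v);
            }
        } else if (values instanceof long[]) {
            long[] longs = (long[]) values;
            out.writeByte(TYPE_LONG);
            out.writeInt(longs.length);
            for (long v : longs) {
                out.writeLong(v);
            }
        } else {
            throw new IllegalArgumentException("unsupported type " + values.getClass());
        }
    }

    // Reads the type tag first and uses it to choose how to decode the payload.
    static Object readTyped(DataInputStream in) throws IOException {
        byte type = in.readByte();
        int length = in.readInt();
        switch (type) {
            case TYPE_INT: {
                int[] ints = new int[length];
                for (int i = 0; i < length; i++) {
                    ints[i] = in.readInt();
                }
                return ints;
            }
            case TYPE_LONG: {
                long[] longs = new long[length];
                for (int i = 0; i < length; i++) {
                    longs[i] = in.readLong();
                }
                return longs;
            }
            default:
                throw new IOException("unknown type tag " + type);
        }
    }

    // Serializes to a byte array and reads the value back.
    static Object roundTrip(Object values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeTyped(values, out);
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return readTyped(in);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString((long[]) roundTrip(new long[] {7L, 8L})));
    }
}
```

When the reader already knows the type, the tag is redundant, which is the case the untagged writeTo(StreamOutput) is described as covering.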
    • supportsAggregateMetricDoubleBlock

      static boolean supportsAggregateMetricDoubleBlock(TransportVersion version)