Class SchemaReconciliation.ColumnMapping

java.lang.Object
org.elasticsearch.xpack.esql.datasources.SchemaReconciliation.ColumnMapping
All Implemented Interfaces:
Writeable
Enclosing class:
SchemaReconciliation

public static final class SchemaReconciliation.ColumnMapping extends Object implements Writeable
Maps unified schema column positions to file-local column positions. Handles both planning-time use (with DataType references) and wire serialization (via Writeable), so there is no separate "wire" class.

Cast types are serialized via StreamOutput.writeEnum(E)/StreamInput.readEnum(java.lang.Class<E>) (ordinal-based). The ordinal mapping is pinned by an assertEnumSerialization test so any reordering or insertion is caught at test time.
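A minimal sketch of what pinning an ordinal mapping can look like. It does not reproduce the assertEnumSerialization test named above; CastType and ordinalsPinned are hypothetical names, and the four constants are placeholders rather than the real DataType values.

```java
import java.util.Map;

/** Hypothetical stand-in for the cast-type enum; the real enum has different constants. */
enum CastType { KEYWORD, LONG, DOUBLE, BOOLEAN }

final class OrdinalPinSketch {
    /** Returns true only if every constant still has the ordinal recorded in {@code expected}. */
    static <E extends Enum<E>> boolean ordinalsPinned(Class<E> type, Map<String, Integer> expected) {
        E[] constants = type.getEnumConstants();
        if (constants.length != expected.size()) {
            return false; // a constant was added or removed
        }
        for (E c : constants) {
            Integer pinned = expected.get(c.name());
            if (pinned == null || pinned != c.ordinal()) {
                return false; // a constant was renamed or reordered
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> pinned = Map.of("KEYWORD", 0, "LONG", 1, "DOUBLE", 2, "BOOLEAN", 3);
        System.out.println(ordinalsPinned(CastType.class, pinned));
    }
}
```

Because the wire format carries only the ordinal, a check of this kind is what turns a silent decoding mismatch into a test failure.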

Coordinator sharing: FileSplitProvider passes the same instance to all splits from the same file. A dedup cache ensures files with content-equal mappings share a single object. Duplication only occurs during wire serialization.
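The dedup idea can be sketched as an interning map keyed on content equality. This is an illustration only: Mapping and intern are hypothetical names, and the actual cache used by the coordinator may be structured differently.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class MappingDedupSketch {
    /** Hypothetical stand-in for ColumnMapping with content-based equals/hashCode. */
    static final class Mapping {
        final int[] localIndices;

        Mapping(int[] localIndices) {
            this.localIndices = localIndices.clone();
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Mapping && Arrays.equals(localIndices, ((Mapping) o).localIndices);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(localIndices);
        }
    }

    private final Map<Mapping, Mapping> cache = new HashMap<>();

    /** Returns the first instance seen with equal content; later duplicates can be discarded. */
    Mapping intern(Mapping candidate) {
        Mapping existing = cache.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }
}
```

Interning depends on equals and hashCode comparing content, which is consistent with this class overriding both.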

Wire-size analysis: for a file split into K chunks with N unified columns, the overhead is K * (4*N + N) = K * 5*N bytes when casts are present (4 bytes per column index plus one VInt ordinal per cast, which occupies a single byte while the ordinal is below 128), or K * 4*N bytes when no casts are needed. For typical schemas (N < 200) and split counts (K < 50), this stays under 50 KB total (at most 49 * 5 * 199 = 48,755 bytes), which is negligible next to the data payload. See CoalescedSplit for approaches to eliminate per-split duplication on the wire for very wide schemas.
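The arithmetic above can be checked with a few lines of Java. The method name overheadBytes is hypothetical, and the sketch assumes, as the formula does, one byte per cast ordinal and a cast entry for every column.

```java
final class WireSizeSketch {
    /** 4 bytes per column index, plus 1 byte per column when casts are present. */
    static long overheadBytes(int splits, int unifiedColumns, boolean hasCasts) {
        long perSplit = 4L * unifiedColumns + (hasCasts ? unifiedColumns : 0);
        return splits * perSplit;
    }

    public static void main(String[] args) {
        // Largest values allowed by N < 200 and K < 50.
        System.out.println(overheadBytes(49, 199, true));  // 49 * (4*199 + 199) = 48755
        System.out.println(overheadBytes(49, 199, false)); // 49 * 4*199 = 39004
    }
}
```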

Bitset alternative: the int[] currently encodes both "missing" (-1) and "local index" for present columns. A bitset could represent missing-column flags in ceil(N/8) bytes and store only the present-column indices, saving roughly 50% on the int[] when about half the columns are absent, and more as the absent fraction grows. This trade-off only matters for very wide schemas and is left as a future optimisation.
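A rough sketch of the bitset encoding described above, using java.util.BitSet. Since this is a proposed optimisation rather than existing behaviour, every name here (Encoded, encode, decode, encodedBytes) is hypothetical, and the size estimate ignores any length prefixes a real wire format would add.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class BitsetEncodingSketch {
    /** Presence flags plus the local indices of present columns only, in unified-column order. */
    static final class Encoded {
        final BitSet present;
        final int[] presentLocalIndices;

        Encoded(BitSet present, int[] presentLocalIndices) {
            this.present = present;
            this.presentLocalIndices = presentLocalIndices;
        }
    }

    /** Splits an int[] that uses -1 for "missing" into flags and a dense index list. */
    static Encoded encode(int[] localIndices) {
        BitSet present = new BitSet(localIndices.length);
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < localIndices.length; i++) {
            if (localIndices[i] >= 0) {
                present.set(i);
                kept.add(localIndices[i]);
            }
        }
        return new Encoded(present, kept.stream().mapToInt(Integer::intValue).toArray());
    }

    /** Rebuilds the original int[], restoring -1 for every column whose flag is clear. */
    static int[] decode(Encoded e, int columnCount) {
        int[] out = new int[columnCount];
        int next = 0;
        for (int i = 0; i < columnCount; i++) {
            out[i] = e.present.get(i) ? e.presentLocalIndices[next++] : -1;
        }
        return out;
    }

    /** ceil(N/8) flag bytes plus 4 bytes per present column. */
    static int encodedBytes(int columnCount, int presentCount) {
        return (columnCount + 7) / 8 + 4 * presentCount;
    }
}
```

For N = 200 with 50 present columns, this estimate gives 25 + 200 = 225 bytes, against 800 bytes for the plain int[].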

  • Constructor Details

  • Method Details

    • writeTo

      public void writeTo(StreamOutput out) throws IOException
      Specified by:
      writeTo in interface Writeable
      Throws:
      IOException
    • columnCount

      public int columnCount()
    • localIndex

      public int localIndex(int globalIndex)
    • cast

      @Nullable public DataType cast(int globalIndex)
    • hasMissingColumns

      public boolean hasMissingColumns()
    • hasCasts

      public boolean hasCasts()
    • isIdentity

      public boolean isIdentity()
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
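
The accessors listed above carry no descriptions on this page, so the following is only a guess at how a reader might consume them. Mapping is a hypothetical mirror of columnCount, localIndex and cast; casts are modelled as plain strings instead of DataType, and -1 is taken to mean "missing", as stated in the class description.

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnMappingUsageSketch {
    /** Hypothetical mirror of the accessors; not the real ColumnMapping. */
    static final class Mapping {
        private final int[] localIndices; // -1 marks a column missing from the file
        private final String[] casts;     // null entry: no cast needed

        Mapping(int[] localIndices, String[] casts) {
            this.localIndices = localIndices;
            this.casts = casts;
        }

        int columnCount() { return localIndices.length; }
        int localIndex(int globalIndex) { return localIndices[globalIndex]; }
        String cast(int globalIndex) { return casts[globalIndex]; }
    }

    /** Describes, per unified column, what a reader would have to do for this file. */
    static List<String> plan(Mapping m) {
        List<String> steps = new ArrayList<>();
        for (int g = 0; g < m.columnCount(); g++) {
            int local = m.localIndex(g);
            if (local < 0) {
                steps.add("fill nulls");
            } else if (m.cast(g) != null) {
                steps.add("read " + local + " cast to " + m.cast(g));
            } else {
                steps.add("read " + local);
            }
        }
        return steps;
    }
}
```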