Class IcebergSourceOperatorFactory

java.lang.Object
org.elasticsearch.xpack.esql.datasource.iceberg.IcebergSourceOperatorFactory
All Implemented Interfaces:
Describable, Operator.OperatorFactory, SourceOperator.SourceOperatorFactory

public class IcebergSourceOperatorFactory extends Object implements SourceOperator.SourceOperatorFactory
Factory for creating async source operators for Iceberg tables.

This factory creates operators that read data from Iceberg tables or Parquet files using:

  • Iceberg's ArrowReader for efficient vectorized columnar data reading
  • Arrow format (VectorSchemaRoot) for in-memory representation
  • Background executor thread to avoid blocking the Driver during S3 I/O

Each operator gets:

  • A shared buffer for pages
  • A background reader task that fills the buffer
  • An executor to run the background task
  • Constructor Details

    • IcebergSourceOperatorFactory

      public IcebergSourceOperatorFactory(Executor executor, String tablePath, S3Configuration s3Config, String sourceType, org.apache.iceberg.expressions.Expression filter, org.apache.iceberg.Schema schema, List<Attribute> attributes, int pageSize, int maxBufferSize)
      Parameters:
      executor - Executor for running background S3/Iceberg reads
      tablePath - Path to Iceberg table or Parquet file
      s3Config - S3 configuration (credentials, endpoint, region)
      sourceType - Type of source ("iceberg" or "parquet")
      filter - Iceberg filter expression (nullable)
      schema - Iceberg schema
      attributes - ESQL attributes (schema)
      pageSize - Number of rows per page (batch size for Vectorized Reader)
      maxBufferSize - Maximum number of pages to buffer
  • Method Details