Class ParquetFormatReader
java.lang.Object
org.elasticsearch.xpack.esql.datasource.parquet.ParquetFormatReader
- All Implemented Interfaces:
Closeable, AutoCloseable, FormatReader, RangeAwareFormatReader
FormatReader implementation for Parquet files.
Uses Parquet's native ParquetFileReader with our StorageObject abstraction. Produces ESQL Page batches directly without requiring Arrow as an intermediate format.
Key features:
- Works with any StorageProvider (HTTP, S3, local)
- Efficient columnar reading with column projection
- No Hadoop dependencies in the core path
- Direct conversion from Parquet to ESQL blocks
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.elasticsearch.xpack.esql.datasources.spi.FormatReader
FormatReader.SchemaResolution
Nested classes/interfaces inherited from interface org.elasticsearch.xpack.esql.datasources.spi.RangeAwareFormatReader
RangeAwareFormatReader.SplitRange
-
Field Summary
Fields inherited from interface org.elasticsearch.xpack.esql.datasources.spi.FormatReader
NO_LIMIT
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type, Method, Description
- void close()
- List<RangeAwareFormatReader.SplitRange> discoverSplitRanges(StorageObject object)
- metadata(StorageObject object)
- CloseableIterator<Page> read(StorageObject object, FormatReadContext context)
- CloseableIterator<Page> readRange(StorageObject object, List<String> projectedColumns, int batchSize, long rangeStart, long rangeEnd, List<Attribute> resolvedAttributes, ErrorPolicy errorPolicy): Reads only row groups whose starting position falls within [rangeStart, rangeEnd).
- withPushedFilter(Object pushedFilter)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.elasticsearch.xpack.esql.datasources.spi.FormatReader
defaultErrorPolicy, defaultSchemaResolution, read, readAsync, schema, supportsNativeAsync, withConfig, withSchema
-
Constructor Details
-
ParquetFormatReader
-
-
Method Details
-
withPushedFilter
- Specified by:
withPushedFilter in interface FormatReader
-
filterPushdownSupport
- Specified by:
filterPushdownSupport in interface FormatReader
-
metadata
- Specified by:
metadata in interface FormatReader
- Throws:
IOException
-
read
public CloseableIterator<Page> read(StorageObject object, FormatReadContext context) throws IOException
- Specified by:
read in interface FormatReader
- Throws:
IOException
-
aggregatePushdownSupport
- Specified by:
aggregatePushdownSupport in interface FormatReader
-
formatName
- Specified by:
formatName in interface FormatReader
-
fileExtensions
- Specified by:
fileExtensions in interface FormatReader
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-
discoverSplitRanges
public List<RangeAwareFormatReader.SplitRange> discoverSplitRanges(StorageObject object) throws IOException
- Specified by:
discoverSplitRanges in interface RangeAwareFormatReader
- Throws:
IOException
-
readRange
public CloseableIterator<Page> readRange(StorageObject object, List<String> projectedColumns, int batchSize, long rangeStart, long rangeEnd, List<Attribute> resolvedAttributes, ErrorPolicy errorPolicy) throws IOException
Reads only row groups whose starting position falls within [rangeStart, rangeEnd). The errorPolicy argument is accepted for interface compliance but is not applied, because Parquet errors are structural (corrupt page, schema mismatch) rather than row-level.
- Specified by:
readRange in interface RangeAwareFormatReader
- Throws:
IOException
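The half-open range rule documented for readRange can be illustrated with a minimal sketch. This is not the reader's actual implementation: the class RowGroupRangeSketch and the method selectRowGroupStarts are hypothetical names, and row-group starting positions are modeled as plain long values rather than real Parquet metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowGroupRangeSketch {

    /**
     * Hypothetical helper: keeps the row-group starting positions that lie in
     * the half-open interval [rangeStart, rangeEnd).
     */
    static List<Long> selectRowGroupStarts(List<Long> rowGroupStarts, long rangeStart, long rangeEnd) {
        List<Long> selected = new ArrayList<>();
        for (long start : rowGroupStarts) {
            // Inclusive lower bound, exclusive upper bound.
            if (start >= rangeStart && start < rangeEnd) {
                selected.add(start);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Assumed starting positions of three row groups in one file.
        List<Long> starts = Arrays.asList(4L, 1_000_000L, 2_000_000L);

        // Two adjacent ranges that tile the file. The row group starting
        // exactly at 1_000_000 belongs to the second range only.
        System.out.println(selectRowGroupStarts(starts, 0L, 1_000_000L));         // prints [4]
        System.out.println(selectRowGroupStarts(starts, 1_000_000L, 3_000_000L)); // prints [1000000, 2000000]
    }
}
```

Because the interval is half-open, adjacent ranges that tile a file assign each row group to exactly one range, so a row group is neither read twice nor skipped when ranges are processed independently.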
-