Class ParquetStorageObjectAdapter
java.lang.Object
org.elasticsearch.xpack.esql.datasource.parquet.ParquetStorageObjectAdapter
- All Implemented Interfaces:
org.apache.parquet.io.InputFile
Adapter that wraps a StorageObject to implement Parquet's InputFile interface.
This allows using our storage abstraction with Parquet's ParquetFileReader.
Key features:
- Uses only range reads (newStream(position, length)), never a full-object GET
- Sliding window cache (default 4MB) to amortize seeks and avoid InputStream.skip
- Optimized for remote storage (S3, HTTP) where full GET and skip-download are expensive
- No Hadoop dependencies — uses pure Java InputStream
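The sliding-window idea in the list above can be illustrated with a minimal, self-contained sketch. It is not the adapter's actual implementation: RangeSource is a hypothetical stand-in for StorageObject, and WindowedReader, inMemory, and the rangeReads counter are invented here for illustration only. The sketch assumes a single cached window that is refilled by one range read whenever a requested position falls outside it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SlidingWindowSketch {
    /** Hypothetical stand-in for StorageObject: only range reads are available. */
    interface RangeSource {
        long length() throws IOException;
        InputStream newStream(long position, long length) throws IOException;
    }

    /** Caches one contiguous window; a read outside it triggers a single new range read. */
    static final class WindowedReader {
        private final RangeSource source;
        private final int windowSize;
        private byte[] window = new byte[0];
        private long windowStart = 0;
        int rangeReads = 0; // exposed only so the sketch can show how many fetches happened

        WindowedReader(RangeSource source, int windowSize) {
            this.source = source;
            this.windowSize = windowSize;
        }

        /** Returns the byte (0..255) at an absolute position, refilling the window on a miss. */
        int readAt(long position) throws IOException {
            if (position < 0 || position >= source.length()) {
                throw new IOException("position out of bounds: " + position);
            }
            if (position < windowStart || position >= windowStart + window.length) {
                fill(position);
            }
            return window[(int) (position - windowStart)] & 0xFF;
        }

        /** Fetches up to windowSize bytes starting at position with one range read. */
        private void fill(long position) throws IOException {
            long len = Math.min(windowSize, source.length() - position);
            try (InputStream in = source.newStream(position, len)) {
                window = in.readNBytes((int) len);
            }
            if (window.length != len) {
                throw new IOException("short read at position " + position);
            }
            windowStart = position;
            rangeReads++;
        }
    }

    /** In-memory RangeSource used only for demonstration. */
    static RangeSource inMemory(byte[] data) {
        return new RangeSource() {
            public long length() {
                return data.length;
            }

            public InputStream newStream(long position, long length) {
                return new ByteArrayInputStream(data, (int) position, (int) length);
            }
        };
    }
}
```

In this sketch a backward seek simply refills the window at the new position instead of calling InputStream.skip, so every cache miss costs exactly one range read.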
-
Constructor Summary
Constructors
Constructor    Description
ParquetStorageObjectAdapter(StorageObject storageObject)    Creates an adapter with the default 4MB sliding window.
-
Method Summary
Modifier and Type    Method    Description
static ParquetStorageObjectAdapter    forRange(StorageObject storageObject, long rangeBytes)    Creates an adapter with an adaptive window sized to cover the given byte range.
long    getLength()
org.apache.parquet.io.SeekableInputStream    newStream()
-
Constructor Details
-
ParquetStorageObjectAdapter
Creates an adapter with the default 4MB sliding window.
-
Method Details
-
forRange
Creates an adapter with an adaptive window sized to cover the given byte range. This allows all column chunks within a small row-group split to be fetched in a single I/O instead of incurring multiple range GETs with the default 4 MiB window.
- Parameters:
rangeBytes - byte span of the range being read; clamped to [DEFAULT_WINDOW_SIZE, MAX_WINDOW_SIZE]
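The clamping described for rangeBytes can be sketched as follows. This is an illustration under stated assumptions, not the class's code: the 4 MiB default comes from the description above, while the 64 MiB maximum and the windowSizeFor helper are hypothetical, because the page does not state the value of MAX_WINDOW_SIZE.

```java
public class WindowSizing {
    // 4 MiB default, as stated in the description above.
    static final long DEFAULT_WINDOW_SIZE = 4L * 1024 * 1024;
    // Hypothetical upper bound; the real MAX_WINDOW_SIZE is not given on this page.
    static final long MAX_WINDOW_SIZE = 64L * 1024 * 1024;

    /** Clamps the requested byte span to [DEFAULT_WINDOW_SIZE, MAX_WINDOW_SIZE]. */
    static long windowSizeFor(long rangeBytes) {
        return Math.max(DEFAULT_WINDOW_SIZE, Math.min(rangeBytes, MAX_WINDOW_SIZE));
    }
}
```

Under these assumed bounds, a 10 MiB split gets a 10 MiB window and is covered by one fetch, a span smaller than the default still gets the 4 MiB default, and a very large span is capped at the maximum.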
-
getLength
- Specified by:
getLength in interface org.apache.parquet.io.InputFile
- Throws:
IOException
-
newStream
- Specified by:
newStream in interface org.apache.parquet.io.InputFile
- Throws:
IOException
-