Interface DecompressionCodec

All Known Subinterfaces:
IndexedDecompressionCodec, SplittableDecompressionCodec

public interface DecompressionCodec
Service provider interface (SPI) for decompression codecs. Plugins implement this interface to provide decompression for compound extensions such as .csv.gz or .ndjson.gz.

When a path ends with a known compression extension (e.g. .gz), the framework strips it, resolves the inner format, and wraps the raw StorageObject stream with decompress(InputStream) before delegating to the inner format reader.

Stream-only codecs (gzip, zstd) do not support random access, so formats that require StorageObject.newStream(long, long) (e.g. Parquet, ORC) are not supported for compressed files.
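The resolution step described above can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the class CodecResolutionSketch, its methods findCodec, innerPath and open, and the nested Gzip codec are hypothetical names, the interface is redeclared locally so the example compiles on its own, and matching is assumed to be a case-sensitive suffix check.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.Optional;
import java.util.zip.GZIPInputStream;

// Local stand-in for the interface documented on this page, so the sketch compiles on its own.
interface DecompressionCodec {
    String name();
    List<String> extensions();
    InputStream decompress(InputStream raw) throws IOException;
}

// Hypothetical helper; the framework's real resolution code is not shown in this documentation.
final class CodecResolutionSketch {

    // Minimal gzip codec used only to exercise the sketch.
    static final class Gzip implements DecompressionCodec {
        @Override public String name() { return "gzip"; }
        @Override public List<String> extensions() { return List.of(".gz", ".gzip"); }
        @Override public InputStream decompress(InputStream raw) throws IOException {
            return new GZIPInputStream(raw);
        }
    }

    // Returns the first codec one of whose extensions matches the end of the path.
    static Optional<DecompressionCodec> findCodec(String path, List<DecompressionCodec> codecs) {
        for (DecompressionCodec codec : codecs) {
            for (String ext : codec.extensions()) {
                if (path.length() > ext.length() && path.endsWith(ext)) {
                    return Optional.of(codec);
                }
            }
        }
        return Optional.empty();
    }

    // Strips a known compression extension: "data.csv.gz" becomes "data.csv".
    // Paths without a known compression extension are returned unchanged.
    static String innerPath(String path, List<DecompressionCodec> codecs) {
        for (DecompressionCodec codec : codecs) {
            for (String ext : codec.extensions()) {
                if (path.length() > ext.length() && path.endsWith(ext)) {
                    return path.substring(0, path.length() - ext.length());
                }
            }
        }
        return path;
    }

    // Wraps the raw stream when a codec matches; otherwise passes it through untouched.
    // The returned stream is sequential only, which is why ranged reads are unavailable.
    static InputStream open(String path, InputStream raw, List<DecompressionCodec> codecs)
            throws IOException {
        Optional<DecompressionCodec> codec = findCodec(path, codecs);
        return codec.isPresent() ? codec.get().decompress(raw) : raw;
    }
}
```

In this sketch, the inner format reader would be chosen from innerPath(...) and would then read from the stream returned by open(...).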

  • Method Summary

    Modifier and Type    Method                         Description
    InputStream          decompress(InputStream raw)    Wraps the raw compressed input stream with a decompressing stream.
    List<String>         extensions()                   File extensions this codec handles (with leading dot, e.g. [".gz", ".gzip"]).
    String               name()                         Codec name for logging and diagnostics (e.g. "gzip").
  • Method Details

    • name

      String name()
      Codec name for logging and diagnostics (e.g. "gzip").
    • extensions

      List<String> extensions()
      File extensions this codec handles (with leading dot, e.g. [".gz", ".gzip"]).
    • decompress

      InputStream decompress(InputStream raw) throws IOException
      Wraps the raw compressed input stream with a decompressing stream.
      Parameters:
      raw - the compressed input stream
      Returns:
      an input stream that yields decompressed bytes
      Throws:
      IOException
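
As an illustration of the three methods above, a gzip codec might look something like the sketch below. It assumes the JDK's java.util.zip.GZIPInputStream is an acceptable decompressor. The class name GzipCodecSketch and the helpers gzip and roundTrip are hypothetical, the interface is redeclared locally so the example compiles on its own, and how a plugin registers its codec with the framework is not covered here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Local stand-in mirroring the three methods documented above.
interface DecompressionCodec {
    String name();
    List<String> extensions();
    InputStream decompress(InputStream raw) throws IOException;
}

// Hypothetical gzip codec built on the JDK's java.util.zip classes.
final class GzipCodecSketch implements DecompressionCodec {

    @Override
    public String name() {
        return "gzip";
    }

    @Override
    public List<String> extensions() {
        // Leading dot included, as the extensions() contract describes.
        return List.of(".gz", ".gzip");
    }

    @Override
    public InputStream decompress(InputStream raw) throws IOException {
        // GZIPInputStream reads the gzip header eagerly and throws an
        // IOException subclass if the stream is not in gzip format.
        return new GZIPInputStream(raw);
    }

    // Demo helper: compresses text in memory to produce gzip bytes.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // Demo helper: compresses, then decompresses through the codec.
    static String roundTrip(String text) throws IOException {
        DecompressionCodec codec = new GzipCodecSketch();
        try (InputStream in = codec.decompress(new ByteArrayInputStream(gzip(text)))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

Because decompress only wraps the stream it is given, closing the returned stream also closes the underlying raw stream.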