Class IndexInputUtils

java.lang.Object
org.elasticsearch.simdvec.internal.IndexInputUtils

public final class IndexInputUtils extends Object
IndexInputUtils relies on preview features of the Java platform:
  • IndexInputUtils refers to one or more preview APIs: MemorySegment.
Programs can only use IndexInputUtils when preview features are enabled.
Preview features may be removed in a future release, or upgraded to permanent features of the Java platform.
Utility for obtaining a MemorySegment view of data in an IndexInput and passing it to a caller-supplied action. The segment may come from a MemorySegmentAccessInput (mmap), a direct ByteBuffer view (e.g. blob-cache), or a heap copy as a last resort.

All resource management (ref-counting, buffer release) is handled internally — callers never see a closeable resource.

  • Method Details

    • canUseSegmentSlices

      public static boolean canUseSegmentSlices(org.apache.lucene.store.IndexInput input)
      Returns true if MemorySegment slices can be obtained from the specified IndexInput.
    • withSlice

      public static <R> R withSlice(org.apache.lucene.store.IndexInput in, long length, IntFunction<byte[]> scratchSupplier, org.elasticsearch.core.CheckedFunction<MemorySegment,R,IOException> action) throws IOException
      Obtains a memory segment for the next length bytes of the index input, passes it to action, and returns the result. The position of the index input is advanced by length.

      This method first tries to obtain a slice via MemorySegmentAccessInput.segmentSliceOrNull(long, long). If that returns null, it tries a direct ByteBuffer view via DirectAccessInput. As a last resort it copies the data onto the heap using a byte array obtained from scratchSupplier.

      The memory segment passed to action is valid only for the duration of the call. Callers must not retain references to it.

      Parameters:
      in - the index input positioned at the data to read
      length - the number of bytes to read
      scratchSupplier - supplies a byte array of at least the requested length, used only on the heap-copy fallback path
      action - the function to apply to the memory segment
      Returns:
      the result of applying action
      Throws:
      IOException
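
      The fallback order described above can be sketched without Lucene or Elasticsearch on the classpath. In this minimal illustration every type and name (SliceFallbackSketch, Input, MappedView, DirectView, HeapOnlyInput) is hypothetical, and ByteBuffer stands in for MemorySegment so that the code compiles without preview features. It shows the shape of the technique under those assumptions and is not the actual implementation.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;
import java.util.function.IntFunction;

// Hypothetical stand-ins: none of these types exist in Lucene or Elasticsearch.
final class SliceFallbackSketch {

    /** Plays the role of IndexInput: a readable input with a current position. */
    interface Input {
        long position();
        void skip(long n);
        void readBytes(byte[] dst, int len); // copies len bytes, advancing the position
    }

    /** Plays the role of MemorySegmentAccessInput; returns null if no slice is available. */
    interface MappedView {
        ByteBuffer sliceOrNull(long pos, long len);
    }

    /** Plays the role of DirectAccessInput; returns null if no direct view is available. */
    interface DirectView {
        ByteBuffer directOrNull(long pos, long len);
    }

    static <R> R withSlice(Input in, long length, IntFunction<byte[]> scratchSupplier,
                           Function<ByteBuffer, R> action) {
        long pos = in.position();
        ByteBuffer view = null;
        if (in instanceof MappedView) {                    // 1. mapped slice
            view = ((MappedView) in).sliceOrNull(pos, length);
        }
        if (view == null && in instanceof DirectView) {    // 2. direct buffer view
            view = ((DirectView) in).directOrNull(pos, length);
        }
        if (view != null) {
            in.skip(length);                               // zero-copy paths still advance
            return action.apply(view);
        }
        int len = Math.toIntExact(length);                 // 3. heap copy, last resort
        byte[] scratch = scratchSupplier.apply(len);
        in.readBytes(scratch, len);
        return action.apply(ByteBuffer.wrap(scratch, 0, len));
    }

    /** Array-backed input with no zero-copy capability, so it always takes the heap path. */
    static final class HeapOnlyInput implements Input {
        private final byte[] data;
        private int pos;

        HeapOnlyInput(byte[] data) { this.data = data; }

        public long position() { return pos; }
        public void skip(long n) { pos += Math.toIntExact(n); }
        public void readBytes(byte[] dst, int len) {
            System.arraycopy(data, pos, dst, 0, len);
            pos += len;
        }
    }
}
```

      In this sketch the scratch supplier is consulted only on the third path, which mirrors the documented contract that scratchSupplier is used only for the heap-copy fallback. A caller could pass byte[]::new, or a supplier that reuses one buffer across calls.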
    • withSlices

      public static <R> R withSlices(org.apache.lucene.store.IndexInput in, long[] offsets, int length, int count, IntFunction<byte[]> scratchSupplier, org.elasticsearch.core.CheckedFunction<IntFunction<MemorySegment>,R,IOException> action) throws IOException
      Bulk variant of withSlice(org.apache.lucene.store.IndexInput, long, java.util.function.IntFunction<byte[]>, org.elasticsearch.core.CheckedFunction<java.lang.foreign.MemorySegment, R, java.io.IOException>). Resolves count byte ranges at the given file offsets to MemorySegments and passes a resolver function to the action. The resolver maps an index in [0..count) to the corresponding segment.

      This method first tries MemorySegmentAccessInput: if a single contiguous segment covers the whole input, each slice is derived from it with no per-slice allocation. Next it tries DirectAccessInput.withByteBufferSlices(long[], int, int, org.elasticsearch.core.CheckedConsumer<java.nio.ByteBuffer[], java.io.IOException>). As a last resort it copies each range onto the heap.

      The segments provided by the resolver are valid only for the duration of the action. Callers must not retain references to them.

      Parameters:
      in - the index input
      offsets - file byte offsets for each range
      length - byte length of each range (same for all)
      count - number of ranges
      scratchSupplier - supplies a byte array for the heap-copy fallback
      action - receives a function mapping index to MemorySegment
      Returns:
      the result of applying action
      Throws:
      IOException
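
      The resolver contract can be illustrated with a small, self-contained sketch. BulkSliceSketch and its method are hypothetical, ByteBuffer again stands in for MemorySegment, and only the contiguous-backing path is modeled: each slice is a view over one shared buffer, and the resolver refuses to work once the action has returned. The invalidation flag is an illustrative device, not a description of how the real method enforces validity.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;
import java.util.function.IntFunction;

// Hypothetical illustration; ByteBuffer stands in for MemorySegment.
final class BulkSliceSketch {

    static <R> R withSlices(ByteBuffer whole, long[] offsets, int length, int count,
                            Function<IntFunction<ByteBuffer>, R> action) {
        boolean[] open = { true };
        IntFunction<ByteBuffer> resolver = i -> {
            if (!open[0]) {
                throw new IllegalStateException("resolver used after the action returned");
            }
            if (i < 0 || i >= count) {
                throw new IndexOutOfBoundsException("index " + i + " not in [0.." + count + ")");
            }
            // Derive a view over the shared backing buffer; no bytes are copied.
            int start = Math.toIntExact(offsets[i]);
            ByteBuffer dup = whole.duplicate();
            dup.clear();                  // position 0, limit = capacity
            dup.limit(start + length);
            dup.position(start);
            return dup.slice();
        };
        try {
            return action.apply(resolver);
        } finally {
            // Slices already handed out are not revoked in this sketch; only the resolver is.
            open[0] = false;
        }
    }
}
```

      Passing a resolver rather than an array of segments lets the action ask only for the ranges it needs, in whatever order it needs them.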
    • withSliceAddresses

      public static boolean withSliceAddresses(org.apache.lucene.store.IndexInput in, long[] offsets, int length, int count, org.elasticsearch.core.CheckedConsumer<MemorySegment,IOException> action) throws IOException
      Resolves count file ranges to native memory addresses and passes the address array to the action. Tries MemorySegmentAccessInput first (contiguous segment, pointer arithmetic), then DirectAccessInput (withByteBufferSlices). Returns false without invoking the action if neither path is available; there is no heap fallback, since native addresses are required.

      Memory safety: The addresses passed to the action are raw native pointers extracted via MemorySegment.address(). The native code that consumes them (e.g. a bulk-gather FFI downcall) will dereference these pointers directly; there is no scope or bounds check at that point. The backing memory must therefore remain valid for the entire duration of the action.

      With the current callers, the backing memory is independently kept alive. On the MemorySegmentAccessInput (MSAI) path, the arena is owned by the IndexInput, which the caller holds as a field. On the DirectAccessInput (DAI) path, cache regions are ref-counted by DirectAccessInput.withByteBufferSlices(long[], int, int, org.elasticsearch.core.CheckedConsumer<java.nio.ByteBuffer[], java.io.IOException>) for the duration of the callback. However, that safety relies on implementation details of MMapDirectory and SharedBlobCacheService. The JIT is also permitted to discard local references after their last use (JLS 12.6.1), which could in theory allow the segment or buffer objects to be collected while the native call is in flight. The Reference.reachabilityFence(java.lang.Object) calls in the implementation are therefore low-cost defensive insurance: they make the lifetime contract explicit and protect against future changes to either the callers or the backing-memory implementations.

      Parameters:
      in - the index input
      offsets - file byte offsets for each range (caller-owned, not modified)
      length - byte length of each range (same for all)
      count - number of ranges to resolve
      action - receives a MemorySegment of count pointer-width addresses, allocated from a confined arena that is closed after the action returns
      Returns:
      true if addresses were resolved and the action was invoked
      Throws:
      IOException
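
      The lifetime pattern described above can be shown in miniature. The sketch below is hypothetical throughout (AddressFenceSketch and Mapping are invented names), involves no real native memory, and hands the action a long[] rather than a MemorySegment of addresses. It illustrates three points from the description under those assumptions: addresses are derived by pointer arithmetic from one contiguous base, the action is skipped and false is returned when no zero-copy backing exists, and Reference.reachabilityFence keeps the backing object reachable until the action has finished.

```java
import java.lang.ref.Reference;
import java.util.function.Consumer;

// Hypothetical illustration of the lifetime pattern; no real native memory is involved.
final class AddressFenceSketch {

    /** Plays the role of a contiguous mapping: a base address plus a size in bytes. */
    static final class Mapping {
        final long baseAddress;
        final long byteSize;

        Mapping(long baseAddress, long byteSize) {
            this.baseAddress = baseAddress;
            this.byteSize = byteSize;
        }
    }

    static boolean withSliceAddresses(Mapping mappingOrNull, long[] offsets, int length,
                                      int count, Consumer<long[]> action) {
        if (mappingOrNull == null) {
            return false; // no zero-copy backing: the action is not invoked
        }
        long[] addrs = new long[count];
        for (int i = 0; i < count; i++) {
            long off = offsets[i];
            if (off < 0 || off > mappingOrNull.byteSize - length) {
                throw new IndexOutOfBoundsException("range " + i + " is outside the mapping");
            }
            addrs[i] = mappingOrNull.baseAddress + off; // pointer arithmetic on the base
        }
        try {
            action.accept(addrs);
        } finally {
            // Keeps the mapping object strongly reachable until at least this point,
            // so it cannot be collected while the action is still using the addresses.
            Reference.reachabilityFence(mappingOrNull);
        }
        return true;
    }
}
```

      Placing the fence in a finally block means it also covers the case where the action throws.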
    • checkInputType

      public static void checkInputType(org.apache.lucene.store.IndexInput in)
      Checks that a FilterIndexInput wrapper also implements MemorySegmentAccessInput or DirectAccessInput, so that zero-copy access is preserved through the wrapper chain.
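
      A check of this kind might look like the following sketch. All types here are hypothetical stand-ins, and the choice to report a failure by throwing IllegalArgumentException is an assumption made for illustration; the documentation above does not say how the real method signals a failed check.

```java
// Hypothetical stand-ins; the real types live in Lucene and Elasticsearch.
final class WrapperCheckSketch {

    interface Input {}
    interface SegmentAccess {}  // plays the role of MemorySegmentAccessInput
    interface DirectAccess {}   // plays the role of DirectAccessInput

    /** Plays the role of FilterIndexInput: forwards to a wrapped input. */
    static class FilterInput implements Input {
        final Input delegate;

        FilterInput(Input delegate) { this.delegate = delegate; }
    }

    /** A wrapper that re-exposes zero-copy access and therefore passes the check. */
    static final class ForwardingFilter extends FilterInput implements SegmentAccess {
        ForwardingFilter(Input delegate) { super(delegate); }
    }

    // Assumption: failure is reported with an exception; the source does not say how.
    static void checkInputType(Input in) {
        boolean zeroCopy = in instanceof SegmentAccess || in instanceof DirectAccess;
        if (in instanceof FilterInput && !zeroCopy) {
            throw new IllegalArgumentException(
                "filter wrapper hides zero-copy access: " + in.getClass().getName());
        }
    }
}
```

      The point of such a check is that a plain filter wrapper would otherwise silently push every read onto the heap-copy path, even when the wrapped input supports zero-copy access.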