- All Implemented Interfaces:
Closeable,AutoCloseable,org.apache.lucene.util.Accountable,BytesRefHashTable,org.elasticsearch.core.Releasable
int ids to BytesRefs, vending the ids in order they are added.
At it's core there are two hash table implementations, a "small core" and a "big core". The "small core" is a simple open addressed hash table with a fixed 60% load factor and a table of 2048. It quite quick because it has a fixed size and never grows.
When the "small core" has more entries than it's load factor the "small core"
is replaced with a "big core". The "big core" functions quite similarly to
a Swisstable, Google's
fancy SIMD hash table. In this table there's a contiguous array of "control"
bytes that are either 0b1111_1111 for empty entries or
0b0aaa_aaaa for populated entries, where aaa_aaaa are the top
7 bytes of the hash. To find an entry by key you hash it, grab the top 7 bytes
or it, and perform a SIMD read of the control array starting at the expected
slot. We use the widest SIMD instruction the CPU supports, meaning 64 or 32
bytes. If any of those match we check the actual key. So instead of scanning
one slot at a time "small core", we effectively scan a whole bunch at once.
This allows us to run a much higher load factor (87.5%) without any performance
penalty so the extra byte feels super worth it.
When a "big core" fills it's table to the fill factor, we build a new "big core" nd read all values in the old "big core" into the new one.
This class does not store the keys in the hash table slots. Instead, it
uses a BytesRefArray to store the actual bytes, and the hash table
slots store the id which indexes into the BytesRefArray.
-
Nested Class Summary
Nested ClassesNested classes/interfaces inherited from class org.elasticsearch.swisshash.SwissHash
SwissHash.Status -
Field Summary
Fields inherited from class org.elasticsearch.swisshash.SwissHash
breaker, capacity, growCount, mask, nextGrowSize, recycler, sizeFields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE -
Method Summary
Modifier and TypeMethodDescriptionlongadd(org.apache.lucene.util.BytesRef key) Adds akey, returning itsid.voidclose()longfind(org.apache.lucene.util.BytesRef key) Finds anidby akey.org.apache.lucene.util.BytesRefget(long id, org.apache.lucene.util.BytesRef dest) Returns the key at0 <= id <= size().Returns the key array.iterator()Build an iterator to walk all values and ids.longstatus()Performance information hopefully useful for debugging.Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitMethods inherited from interface org.apache.lucene.util.Accountable
getChildResourcesMethods inherited from interface org.elasticsearch.common.util.BytesRefHashTable
size
-
Method Details
-
find
public long find(org.apache.lucene.util.BytesRef key) Finds anidby akey.- Specified by:
findin interfaceBytesRefHashTable
-
add
public long add(org.apache.lucene.util.BytesRef key) Adds akey, returning itsid. If it was already present it's previous assignedidwill be returned. If it wasn't present it'll be assigned a newid.- Specified by:
addin interfaceBytesRefHashTable
-
status
Description copied from class:SwissHashPerformance information hopefully useful for debugging. -
iterator
Description copied from class:SwissHashBuild an iterator to walk all values and ids. -
close
public void close()- Specified by:
closein interfaceAutoCloseable- Specified by:
closein interfaceCloseable- Specified by:
closein interfaceorg.elasticsearch.core.Releasable
-
get
public org.apache.lucene.util.BytesRef get(long id, org.apache.lucene.util.BytesRef dest) Returns the key at0 <= id <= size(). The result is undefined if the id is unused.- Specified by:
getin interfaceBytesRefHashTable- Parameters:
id- the id returned when the key was added- Returns:
- the key
-
getBytesRefs
Returns the key array.- Specified by:
getBytesRefsin interfaceBytesRefHashTable
-
ramBytesUsed
public long ramBytesUsed()- Specified by:
ramBytesUsedin interfaceorg.apache.lucene.util.Accountable
-