java.lang.Object
org.elasticsearch.tdigest.TDigest
org.elasticsearch.tdigest.AbstractTDigest
org.elasticsearch.tdigest.HybridDigest
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.lucene.util.Accountable, Releasable, TDigestReadView
Uses a SortingDigest implementation under the covers for small sample populations, then switches to MergingDigest.
The SortingDigest is perfectly accurate and the fastest implementation for up to millions of samples, at the cost of an increased
memory footprint because it tracks all samples. Conversely, the MergingDigest pre-allocates its memory (tens of KBs) and provides
better performance for hundreds of millions of samples and more, while its error stays bounded to 0.1-1% in most cases.
This hybrid approach provides the best of both worlds: fast and accurate percentile calculations for small populations, and
bounded memory allocation with acceptable speed and accuracy for larger ones.
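The switch-over strategy described above can be illustrated with a minimal, self-contained sketch. It is not the HybridDigest implementation: the class name `HybridSketch`, the `threshold` parameter, and the fixed value range are all hypothetical, and a plain equal-width histogram stands in for MergingDigest purely to show the "exact first, bounded memory later" idea.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; NOT org.elasticsearch.tdigest.HybridDigest.
class HybridSketch {
    private final int threshold;   // assumed sample count that triggers the switch
    private final long[] bins;     // fixed-size histogram standing in for MergingDigest
    private final double lo;       // assumed lower bound of the value range
    private final double hi;       // assumed upper bound of the value range
    private List<Double> exact = new ArrayList<>(); // stand-in for SortingDigest
    private long size;

    HybridSketch(int threshold, int binCount, double lo, double hi) {
        this.threshold = threshold;
        this.bins = new long[binCount];
        this.lo = lo;
        this.hi = hi;
    }

    void add(double x) {
        size++;
        if (exact != null) {
            exact.add(x);                 // exact phase: keep every sample
            if (exact.size() > threshold) {
                switchToApproximate();
            }
        } else {
            bins[binOf(x)]++;             // approximate phase: constant memory
        }
    }

    // Replays the stored samples into the histogram, then drops the sample list.
    private void switchToApproximate() {
        for (double v : exact) {
            bins[binOf(v)]++;
        }
        exact = null;
    }

    // Maps a value to a bin index, clamping values outside [lo, hi).
    private int binOf(double x) {
        int i = (int) ((x - lo) * bins.length / (hi - lo));
        return Math.max(0, Math.min(bins.length - 1, i));
    }

    boolean isExact() { return exact != null; }

    long size() { return size; }

    double quantile(double q) {
        if (size == 0) return Double.NaN;
        long rank = Math.max(1, (long) Math.ceil(q * size));
        if (exact != null) {
            // Exact answer: sort a copy and pick the sample at the requested rank.
            List<Double> sorted = new ArrayList<>(exact);
            Collections.sort(sorted);
            return sorted.get((int) Math.min(rank, sorted.size()) - 1);
        }
        // Approximate answer: midpoint of the bin that contains the requested rank.
        long seen = 0;
        for (int i = 0; i < bins.length; i++) {
            seen += bins[i];
            if (seen >= rank) {
                return lo + (i + 0.5) * (hi - lo) / bins.length;
            }
        }
        return hi;
    }
}
```

In this sketch the answer is exact while `isExact()` is true; after the switch, memory no longer grows with the number of samples and the error is bounded by the bin width. The real MergingDigest uses centroids rather than fixed bins, so its accuracy characteristics differ.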
Field Summary
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | add(double x, long w) | Adds a sample to a histogram. |
| void | add(TDigestReadView other) | Add all of the centroids of another digest to this one. |
| int | byteSize() | Returns the number of bytes required to encode this TDigest using #asBytes(). |
| double | cdf(double x) | Returns the fraction of all points added which are ≤ x. |
| int | centroidCount() | Returns the current number of centroids. |
| Collection<Centroid> | centroids() | A Collection that lets you go through the centroids in ascending order by mean. |
| void | close() | |
| void | compress() | Re-examines a t-digest to determine whether some centroids are redundant. |
| double | compression() | Returns the current compression factor. |
| double | getMax() | Returns the maximum value seen. |
| double | getMin() | Returns the minimum value seen. |
| double | quantile(double q) | Returns an estimate of a cutoff such that a specified fraction of the data added to this TDigest would be less than or equal to the cutoff. |
| long | ramBytesUsed() | |
| void | reserve(long size) | Prepare internal structure for loading the requested number of samples. |
| long | size() | Returns the number of points that have been added to this TDigest. |

Methods inherited from class org.elasticsearch.tdigest.TDigest
add, createAvlTreeDigest, createHybridDigest, createMergingDigest, createSortingDigest, setScaleFunction
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
Method Details
ramBytesUsed
public long ramBytesUsed()
add
public void add(double x, long w)
Description copied from class: TDigest
Adds a sample to a histogram.
add
public void add(TDigestReadView other)
Description copied from class: TDigest
Add all of the centroids of another digest to this one.
Overrides:
add in class AbstractTDigest
Parameters:
other - The other digest
reserve
public void reserve(long size)
Description copied from class: TDigest
Prepare internal structure for loading the requested number of samples.
compress
public void compress()
Description copied from class: TDigest
Re-examines a t-digest to determine whether some centroids are redundant. If your data are perversely ordered, this may be a good idea. Even if not, this may save 20% or so in space. The cost is roughly the same as adding as many data points as there are centroids. This is typically < 10 * compression, but could be as high as 100 * compression. This is a destructive operation that is not thread-safe.
size
public long size()
Description copied from interface: TDigestReadView
Returns the number of points that have been added to this TDigest.
Returns:
The sum of the weights on all centroids.
cdf
public double cdf(double x)
Description copied from class: TDigest
Returns the fraction of all points added which are ≤ x. Points that are exactly equal get half credit (i.e. we use the mid-point rule).
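To make the mid-point rule concrete, the sketch below computes it exactly over a raw sample array: every sample strictly below x counts fully, and every sample equal to x counts as one half. The class `MidpointCdf` is hypothetical and is not part of the tdigest package; returning NaN for an empty input is an assumption made here, not documented behavior of cdf.

```java
// Hypothetical helper illustrating the mid-point rule on raw samples.
class MidpointCdf {
    static double cdf(double[] samples, double x) {
        if (samples.length == 0) {
            return Double.NaN; // assumption: no meaningful fraction for empty input
        }
        int below = 0;  // samples strictly less than x get full credit
        int equal = 0;  // samples exactly equal to x get half credit
        for (double s : samples) {
            if (s < x) {
                below++;
            } else if (s == x) {
                equal++;
            }
        }
        return (below + 0.5 * equal) / samples.length;
    }
}
```

For the samples 1, 2, 2, 3 and x = 2, one sample is below and two are equal, so the result is (1 + 0.5 * 2) / 4 = 0.5. A digest that has merged samples into centroids can only approximate this value.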
quantile
public double quantile(double q)
Description copied from class: TDigest
Returns an estimate of a cutoff such that a specified fraction of the data added to this TDigest would be less than or equal to the cutoff.
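The definition of the cutoff can be checked against an exact computation on a small array. The sketch below returns the smallest sample such that at least a fraction q of the samples are less than or equal to it. The class `ExactQuantile` is hypothetical; the choice to pick an existing sample without interpolating, the NaN result for empty input, and the argument check are assumptions of this sketch, so the values returned by quantile itself may differ.

```java
import java.util.Arrays;

// Hypothetical helper: exact quantile cutoff over raw samples, no interpolation.
class ExactQuantile {
    static double quantile(double[] samples, double q) {
        if (!(q >= 0.0 && q <= 1.0)) { // also rejects NaN
            throw new IllegalArgumentException("q must be in [0, 1]");
        }
        if (samples.length == 0) {
            return Double.NaN; // assumption: nothing to estimate from
        }
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        // Smallest rank r (1-based) with r / n >= q, clamped to a valid index.
        int rank = (int) Math.ceil(q * sorted.length);
        int idx = Math.max(0, Math.min(sorted.length - 1, rank - 1));
        return sorted[idx];
    }
}
```

With the samples 4, 1, 3, 2 and q = 0.5, the sorted array is 1, 2, 3, 4 and the cutoff is 2, because exactly half of the samples are less than or equal to 2.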
centroids
public Collection<Centroid> centroids()
Description copied from interface: TDigestReadView
A Collection that lets you go through the centroids in ascending order by mean. Centroids returned will not be re-used, but may or may not share storage with this TDigest.
Returns:
The centroids in the form of a Collection.
compression
public double compression()
Description copied from class: TDigest
Returns the current compression factor.
Specified by:
compression in class TDigest
Returns:
The compression factor originally used to set up the TDigest.
centroidCount
public int centroidCount()
Description copied from interface: TDigestReadView
Returns the current number of centroids.
getMin
public double getMin()
Description copied from interface: TDigestReadView
Returns the minimum value seen. Returns NaN if this digest is empty.
Specified by:
getMin in interface TDigestReadView
Overrides:
getMin in class TDigest
getMax
public double getMax()
Description copied from interface: TDigestReadView
Returns the maximum value seen. Returns NaN if this digest is empty.
Specified by:
getMax in interface TDigestReadView
Overrides:
getMax in class TDigest
byteSize
public int byteSize()
Description copied from class: TDigest
Returns the number of bytes required to encode this TDigest using #asBytes().
close
public void close()