qtdigest

Python implementation of Dunning's T-Digest, inspired by the Node.js tdigest library.

Install

pip install qtdigest

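from random import random

from qtdigest import Tdigest

t = Tdigest()
# Feed 1000 uniform random samples into the digest
for i in range(1000):
    t.push(random())
# Estimate the 90th percentile of the samples
P90 = t.percentile(0.9)
print('P90 = %s' % P90)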
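As a sanity check on the estimate above, it can help to compare it with an exact percentile computed by sorting the raw samples. The following is a minimal sketch that does not use qtdigest at all; the helper name `exact_percentile` and the linear-interpolation rule are assumptions chosen for illustration, not part of this library.

```python
import random


def exact_percentile(values, p):
    """Exact value at quantile p (0..1), interpolating linearly between sorted samples.

    Hypothetical helper for comparison only; not part of qtdigest.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in the range 0..1")
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)          # fractional index into the sorted list
    lo = int(pos)                         # index of the sample at or below pos
    hi = min(lo + 1, len(ordered) - 1)    # next sample, clamped to the last index
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


random.seed(42)  # fixed seed so the run is repeatable
samples = [random.random() for _ in range(1000)]
exact_p90 = exact_percentile(samples, 0.9)
print('exact P90 = %s' % exact_p90)
```

For uniform samples in 0..1 the exact P90 should land close to 0.9, and a digest fed the same samples would be expected to report a similar value.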

delta: the compression factor, i.e. the maximum fraction of the total mass that a single centroid may own (larger values, up to 1.0, mean more compression).
K: a size threshold that triggers recompression as the Tdigest grows during ingest.
CX: controls how often the cached cumulative totals used for quantile estimation are updated during ingest.
return: a Tdigest instance
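To give a feel for what delta controls, here is a small sketch of the centroid size bound from Dunning's original t-digest formulation, where a centroid sitting at quantile q may hold at most 4 * N * delta * q * (1 - q) of the total weight N. Whether qtdigest applies exactly this bound is an assumption; the function name `max_centroid_weight` is hypothetical and used only for illustration.

```python
def max_centroid_weight(total_weight, delta, q):
    """Upper bound on one centroid's weight at quantile q (0..1).

    Sketch of the bound 4 * N * delta * q * (1 - q) from the original
    t-digest formulation; assumed here, not taken from qtdigest itself.
    """
    return 4.0 * total_weight * delta * q * (1.0 - q)


# With 1000 points and delta = 0.01, a centroid at the median may hold
# up to delta * N points, while centroids near the tails stay much smaller.
for q in (0.5, 0.9, 0.99):
    print('q = %s -> max weight = %s' % (q, max_centroid_weight(1000, 0.01, q)))
```

The bound peaks at q = 0.5, where it equals delta * N, which matches the description of delta as the maximum fraction of mass one centroid can own. It shrinks toward the tails, which is why extreme percentiles stay accurate.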

push(x, n): add a data point with value x and weight n
size(): return the number of centroids
toList(): return a list of all centroid data
percentile(p): return the estimated value at quantile p, where p is in the range 0..1
serialize(): serialize the Tdigest instance to a string, e.g. 0.01~25~2~0.00064~0.0013~2~20
simpleSerialize(): serialize the Tdigest instance to a simpler string, e.g. 0.00064~2~0.0013~20
deserialize(serialized_str): rebuild a Tdigest instance from a serialized string. It is a classmethod, so it can be called as Tdigest.deserialize(serialized_str)
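The simpleSerialize() example above looks like a sequence of `~`-separated fields alternating between a centroid value and its weight. The sketch below splits such a string into pairs on that assumption. The layout is inferred from the single example only and is not a documented format; the function name `parse_simple_serialized` is hypothetical and does not exist in qtdigest.

```python
def parse_simple_serialized(serialized_str):
    """Split a '~'-separated string into (value, weight) pairs.

    Assumes the fields alternate value, weight, value, weight, ...
    as the example '0.00064~2~0.0013~20' suggests. This is a guess
    at the format, not qtdigest's own deserializer.
    """
    fields = serialized_str.split('~')
    if len(fields) % 2 != 0:
        raise ValueError('expected an even number of fields, got %d' % len(fields))
    return [(float(fields[i]), float(fields[i + 1]))
            for i in range(0, len(fields), 2)]


pairs = parse_simple_serialized('0.00064~2~0.0013~20')
for value, weight in pairs:
    print('value = %s, weight = %s' % (value, weight))
```

To turn a serialized string back into a working digest, use Tdigest.deserialize(serialized_str) rather than a hand-written parser like this one.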

platform: MacBook Pro (2.6 GHz Intel Core i5)

