bullet-db / bullet-storm

The Apache Storm implementation of the Bullet backend

Home Page: https://bullet-db.github.io

Add GROUP aggregation

akshaisarma opened this issue

This will let us group by a set of fields and then compute metrics such as SUM, COUNT, MIN, MAX, and AVG per group. Grouping without any metrics just returns the unique values for a set of fields (DISTINCT).
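To make the intended semantics concrete, here is a minimal sketch in plain Python. It is only an illustration, not Bullet's implementation: the function name `group_aggregate`, the record layout (plain dicts), and the single `metric_field` argument are all assumptions made for the example.

```python
def group_aggregate(records, group_fields, metric_field=None):
    """Group records by group_fields and compute per-group metrics.

    Hypothetical helper for illustration only. When metric_field is None,
    only the unique group keys are returned, which is the DISTINCT case.
    """
    groups = {}
    for record in records:
        # The group key is the tuple of values for the chosen fields.
        key = tuple(record[f] for f in group_fields)
        stats = groups.setdefault(
            key, {"COUNT": 0, "SUM": 0, "MIN": None, "MAX": None}
        )
        stats["COUNT"] += 1
        if metric_field is not None:
            value = record[metric_field]
            stats["SUM"] += value
            stats["MIN"] = value if stats["MIN"] is None else min(stats["MIN"], value)
            stats["MAX"] = value if stats["MAX"] is None else max(stats["MAX"], value)

    rows = []
    for key, stats in groups.items():
        row = dict(zip(group_fields, key))
        if metric_field is not None:
            row.update(stats)
            row["AVG"] = stats["SUM"] / stats["COUNT"]
        rows.append(row)
    return rows
```

Calling it with a metric field returns one row per group with all five metrics. Calling it with only `group_fields` returns one row per unique combination of field values.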

The idea is to use Tuple Sketches to sample on the unique fields. If there are too many groups, the Sketch takes care of randomly evicting some of them.
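The eviction behaviour can be illustrated with a simplified stand-in. The sketch below is not the DataSketches Tuple Sketch API. It is a toy "keep the k smallest hashes" sampler written for this example, and `BoundedGroupSketch`, `key_hash`, and `max_groups` are hypothetical names. Each group key is hashed to a pseudo-uniform value, and only keys hashing below a threshold `theta` keep a summary. When the number of retained groups exceeds the limit, the group with the largest hash is dropped and `theta` is lowered to that hash, so which groups survive is effectively random with respect to the data.

```python
import hashlib


def key_hash(key):
    """Map a group key to a pseudo-uniform float in [0, 1)."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    # 48 bits fit exactly in a float, so the result is always below 1.0.
    return int.from_bytes(digest[:6], "big") / 2**48


class BoundedGroupSketch:
    """Toy bounded group sampler (illustrative, not the real Tuple Sketch)."""

    def __init__(self, max_groups):
        self.max_groups = max_groups
        self.theta = 1.0      # only keys hashing below theta are retained
        self.summaries = {}   # key -> {"hash", "COUNT", "SUM"}

    def update(self, key, value):
        h = key_hash(key)
        if h >= self.theta:
            return  # this group was evicted earlier or never admitted
        summary = self.summaries.setdefault(
            key, {"hash": h, "COUNT": 0, "SUM": 0}
        )
        summary["COUNT"] += 1
        summary["SUM"] += value
        if len(self.summaries) > self.max_groups:
            # Evict the retained group with the largest hash and lower theta
            # to that hash, so the evicted group can never be re-admitted.
            evicted = max(self.summaries, key=lambda k: self.summaries[k]["hash"])
            self.theta = self.summaries[evicted]["hash"]
            del self.summaries[evicted]
```

Because `theta` only ever decreases, a group that survives to the end was admitted on its first record and never dropped, so its COUNT and SUM cover every record seen for that group.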

An open question is the relationship with the max aggregation size config parameter: should that limit the number of groups produced, or should this be more relaxed?