iwater / statistics

Statistics tools

use for top N (count each distinct line, most frequent first)
  |group_sum == |sort|uniq -c|sort -nr

performance

  sample data: 1,583,232 rows with 621 unique values (group_sum is about 20x faster)

  time ./bin/group_sum < sample.log > /dev/null

real    0m0.766s
user    0m0.758s
sys     0m0.008s

  time cat sample.log|sort|uniq -c|sort -nr > /dev/null

real    0m15.865s
user    0m16.714s
sys     0m0.120s

use to merge two files that share a key in the 1st column
  cat one.log|fill_pipe -f two.log

use for unique lines
  |uniq

Languages

C++ 98.2%, Makefile 1.8%