space-saving Made by Byron Knoll in 2013 http://code.google.com/p/space-saving/ This is a C++ implementation of the "space-saving" algorithm described in: A. Metwally, D. Agrawal, and A. El Abbadi. Efficient Computation of Frequent and Top-k Elements in Data Streams. In Proceedings of the 10th ICDT International Conference on Database Theory, pages 398–412, 2005. This project is released in the public domain - you can use the source code however you want. The example program (runner.cpp) finds the most frequently occurring substrings of length N in a file. To compile: make Run without parameters to get help: ./space-saving Example execution: ./space-saving file.txt 10 100000 100