flightaware / speedtables

Speed tables is a high-performance memory-resident database. The speed table compiler reads a table definition and generates a set of C access routines to create, manipulate and search tables containing millions of rows. Currently oriented towards Tcl.

Home Page: https://flightaware.github.io/speedtables/

add support for deduplication on varstring column types

bovine opened this issue:

Boost provides a "flyweight" template that would allow easy implementation of deduplication of string values, which could significantly reduce memory use when values tend to be repeated a lot. (Basically, it lets your tables be denormalized without the full storage overhead.)

http://www.boost.org/doc/libs/1_51_0/libs/flyweight/doc/tutorial/basics.html

The column could be defined as something like:
varstring filename notnull 1 default "" dedupe 1
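To illustrate the idea behind the proposal, here is a minimal sketch of string deduplication using only the C++ standard library. It is not Boost.Flyweight and not speedtables code: `StringPool`, `intern`, and `demo_distinct_filenames` are hypothetical names made up for this example. Each distinct string is stored once, and a column holds pointers to the shared copy instead of its own copy.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper for illustration only; not part of speedtables or Boost.
class StringPool {
public:
    // Returns a pointer to the single stored copy of `value`.
    // Pointers to std::unordered_set elements stay valid across rehashing,
    // so callers may keep them for as long as the pool is alive.
    const std::string* intern(const std::string& value) {
        return &*pool_.insert(value).first;
    }

    // Number of distinct strings currently stored.
    std::size_t distinct() const { return pool_.size(); }

private:
    std::unordered_set<std::string> pool_;
};

// Three rows in a "filename" column, two of which hold the same value.
inline std::size_t demo_distinct_filenames() {
    StringPool pool;
    std::vector<const std::string*> column;
    for (const char* name : {"a.log", "b.log", "a.log"}) {
        column.push_back(pool.intern(name));
    }
    assert(column[0] == column[2]);  // the repeated value shares one copy
    return pool.distinct();
}
```

This sketch never frees pooled strings, so a real implementation would also need some way to release values that no row references any more.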

Speedtables is not in C++, so this would be kind of tricky.

My new "cpp" branch of speedtables is C++ and uses boost :)

That sounds... exciting? Adventurous? :)

Please please please keep C as an option if you move to C++. Deployment is much simpler, at least for binary distributions on Windows platforms.

As an aside, when working from Tcl, deduplication should be easy enough for the application to do itself by defining the field as tclobj.