Compressed bitset in C++ Daniel Lemire == What is this? == The class EWAHBoolArray is a compressed bitset data structure. == Licensing == Apache License 2.0. (Other licenses are possible.) == Limitations == Because of the compression type being used, you must set the bits in increasing order (no random access). == STL and copy constructors == I expect most people to construct rather large bitmaps. For this reason, you should avoid copies. Thus, the code will warn you against doing this: EWAHBoolArray<uword> bitset1; bitset1.set(1); bitset1.set(2); bitset1.set(1000); bitset1.set(1001); vector< EWAHBoolArray<uword> > testVec; testVec.push_back(bitset1); Instead, do this: vector< EWAHBoolArray<uword> > testVec(1); testVec[0].set(1); testVec[0].set(2); testVec[0].set(1000); testVec[0].set(1001); Or you can use the "swap" method. EWAHBoolArray<uword> bitset1; bitset1.set(1); bitset1.set(2); bitset1.set(1000); bitset1.set(1001); vector< EWAHBoolArray<uword> > testVec(1); testVec.swap(bitset1); == Dependencies == None. (Will work under MacOS or Linux easily.) == Usage == make ./unit make example ./example == Example == Please see example.cpp == Ruby wrapper == Josh Ferguson wrote a wrapper for Ruby. The implementation is packaged and installable as a ruby gem. You can install it by typing: gem install ewah-bitset == Further reading == Please see Daniel Lemire, Owen Kaser, Kamel Aouiche, Sorting improves word-aligned bitmap indexes. Data & Knowledge Engineering 69 (1), pages 3-28, 2010. http://arxiv.org/abs/0901.3751 == Warning == Please don't trust this software. Run your own unit tests. Report bugs.