esheldon / esutil

A variety of python utilities focusing on numerical, scientific, and astrophysical computing

Going beyond the current limitation on HTM

GoogleCodeExporter opened this issue · comments

What steps will reproduce the problem?
1. The depth limit of 13 exists purely because 32-bit integers are used for the HTM IDs.
2. Modifying the underlying C code to use a 64-bit integer type will solve 
the problem.
3. The rest of the package may not require any specific change.

What is the expected output? What do you see instead?
At depth 14 or beyond, one starts getting negative HTM IDs. This is due to 
overflow of the 32-bit integer used to compute them.
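
To illustrate the overflow, here is a minimal Python sketch. It assumes the usual HTM numbering, in which the eight depth-0 trixels have IDs 8 to 15 and each further level appends two bits. The helper names `htm_id_range`, `as_signed`, and `max_depth` are hypothetical and are not part of esutil.

```python
def htm_id_range(depth):
    """Smallest and largest trixel ID at a given depth.

    Assumption: depth-0 IDs are 8..15 and each subdivision multiplies
    the ID by 4 and adds a child index 0..3.
    """
    lo = 8 * 4**depth
    hi = 16 * 4**depth - 1
    return lo, hi


def as_signed(value, bits):
    """Reinterpret a non-negative integer as a two's-complement signed
    integer of the given width, i.e. what a signed C integer of that
    width would hold after the value wraps around."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def max_depth(bits):
    """Deepest level whose IDs still fit in a signed integer of this width.

    An ID at depth d needs 4 + 2*d bits, and a signed type has
    bits - 1 value bits, so d <= (bits - 5) // 2.
    """
    return (bits - 5) // 2


for depth in (13, 14):
    lo, hi = htm_id_range(depth)
    print(depth, as_signed(lo, 32), as_signed(hi, 32), as_signed(lo, 64))
```

Under these assumptions every depth-14 ID has its top (32nd) bit set, so all of them come out negative in a signed 32-bit integer, while a signed 64-bit integer leaves room up to depth 29.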

What version of the product are you using? On what operating system?
0.5.X, Ubuntu 12.04 LTS, Python 2.7.3

Please provide any additional information below.
None.


Original issue reported on code.google.com by Kaustubh...@gmail.com on 24 Sep 2014 at 5:58

Thanks very much for suggesting this improvement.

I did an initial version with 64-bit integers and it works fine.  This is now in trunk; 
please give it a try.

HOWEVER:  The problem is memory usage in match() due to using the "reverse 
indices" approach for tree lookup. At depth > 13 the sparse array is just too 
large.

I plan to move away from reverse indices to use a simple std::map which should 
solve the memory problem.

If all you need is to look up IDs, I think current trunk will work fine for you, 
but if you need match() you should wait for the new version.

Original comment by erin.sheldon@gmail.com on 24 Sep 2014 at 6:14

I have moved to using a red/black tree internally which should cut the memory 
usage dramatically.

The match code should be efficient now at all depths.

Original comment by erin.sheldon@gmail.com on 28 Sep 2014 at 3:58

Original comment by erin.sheldon@gmail.com on 28 Sep 2014 at 3:59

  • Changed state: Fixed