opencog / link-grammar

The CMU Link Grammar natural language parser

pool management idea...

linas opened this issue · comments

Following the discussion in #1403, this figure shows pool_size(sent->Table_tracon_pool) in count.c:

[Figure: djc-dj-cnt-scaled]

Each horizontal line is a re-alloc of 16K elements. Counting the lines, I see 12 of them: 12 allocs cover almost all cases.

The idea is to keep the pool size at about 12 chunks. This can be done in pool_reuse():

  • If there are more than 24 blocks, free all blocks and double the block size.
  • If there are fewer than 4 blocks, free all blocks and halve the block size.

This risks bouncing between two values: if the text alternates between long and short sentences, the block size will double and halve over and over. This can be solved by keeping an exponentially decaying average of the block count. The formula is like this (must be done with floats):

   const float decay_const = 0.9f;   /* weight given to history */
   avg_block_count = decay_const * avg_block_count
                     + (1.0f - decay_const) * curr_block_count;

The doubling/halving rule above is then applied to the averaged block count, not to the block count of the latest sentence.
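
To make the proposal concrete, here is a minimal sketch of the policy as a standalone function. It is not code from the library: pool_policy, pool_policy_update() and the macro names are hypothetical, and the actual freeing of blocks in pool_reuse() is left to the caller. One extra assumption that is not part of the proposal: after a resize the average is rescaled, on the guess that the same data needs half as many blocks once the block size doubles.

```c
#include <stddef.h>

/* Hypothetical names; the thresholds and decay constant are the ones
 * proposed in this issue. */
#define MAX_BLOCKS  24.0f
#define MIN_BLOCKS  4.0f
#define DECAY_CONST 0.9f

typedef struct {
    size_t block_size;      /* elements per block */
    float avg_block_count;  /* exponentially decaying average */
} pool_policy;

/* Fold the block count of the sentence just parsed into the average,
 * then apply the doubling/halving rule to that average.
 * Returns +1 if the block size was doubled, -1 if it was halved,
 * 0 if it is unchanged. On a non-zero return the caller is expected
 * to free all blocks. */
static int pool_policy_update(pool_policy *p, size_t curr_block_count)
{
    p->avg_block_count = DECAY_CONST * p->avg_block_count
                       + (1.0f - DECAY_CONST) * (float)curr_block_count;

    if (p->avg_block_count > MAX_BLOCKS) {
        p->block_size *= 2;
        /* Assumption: same data now fits in half as many blocks. */
        p->avg_block_count /= 2.0f;
        return 1;
    }
    if (p->avg_block_count < MIN_BLOCKS && p->block_size > 1) {
        p->block_size /= 2;
        /* Assumption: same data now needs twice as many blocks. */
        p->avg_block_count *= 2.0f;
        return -1;
    }
    return 0;
}
```

Starting from an average of 12 (an assumed initial value) and feeding in block counts that alternate between 30 and 2, the average settles at roughly 15 to 17 and the block size never changes, whereas applying the thresholds to the raw counts would resize on every sentence. A sustained run of sentences needing 40 blocks still pushes the average past 24 after a few sentences and triggers a doubling.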

TBD: review the impact on the Parse_choice pool, where things are the most badly behaved.

Also TBD: