opencog / link-grammar

The CMU Link Grammar natural language parser

ctxt.ct_size=0 in prepare/exprune.c

linas opened this issue

@ampli A question/bug:

I'm writing new code (branch atomspace) and I (sometimes?) see that:

--- a/link-grammar/prepare/exprune.c
+++ b/link-grammar/prepare/exprune.c
@@ -418,6 +418,7 @@ void expression_prune(Sentence sent, Parse_Options opts)
 
        ctxt.opts = opts;
        ctxt.ct_size = sent->dict->contable.num_uc;
+printf("duuude ctxt.ct_size=%lu\n", ctxt.ct_size);
        ctxt.ct = malloc(ctxt.ct_size * sizeof(*ctxt.ct));
        zero_connector_table(&ctxt);
        ctxt.end_current_block->next = NULL;

the size is zero. Why would that be? I cannot reproduce this in the english dict.

num_uc is the number of different uppercase parts of connectors.
It is 0 if the dictionary doesn't contain connectors.
For example a dictionary with only this line:

test: ();

If you now parse the sentence "test" you get the problem.
Interestingly, a memory leak happens then... Should it be fixed?
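To make the definition of num_uc concrete, here is a minimal sketch of counting distinct uppercase parts over a list of connector names. It is not the library's code: `uc_part()` and `count_distinct_uc()` are hypothetical helpers, and the parsing rule (an optional leading `h` or `d`, then a run of uppercase ASCII letters, with everything after that ignored) is a simplifying assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of link-grammar): locate the uppercase
 * part of a connector name. Assumes an optional leading 'h' or 'd'
 * marker followed by a run of uppercase ASCII letters. Returns the
 * length of that run and stores its start in *start. */
static size_t uc_part(const char *name, const char **start)
{
    if (*name == 'h' || *name == 'd') name++;
    *start = name;
    size_t len = 0;
    while (isupper((unsigned char)name[len])) len++;
    return len;
}

/* Hypothetical helper: count the distinct uppercase parts among n
 * connector names. Quadratic, which is fine for an illustration. */
size_t count_distinct_uc(const char *const *names, size_t n)
{
    assert(names != NULL || n == 0);
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++)
    {
        const char *si;
        size_t li = uc_part(names[i], &si);
        if (li == 0) continue; /* no uppercase part: not counted */

        /* Was the same uppercase part already seen at a lower index? */
        bool seen = false;
        for (size_t j = 0; j < i && !seen; j++)
        {
            const char *sj;
            size_t lj = uc_part(names[j], &sj);
            seen = (li == lj) && (0 == strncmp(si, sj, li));
        }
        if (!seen) distinct++;
    }
    return distinct;
}
```

With this sketch, a dictionary like the one-line example above contributes no connector names at all, so the count is 0, which is the situation that leads to ctxt.ct_size=0.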

I guess you use a DB dict.
This may mean that all the words of the sentence you try to parse have null expressions in the dict.

Note the following at the start of sort_condesc_by_uc_constring():

        if ((0 == dict->contable.num_con) && !IS_DB_DICT(dict))
        {
                prt_error("Error: Dictionary %s: No connectors found.\n", dict->name);
                return false;
        }

It seems I thought this could be normal when using the DB dict.

BTW, in a file dict, the memory leak in the no-connectors case is in get_file_contents() (I still don't know the reason). I guess it should be fixed.

I'm not using the db dict; I'm creating a new dict backend. During a call to dict->lookup_list, I create a Dict_node with a bunch of expressions on it, and return that. Some sentences even parse. The only thing that's unusual is that there's no UNKNOWN_WORD and no left or right walls.

Seems that sort_condesc_by_uc_constring() is never called ... because condesc_setup() is never called.

In your case, the cause of ctxt.ct_size=0 is the same: No connectors at all at that point.
Try to check if this happens when all the sentence words have null expressions.

? All words always have expressions. I am not calling condesc_setup() .. do I need to? The collection of connectors is not known when the dictionary is opened; as new words are looked up, new connectors and link types may appear.

> I am not calling condesc_setup()

You should call it. Among other things, it sets num_uc, which is used in expression_prune() and indirectly in power_prune().
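As an illustration of why a zero num_uc is worth catching early: in C, `malloc(0)` may return either NULL or a unique pointer that must not be dereferenced, so a table sized from an unset count cannot be used safely. The sketch below shows one defensive pattern. The `table_ctx` type and `table_alloc()` are hypothetical stand-ins for this example, not the actual exprune.c structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the pruning context; the real structure
 * in exprune.c has more fields and different types. */
typedef struct
{
    size_t ct_size; /* number of table slots (num_uc in the thread) */
    void **ct;      /* the connector table itself */
} table_ctx;

/* Allocate and clear the table. Returns false when num_uc is 0
 * (connector setup has probably not run) or when malloc fails. */
bool table_alloc(table_ctx *ctx, size_t num_uc)
{
    assert(ctx != NULL);
    ctx->ct_size = num_uc;
    ctx->ct = NULL;

    /* Refuse a zero-size table instead of relying on malloc(0). */
    if (num_uc == 0) return false;

    ctx->ct = malloc(num_uc * sizeof(*ctx->ct));
    if (ctx->ct == NULL) return false;

    /* Clear every slot explicitly before use. */
    for (size_t i = 0; i < num_uc; i++) ctx->ct[i] = NULL;
    return true;
}
```

A caller would check the return value and report that no connectors are known, rather than continuing into pruning with an empty table.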

Ahh .. I see ... IS_DB_DICT needs to be updated. OK, that will probably fix the problem.

Yep, that fixed it. Thanks!