Looking for advice on unsuccessful compilation due to out of memory error
GoogleCodeExporter opened this issue
Google Code Exporter commented
Hi there, I've been implementing a morphological analyzer for a complicated
language, Mapudungun. I'm dealing with verb forms whose basic shape is
"root/stem + suffixes"; stems can be compositions, reduplications or "basic verb
forms".
There are about 100 suffixes filling 36 slots. Not all the slots are filled, of
course, and the reasons are: prohibition: if suffix A appears, B is not present;
obligation: if A appears, B appears too; dependence: B needs A in order to appear;
sequence: if A appears, the sequence BC must appear as well...
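One way to read these four constraint types is as predicates over the list of suffixes that follow the stem. The Python sketch below is only an illustration of that reading, not part of the foma grammar: the suffix labels "A" to "D" and all function names are hypothetical, and "sequence" is assumed to mean that B and C must be adjacent.

```python
# Hypothetical suffix labels ("A" .. "D"); not taken from the attached file.
# A verb form is modelled as the list of suffixes that follow the stem.

def prohibition(a, b):
    """If a appears, b must not be present."""
    return lambda suffixes: not (a in suffixes and b in suffixes)

def obligation(a, b):
    """If a appears, b must appear too."""
    return lambda suffixes: a not in suffixes or b in suffixes

def dependence(b, a):
    """b needs a: b may only appear when a is present."""
    return lambda suffixes: b not in suffixes or a in suffixes

def sequence(a, b, c):
    """If a appears, b immediately followed by c must appear as well."""
    def check(suffixes):
        if a not in suffixes:
            return True
        return any(x == b and y == c for x, y in zip(suffixes, suffixes[1:]))
    return check

def allowed(suffixes, rules):
    """A candidate analysis survives only if every rule accepts it."""
    return all(rule(suffixes) for rule in rules)

rules = [prohibition("A", "B"), sequence("A", "C", "D")]
candidates = [["A", "C", "D"], ["A", "B"], ["B"], ["A", "D", "C"]]
surviving = [c for c in candidates if allowed(c, rules)]
print(surviving)  # [['A', 'C', 'D'], ['B']]
```

Under this reading, obligation and dependence have the same logical shape with the roles of the two suffixes swapped.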
At the moment I'm dealing with prohibition only, which is the most extensive
part, and even though I've tried many ways I cannot make the file compile with the
prohibition rules. Without them the file compiles and works, but it gives a lot
of possible analyses, many of which are wrong because the restriction rules are
missing...
So, my question is whether you guys, gurus of FST, could have a look at my file (no
need for a deep look) and give me some advice on what I should change or what
I could try. That would be great and I would be thankful forever...
I attach my file, thanks again.
Original issue reported on code.google.com by andreschandiaf
on 11 Mar 2015 at 11:21
Attachments:
Google Code Exporter commented
Hi, I have split up all the prohibition rules and found many rules that can
be dropped or merged into one. This left me with 193 rules, and I
still cannot compile with all of them: I can apply 117, but when I
enable rule 118 the compilation crashes for lack of memory. I attach the
new version of the file; maybe it gives a clue that I cannot find...
Thanks again.
Original comment by andreschandiaf
on 13 Mar 2015 at 8:27
Attachments:
Google Code Exporter commented
This seems to be an illustration of a common problem with multiple rules or
constraints. In general, there is a danger of exponential growth in the
size of a combined set of rules or constraints, since each composition or
intersection with an independent rule or constraint may cause the resulting
automaton/transducer to grow by a constant factor.
The way to avoid this is to include a lexicon *before* the first
composition/intersection with the constraints/rules.
For example, in a traditional rewrite-rule grammar, it is always recommended
that one use the design:
define Grammar Lexicon .o. Rule1 .o. Rule2 .o. ... .o. RuleN;
instead of:
define Rules Rule1 .o. Rule2 .o. ... .o. RuleN ;
define Grammar Lexicon .o. Rules;
since the second option may produce an intermediate result that is very large
(Rules), while the final result (Grammar) could still be very small.
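The size difference between the two designs can be reproduced on a toy scale. The following Python sketch is not foma and not the grammar from this issue: it models ten hypothetical, independent prohibition constraints ("after x_i, y_i is not allowed") as two-state automata, builds their product with and without a small made-up word list in front, and counts the reachable live states. All names (`forbid`, `lexicon`, the `x0`/`y0` symbols) are assumptions made for the illustration.

```python
def forbid(x, y):
    """Step function of a 2-state DFA: once x has been seen, y is rejected.
    States: 0 = x not seen yet, 1 = x seen. None means the input is rejected."""
    def step(state, sym):
        if state == 0:
            return 1 if sym == x else 0
        return None if sym == y else 1
    return step

def intersect(steps):
    """Product construction: run all automata in parallel on the same symbol."""
    def step(state, sym):
        nxt = tuple(f(q, sym) for f, q in zip(steps, state))
        return None if None in nxt else nxt
    return step

def lexicon(words):
    """Trie acceptor for a finite word list; a state is the prefix read so far."""
    prefixes = {w[:i] for w in words for i in range(len(w) + 1)}
    def step(state, sym):
        nxt = state + (sym,)
        return nxt if nxt in prefixes else None
    return step

def count_reachable(start, step, alphabet):
    """Count the live states reachable from start."""
    seen, stack = {start}, [start]
    while stack:
        state = stack.pop()
        for sym in alphabet:
            nxt = step(state, sym)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen)

K = 10  # number of independent toy constraints
rules = [forbid(f"x{i}", f"y{i}") for i in range(K)]
alphabet = [f"{c}{i}" for c in "xy" for i in range(K)]
words = [("x0", "y1"), ("x0", "y0"), ("x1", "x2", "y3")]

# Rules combined on their own: every subset of "x_i seen" flags is reachable.
rules_only = count_reachable((0,) * K, intersect(rules), alphabet)

# Lexicon first: each trie state fixes the state of every rule.
both = intersect([lexicon(words)] + rules)
lexicon_first = count_reachable(((),) + (0,) * K, both, alphabet)

print(rules_only, lexicon_first)  # 1024 6
```

The rules alone need 2^10 states because each rule's "x seen" flag is independent of the others, while the lexicon-first product can never have more states than the word-list trie has prefixes. The counts are reachable states of an unminimized product, so they only indicate the trend, not the sizes foma would report.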
For the current grammar, my recommendation is that you focus on introducing
VERBFORM before the composition of the constraints. How this can be done
depends on how you designed the grammar. If VERBFORM is an automaton/acceptor
(which it looks like it is), you can just create that first, and do:
define PrRu VERBFORM .o. PrRu001 .o. PrRu002 .o. ...
since the order of composition doesn't matter in that case.
If, on the other hand, VERBFORM is a transducer, you may have to invert the
constraint process and do it maybe like this:
define PrRu [VERBFORM.i .o. PrRu193 .o. PrRu192 ... ].i;
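Why the inversion helps can be sketched with finite relations. The Python below is an illustration only, under the assumption that the constraints are written over the upper (analysis) side of VERBFORM; the three toy pairs and the helper names are hypothetical. Composing a constraint on the right of a transducer filters its lower side, so inverting first lets the constraint see the upper side, and the outer inversion restores the original direction.

```python
def invert(relation):
    """Swap the upper and lower sides, in the spirit of .i in foma."""
    return {(lower, upper) for (upper, lower) in relation}

def restrict_lower(relation, accepts):
    """Model of `relation .o. Constraint` where Constraint is an acceptor:
    only pairs whose lower side is accepted survive."""
    return {(upper, lower) for (upper, lower) in relation if accepts(lower)}

def no_a_with_b(string):
    """Toy prohibition: the tags +A and +B must not co-occur."""
    return not ("+A" in string and "+B" in string)

# Hypothetical (analysis, surface) pairs; tags live on the upper side only.
verbform = {
    ("root+A+B", "rootab"),
    ("root+A", "roota"),
    ("root+B", "rootb"),
}

# Direct composition checks the surface strings, which carry no tags,
# so nothing is filtered out.
direct = restrict_lower(verbform, no_a_with_b)

# Inverting first applies the constraint to the analyses instead.
inverted = invert(restrict_lower(invert(verbform), no_a_with_b))

print(sorted(direct))
print(sorted(inverted))
```

In this toy, `direct` still contains all three pairs, while `inverted` keeps only the pairs for `root+A` and `root+B`.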
The main point is that you should never do a large composition/intersection of
rules/constraints freely, but tie it to a lexicon first by introducing the
lexicon as the first element in a chain of compositions.
See also:
https://code.google.com/p/foma/wiki/FAQ#It_takes_forever_to_compose_all_the_rewrite_rules_together_in_my
Original comment by mans.hul...@gmail.com
on 14 Mar 2015 at 7:18
Google Code Exporter commented
Thanks, after reordering some stuff and applying this strategy:
define PrRu [VERBFORM.i .o. PrRu193 .o. PrRu192 ... ].i;
the thing worked, but the reduplication rules do not apply any more. I guess I
have to find the appropriate place for them, but I haven't yet...
Well, thanks again, and here is the resulting file in case of further
advice ;)
Original comment by andreschandiaf
on 17 Mar 2015 at 3:17
Attachments:
Google Code Exporter commented
Well, here I am again. As I told you in the previous message, the reduplication
rules do not apply any more. I have tried many different things but I cannot
bring them back to life; surely I'm doing something wrong. Could you please take
another look at my file and advise me on what I should try? Sorry, but this is
driving me a little nuts. I attach the file again because it has a lot of
changes...
Thanks and sorry again.
Original comment by andreschandiaf
on 19 Mar 2015 at 7:15
Attachments: