Several problems with `foma2js.perl` / `foma2js.py`
dhdaines opened this issue
The JavaScript generated by these scripts, while it works, is not really correct (this is 90% of all JavaScript code in the world, so don't feel bad). It seems that the intention here is to create separate `Array`s for transitions, alphabet, and finals, or perhaps put them all in the same `Array`?
var myNet = new Object;
myNet.t = Array;
myNet.f = Array;
myNet.s = Array;
Regardless, this doesn't do either of those things, because you didn't add the magical `new` operator. Without it, `myNet.t`, `myNet.f`, and `myNet.s` all refer to the global builtin `Array` object itself, so every later assignment just sets properties on it. This is likely to cause random problems for any other JavaScript code that is loaded with the FST. Also, it makes it impossible to serialize `myNet` to JSON. You shouldn't be using an `Array` for these in the first place, because `JSON.stringify` doesn't enumerate string keys on an array, even if JavaScript, in its infinite wisdom, lets you use them. Instead, I suggest doing this (PR coming soon):
var myNet = new Object;
myNet.t = new Object;
myNet.f = new Object;
myNet.s = new Object;
(yes, you could make them all the same `Object`, since the key names are unique, but I don't see a good reason to do this)
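To make the difference concrete, here is a minimal sketch of both variants. The keys and values (`"0|a"`, `"a"`, and so on) are made up for illustration and are not taken from the real generator output; only the `t` and `s` property names come from the snippets above.

```javascript
// Hypothetical keys and values, for illustration only.
var broken = new Object;
broken.t = Array;   // no `new`: this is the global Array constructor itself
broken.s = Array;   // ...and so is this

broken.t["0|a"] = [{ 1: "b" }];
broken.s["a"] = 1;

console.log(broken.t === broken.s);      // true: both are the same object
console.log("0|a" in Array);             // true: the global Array was modified
console.log(JSON.stringify(broken));     // {} (function values are skipped)

// Even a real array loses its string keys when serialized.
var arr = new Array;
arr["a"] = 1;
console.log(JSON.stringify(arr));        // []

// Plain objects keep them.
var fixed = new Object;
fixed.t = new Object;
fixed.s = new Object;
fixed.t["0|a"] = [{ 1: "b" }];
fixed.s["a"] = 1;
console.log(JSON.stringify(fixed));      // {"t":{"0|a":[{"1":"b"}]},"s":{"a":1}}

// Clean up the properties the broken version left on the global Array.
delete Array["0|a"];
delete Array["a"];
```

The cleanup at the end is only there so the demo doesn't leave junk behind; the generated code has no such step.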
Also, `foma2js.py` misses some symbols in the alphabet; I'll just fix this in the PR to come.
Note that `foma_apply_down.js` actually only needs to know the input symbols in the alphabet, so if you want to save some space, you can omit the output symbols from the `s` array.
Note also that `maxlen` is wrong when there are surrogate pairs. I've fixed this in the `pyfoma` implementation :) (and in #156 too)
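A rough sketch of why surrogate pairs matter here, assuming the mismatch is between a generator that counts symbol length in code points and JavaScript string operations that count UTF-16 code units. The alphabet below is hypothetical, and this is an illustration of the general problem rather than the actual fix.

```javascript
// Hypothetical alphabet; U+1F600 lies outside the BMP, so it takes
// two UTF-16 code units (a surrogate pair) in a JavaScript string.
const symbols = ["a", "ch", "\u{1F600}\u{1F600}"];

// Length in code points (string iteration is by code point).
const codePoints = (s) => Array.from(s).length;
// Length in UTF-16 code units, which is what .length and .slice() use.
const codeUnits = (s) => s.length;

const maxlenPoints = Math.max(...symbols.map(codePoints));
const maxlenUnits = Math.max(...symbols.map(codeUnits));

console.log(maxlenPoints); // 2
console.log(maxlenUnits);  // 4
```

Under that assumption, a matcher that only tries substrings of up to 2 code units would never see the 4-unit symbol.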
(note, actually, the optimal solution is not to output JavaScript at all, but just to output JSON that you assign to a JavaScript object, like the `pyfoma` implementation does)
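As a sketch of what that could look like: the generated file would be a single assignment whose right-hand side is plain JSON. The variable name `net` and all keys and values inside `s`, `f`, and `t` are invented for illustration; the real layout is whatever the generator and `pyfoma` actually emit.

```javascript
// What a generated file might look like: one assignment, with plain
// JSON on the right-hand side (contents here are made up).
var net = {
  "maxlen": 1,
  "s": { "a": 1 },
  "f": { "1": 1 },
  "t": { "0|a": [{ "1": "b" }] }
};

// Because it is plain data, it round-trips through JSON unchanged.
var copy = JSON.parse(JSON.stringify(net));
console.log(copy.t["0|a"][0]["1"]); // b
```

The same text could also be shipped as a standalone `.json` file and loaded with `JSON.parse`, with no generated statements to execute.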