drmfinlay / pyjsgf

JSpeech Grammar Format (JSGF) compiler, matcher and parser package for Python.

Load JSGF grammar from file?

warp1337 opened this issue

I quickly went through the code and I couldn't find any class or function that loads
a JSGF grammar from a file.

Here's the use case: I already have a JSGF grammar, but my speech recognition component
(e.g., DeepSpeech) does not support grammars. My SR therefore outputs plain strings, and I'd like
to use pyjsgf to match those strings against my existing grammar, which is stored in a file.

Is there a way of doing this?

Unfortunately no, there isn't a way to do this yet. I ran into a similar problem today actually. The way to do it would be to write a JSGF parser to read grammar files and populate Grammar objects with rules. I'll see what I can do about this.

That would be awesome! Thanks

Okay, I have a work in progress parser up on the feat/parser branch. It's actually mostly done, I'm just having trouble getting expansions parsed properly.
By the way, I've switched to using setuptools in setup.py so that the pyparsing package can be installed automatically. You might need to remove the old distutils version of pyjsgf.

Am I doing something wrong, or is the parser not able to parse rules yet? I always run into a problem.

No, it doesn't work yet, unfortunately. I'm going to get it working soon; it's been an issue for way too long.

Parsers for grammar files, grammar strings, individual rules and rule expansions are complete and are in the develop branch. Sorry this took so long @warp1337 @embie27! I hope you find it useful. Let me know if you have any issues.

An example for using parse_grammar_string:

from jsgf import parse_grammar_string

# Adjacent string literals are concatenated into one grammar string.
grammar = parse_grammar_string("#JSGF V1.0 UTF-8 en;"
                               "grammar example;"
                               "public <greet> = hello world;")
print(grammar.name)

A Grammar object with the name example containing the greet rule will be returned. You can use parse_grammar_file(path) to parse grammar files.
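For the use case from the original question (matching recognizer output against a grammar stored in a file), something like the following might work. This is only a rough sketch: `write_grammar_file` and `matching_rule_names` are hypothetical helper names, the temporary `.gram` file just stands in for an existing grammar file, and I'm assuming the returned Grammar object has a `find_matching_rules(speech)` method in the installed version.

```python
import os
import tempfile

# Same grammar as in the example above, written out as file contents.
GRAMMAR_TEXT = (
    "#JSGF V1.0 UTF-8 en;\n"
    "grammar example;\n"
    "public <greet> = hello world;\n"
)


def write_grammar_file(text):
    """Write grammar text to a temporary .gram file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".gram")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path


def matching_rule_names(path, speech):
    """Return the names of the rules in the grammar file that match speech."""
    # Imported here so the helpers above can be used without pyjsgf installed.
    from jsgf import parse_grammar_file

    grammar = parse_grammar_file(path)
    # Assumption: Grammar.find_matching_rules(speech) returns the matching rules.
    return [rule.name for rule in grammar.find_matching_rules(speech)]
```

With pyjsgf installed, calling `matching_rule_names(write_grammar_file(GRAMMAR_TEXT), "hello world")` would be the way to try this out. In a real setup the path would simply point at the existing grammar file.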

I've tried to stick to the JSGF spec while implementing this. C-style block comments (/* ... */) and C++-style line comments (// ...) are supported. I'll have some examples up soon and hopefully a readthedocs project.

I'll release this in v1.4.0 within the next week. I want to add a few more things first, such as an import resolver and making the character set attribute in the grammar header useful.

Sorry I took almost 3 weeks to release v1.4.0. I'll make separate issues for the import resolver and grammar header attributes for later releases.

I wanted to say here that the parser has limitations with long sequences and alternative sets. If a sequence is very long or an alternative set has a lot of alternatives (somewhere between 50 and 100 expansions/alternatives in my testing), the parser will crash with a RecursionError. This is a consequence of how sequences and alternative sets are parsed and how pyparsing does its recursive descent. The workaround is to split problematic sequences and alternative sets into multiple rules and use rule references instead. I will also make a note about this in the documentation when I get around to writing it.
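To illustrate the workaround, here is a minimal sketch in plain Python (no pyjsgf calls) that generates the rule text for a long alternative set split across helper rules. The function name `split_alternatives`, the `_partN` naming scheme and the default chunk size of 25 are all assumptions made for the illustration. The chunk size is just a guess that stays below the 50 to 100 range mentioned above.

```python
def split_alternatives(rule_name, alternatives, chunk_size=25):
    """Return JSGF rule lines spreading a long alternative set over helper rules."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not alternatives:
        raise ValueError("alternatives must not be empty")

    lines = []
    references = []
    for start in range(0, len(alternatives), chunk_size):
        chunk = alternatives[start:start + chunk_size]
        # Hypothetical naming scheme for the helper rules: <name_part0>, ...
        helper = "%s_part%d" % (rule_name, start // chunk_size)
        references.append("<%s>" % helper)
        lines.append("<%s> = %s;" % (helper, " | ".join(chunk)))

    # The public rule only references the helpers, so each alternative set
    # the parser sees stays short.
    lines.append("public <%s> = %s;" % (rule_name, " | ".join(references)))
    return lines


if __name__ == "__main__":
    for line in split_alternatives("colour", ["red", "green", "blue"], chunk_size=2):
        print(line)
```

The generated lines can be appended after the grammar header and grammar name declaration and then passed to parse_grammar_string. If there are enough alternatives that the public rule itself ends up with too many references, the same splitting could be applied again to the list of references.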

I think this issue is resolved now.