kswoll / npeg

This parser is an implementation of a Packrat parser with support for left recursion. The algorithm for left recursion is a modified version of the one described in "Packrat Parsers Can Support Left Recursion".

Capture causes the failure of parsing

xanatos opened this issue

I'm trying to create an importer for the Unicode NamesList.txt file (grammar: http://www.unicode.org/Public/UNIDATA/NamesList.html, data: https://www.unicode.org/Public/UCD/latest/ucd/NamesList.txt). I've been able to do it using your peg 2.0.0 library, but I ran into two problems that I had to work around; they are either bugs in the library or features I didn't understand.

I'm using Capture() to give names to expressions. At some points in the code, if I introduce a Capture(), the parser stops parsing without reporting an error. In the code I've written, I've found three places where this happens.

Source code: KlcImporter.zip

Any one of the following changes reproduces the problem. Change

    public virtual Expression NameList() => TitlePageContainer() + (-ExtendedBlock()).Capture(nameof(KlcImporter.NameList.ExtendedBlock));

to

    public virtual Expression NameList() => (-TitlePage()).Capture(nameof(KlcImporter.NameList.TitlePage)) + (-ExtendedBlock()).Capture(nameof(KlcImporter.NameList.ExtendedBlock));

or to

    public virtual Expression NameList() => TitlePageContainer().Capture(nameof(KlcImporter.NameList.TitlePage)) + (-ExtendedBlock()).Capture(nameof(KlcImporter.NameList.ExtendedBlock));

or change

        (Tab() + 'x'._() + Sp() + '('._() + LazyLcNameForCrossRef() + CrossRefBaseExpression()) |

to

        (Tab() + 'x'._() + Sp() + '('._() + LazyLcNameForCrossRef().Capture(nameof(KlcImporter.CrossRef.LcName2)) + CrossRefBaseExpression()) |

or change

    public virtual Expression BlockHeader() => "@@"._() + Tab().Capture(nameof(KlcImporter.BlockHeader.Tab1)) + BlockStart() + Tab().Capture(nameof(KlcImporter.BlockHeader.Tab2)) + BlockName() + Tab() + BlockEnd() + Lf();

to

    public virtual Expression BlockHeader() => "@@"._() + Tab().Capture(nameof(KlcImporter.BlockHeader.Tab1)) + BlockStart() + Tab().Capture(nameof(KlcImporter.BlockHeader.Tab2)) + BlockName() + Tab().Capture(nameof(KlcImporter.BlockHeader.Tab3)) + BlockEnd() + Lf();

In all three cases, the file is not parsed completely.