sebastienros / parlot

Fast and lightweight parser creation tools

Add a max argument to Many methods.

ashscodes opened this issue

Opening an issue as per the recommendation here

The idea would be to add a max argument to any Many method to provide an upper bound on matches.

Suggested:

OneOrMany<T>(Parser<T> parser, int max = 0) // default unlimited
ZeroOrMany<T>(Parser<T> parser, int max = 0) // default unlimited
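
To make the proposed semantics concrete, here is a minimal, self-contained sketch of a bounded repetition loop. It does not use Parlot's types: the `TryParseAt<T>` delegate, the `BoundedMany` class, and the `Digit` helper are hypothetical names invented for illustration, and `max == 0` is assumed to mean "unlimited", as in the suggested default.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a parser: tries to read one value at `pos`
// and advances `pos` on success. This is NOT Parlot's Parser<T> type.
public delegate bool TryParseAt<T>(string input, ref int pos, out T value);

public static class BoundedMany
{
    // Repeats `item` until it fails or until `max` values were collected.
    // max == 0 means "no upper bound" (assumed from the suggested default).
    public static List<T> ZeroOrMany<T>(TryParseAt<T> item, string input, ref int pos, int max = 0)
    {
        var results = new List<T>();

        // The bound is checked first, so `item` is never invoked once the limit is reached.
        while ((max == 0 || results.Count < max) && item(input, ref pos, out var value))
        {
            results.Add(value);
        }

        return results;
    }

    // Toy item parser: consumes a single ASCII digit.
    public static bool Digit(string input, ref int pos, out int value)
    {
        if (pos < input.Length && input[pos] >= '0' && input[pos] <= '9')
        {
            value = input[pos] - '0';
            pos++;
            return true;
        }

        value = 0;
        return false;
    }
}
```

Calling `BoundedMany.ZeroOrMany<int>(BoundedMany.Digit, "12345", ref pos, 3)` with `pos` starting at 0 collects three digits and leaves `pos` at 3, so the remaining input is still available to whatever parser runs next.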

Parser<List<T>> Separated<U, T>(Parser<U> separator, Parser<T> parser) would also benefit if it accepted a count parameter, or even min and max parameters.

Parser<List<T>> Separated<U, T>(Parser<U> separator, Parser<T> parser, int count=0)

It would stop when 'count' entries were found.

Parser<List<T>> Separated<U, T>(Parser<U> separator, Parser<T> parser, int min=0, int max=Int32.MaxValue)

It would fail if it didn't find min separated entries, or if it found more than max separated entries...
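
As a rough illustration of the min/max behavior described above, the following self-contained sketch parses comma-separated single digits from a string. It is independent of Parlot: `BoundedSeparated` and `TryParseDigits` are hypothetical names, and the sketch assumes the literal reading that finding more than `max` entries is a failure (rather than stopping once `max` is reached).

```csharp
using System;
using System.Collections.Generic;

public static class BoundedSeparated
{
    // Reads comma-separated ASCII digits starting at `pos`.
    // Fails, leaving `pos` unchanged, when fewer than `min` or more than `max` entries are found.
    public static bool TryParseDigits(string input, ref int pos, out List<int> values,
                                      int min = 0, int max = int.MaxValue)
    {
        values = new List<int>();
        var start = pos;
        var end = pos; // position just after the last accepted value

        while (true)
        {
            var cursor = end;

            // Every value after the first must be preceded by a separator.
            if (values.Count > 0)
            {
                if (cursor >= input.Length || input[cursor] != ',') break;
                cursor++;
            }

            // A separator without a following value is not consumed: `end` stays put.
            if (cursor >= input.Length || input[cursor] < '0' || input[cursor] > '9') break;

            values.Add(input[cursor] - '0');
            end = cursor + 1;
        }

        if (values.Count < min || values.Count > max)
        {
            pos = start; // reset, as if nothing had been read
            return false;
        }

        pos = end;
        return true;
    }
}
```

With `min: 2, max: 3`, the input `"1,2,3"` succeeds with three values, while `"1"` and `"1,2,3,4"` both fail and leave the position where it started. A "stop at max" variant would instead bound the loop with `values.Count < max` and drop the upper check.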

After inspecting the Separated parser source code, I found it simple and intuitive, so I went all the way and implemented SeparatedBy. It works exactly like Separated but accepts minOccurs and maxOccurs parameters to control the minimum and maximum cardinality (0 means no bound). I didn't implement compilation since I didn't need it, but it shouldn't be too complicated. Here is the full source:

using System;
using System.Collections.Generic;
using Parlot.Rewriting;

namespace Parlot.Fluent
{
    public sealed class SeparatedBy<U, T> : Parser<List<T>>, ISeekable
    {
        private readonly Parser<U> _separator;
        private readonly Parser<T> _parser;
        private readonly int _minOccurs;
        private readonly int _maxOccurs;

        public SeparatedBy(Parser<U> separator, Parser<T> parser, int minOccurs = 0, int maxOccurs = 0) {
            _separator = separator ?? throw new ArgumentNullException(nameof(separator));
            _parser = parser ?? throw new ArgumentNullException(nameof(parser));
            if (maxOccurs > 0 && maxOccurs < minOccurs) throw new ArgumentOutOfRangeException(nameof(maxOccurs));
            if (maxOccurs == 0 && minOccurs > 0) maxOccurs = int.MaxValue;
            _minOccurs = minOccurs;
            _maxOccurs = maxOccurs;
        }

        public bool CanSeek => _parser is ISeekable seekable && seekable.CanSeek;

        public char[] ExpectedChars => _parser is ISeekable seekable ? seekable.ExpectedChars : default;

        public bool SkipWhitespace => _parser is ISeekable seekable && seekable.SkipWhitespace;

        public bool HasMax => _maxOccurs > 0;

        public bool HasMin => _minOccurs > 0;

        public override bool Parse(ParseContext context, ref ParseResult<List<T>> result) {
            context.EnterParser(this);

            List<T>? results = null;

            var start = 0;
            var end = context.Scanner.Cursor.Position;
            var resetStart = end;

            var first = true;
            var parsed = new ParseResult<T>();
            var separatorResult = new ParseResult<U>();
            var occur = 0;

            while (!HasMax || occur < _maxOccurs) {
                if (!first) {
                    if (!_separator.Parse(context, ref separatorResult)) {
                        break;
                    }
                }

                if (!_parser.Parse(context, ref parsed)) {
                    if (!first) {
                        // A separator was found, but not followed by another value.
                        // It's still successful if at least one value was parsed, but we reset the cursor to before the separator.
                        context.Scanner.Cursor.ResetPosition(end);
                        break;
                    }

                    return false;
                } else {
                    end = context.Scanner.Cursor.Position;
                }

                if (first) {
                    results = new List<T>();
                    start = parsed.Start;
                    first = false;
                }

                results.Add(parsed.Value);
                occur++;
            }

            if (HasMin && results?.Count < _minOccurs) {
                context.Scanner.Cursor.ResetPosition(resetStart);
                result = new ParseResult<List<T>>(start, resetStart.Offset, null);
                return false;
            }

            result = new ParseResult<List<T>>(start, end.Offset, results);
            return true;
        }
    }
}

The same logic could be added to other "Many" parsers!

@loudenvier While adding some tests on Separated I saw that the doc states that matching no value is valid, but it's not. Would you like to provide a PR with your suggestions? It would need compilation to work too, but if you start the PR without it I can help or explain how it works. It's quite simple; I also rediscover how it works every time I need to touch this part ;)

@sebastienros Yes, sir! I'll submit a PR over the weekend (too much work at work right now :-)