sblom / RegExtract

Clean & simple idiomatic C# RegEx-based line parser that emits strongly typed results.

Add `IEnumerable<string>` extension methods in the spirit of `567legoguy`'s `FromFormat()`

sblom opened this issue

In @567legodude's Advent of Code 2020 repo, there's a slick extension method that Extract<T>()s an IEnumerable<string> into an IEnumerable<T>.

Would love to see that here. The right signatures to build out are:

    public static IEnumerable<T> Extract<T>(this IEnumerable<string> str, string rx, RegExtractOptions options = RegExtractOptions.None);
        ↓
    public static IEnumerable<T> Extract<T>(this IEnumerable<string> str, string rx, RegexOptions rxOptions, RegExtractOptions options = RegExtractOptions.None);
        ↓
    public static IEnumerable<T> Extract<T>(this IEnumerable<string> str, Regex rx, RegExtractOptions options = RegExtractOptions.None);

    public static IEnumerable<T> Extract<T>(this IEnumerable<string> str, ExtractionPlan<T> plan);

    public static IEnumerable<T> Extract<T>(this IEnumerable<string> str, RegExtractOptions options = RegExtractOptions.None);

Each of the first 2 cascade to the next one.
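To make the cascade concrete, here is a minimal, self-contained sketch. It does not use RegExtract itself: `SketchPlan<T>` is a hypothetical stand-in for ExtractionPlan<T> that only converts the first capture group, the RegExtractOptions parameter is left out for brevity, and having the Regex overload delegate to the plan overload is an assumption rather than something stated above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in for RegExtract's ExtractionPlan<T>. It only converts
// the first capture group to T, which is far less than the real library does.
public sealed class SketchPlan<T>
{
    private readonly Regex _regex;

    public SketchPlan(Regex regex) => _regex = regex;

    public T Extract(string input)
    {
        Match match = _regex.Match(input);
        if (!match.Success)
            throw new ArgumentException($"Input '{input}' does not match '{_regex}'.");
        return (T)Convert.ChangeType(match.Groups[1].Value, typeof(T), CultureInfo.InvariantCulture);
    }
}

public static class SketchExtensions
{
    // Pattern string: cascades to the (string, RegexOptions) overload.
    public static IEnumerable<T> Extract<T>(this IEnumerable<string> lines, string rx)
        => lines.Extract<T>(rx, RegexOptions.None);

    // Pattern string plus options: builds the Regex and cascades to the Regex overload.
    public static IEnumerable<T> Extract<T>(this IEnumerable<string> lines, string rx, RegexOptions rxOptions)
        => lines.Extract<T>(new Regex(rx, rxOptions));

    // Regex: builds the plan once and hands it to the plan overload (an assumption of this sketch).
    public static IEnumerable<T> Extract<T>(this IEnumerable<string> lines, Regex rx)
        => lines.Extract(new SketchPlan<T>(rx));

    // Plan: the only overload that does real work, one Extract call per line.
    public static IEnumerable<T> Extract<T>(this IEnumerable<string> lines, SketchPlan<T> plan)
        => lines.Select(line => plan.Extract(line));
}
```

With these in scope, `new[] { "id=1", "id=22" }.Extract<int>(@"id=(\d+)")` walks through all four overloads. Since string implements IEnumerable<char> rather than IEnumerable<string>, the same method name can coexist with string-based overloads without ambiguity.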

The fourth one takes an already created ExtractionPlan<T>, which holds all the info that RegExtract requires to execute an extraction, and avoids any extra work involved in creating a Regex or an ExtractionPlan.

The last one requires only the type T and the input strings; it uses the GetRegexFromType() helper method to read the RegExtract configuration from T via reflection.
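As a rough illustration of what a reflection-based lookup like this can look like, the sketch below reads a pattern from a public constant on the type. The field name `Pattern`, the helper name `GetRegexFromTypeSketch`, and the `Point` type are all hypothetical; the convention RegExtract's own GetRegexFromType() follows may well differ.

```csharp
using System;
using System.Reflection;
using System.Text.RegularExpressions;

public static class PatternLookupSketch
{
    // Assumption: the type exposes its pattern as a public const or static
    // string field called "Pattern". This is not RegExtract's documented convention.
    public static Regex GetRegexFromTypeSketch<T>()
    {
        FieldInfo field = typeof(T).GetField("Pattern", BindingFlags.Public | BindingFlags.Static);
        if (field?.GetValue(null) is string pattern)
            return new Regex(pattern);

        throw new InvalidOperationException(
            $"{typeof(T).Name} does not declare a public static string field named Pattern.");
    }
}

// Illustrative type that carries its own pattern.
public sealed class Point
{
    public const string Pattern = @"(\d+),(\d+)";
}
```

Because the lookup uses reflection, it is worth doing once per type and caching the result rather than repeating it for every input line.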

They will look a lot like the corresponding Extract<T>() methods that are in RegExtractExtensions.cs.

One important performance-related note: you should create an ExtractionPlan<> once and then call it repeatedly. It represents the fully pre-calculated approach to extraction, and doesn't require any regex parsing or reflection other than to grab the constructors or methods that it already knows it's going to call.
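The build-once, call-many shape can be sketched without RegExtract at all. In the hypothetical helper below, `planFactory` stands in for whatever creates an ExtractionPlan<T>, and the returned delegate stands in for the plan's extraction call; both names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlanReuseSketch
{
    // Hypothetical helper: the expensive setup (regex construction, reflection)
    // happens inside planFactory, which runs exactly once, before any line is read.
    public static IEnumerable<T> ExtractAll<T>(IEnumerable<string> lines, Func<Func<string, T>> planFactory)
    {
        Func<string, T> plan = planFactory();

        // Only the cheap per-line call is deferred into the lazy sequence.
        return lines.Select(line => plan(line));
    }
}
```

Keeping the method a plain wrapper around `Select`, rather than an iterator with `yield return`, means the plan is built eagerly when the method is called instead of again on every enumeration of the result.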

As far as naming goes, I'm fairly inclined to just call them Extract<T> and use overloading to resolve whether we're acting on an IEnumerable<string> or a string.

I tried to implement the extensions here. I wrote the overload that takes only an ExtractionPlan before your recent commit, so you may remove the Regex property since it's no longer needed.

Nice! Apologies for the shifting sands--I intended to clean up before you got started. Testing now, but first read of the PR looks great. Thanks for your contribution!