unjs / magic-regexp

A compiled-away, type-safe, readable RegExp alternative

Home Page: https://regexp.dev

Infer the literals used in the type of the groups

michaelschufi opened this issue · comments

🆒 Your use case

When I create a regexp group using anyOf and groupedAs, the type of the matched group should be inferred based on the inputs to anyOf.

This would be similar to how the type of the main regexp gets inferred.

E.g.

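import { anyOf, createRegExp } from "magic-regexp";

// Two named groups, each built from a fixed set of literals.
const regex = createRegExp(
  anyOf("A", "B", "C")
    .groupedAs("opponent")
    .and(" ")
    .and(anyOf("X", "Y", "Z").groupedAs("self")),
);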

Here, the type of regex is

MagicRegExp<"/(?<opponent>A|B|C) (?<self>X|Y|Z)/", "opponent" | "self", ["(?<opponent>A|B|C)", ...any[]], never>

so the information that the groups have literals of ABC and XYZ respectively is included in the type. However, the type of

"A Y".match(regex)?.groups.opponent

is string | undefined instead of "A" | "B" | "C".
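To illustrate that the pattern string in the type already carries enough information, here is a minimal sketch of how template literal types could recover the union for one named group. The names `SplitAlternatives`, `GroupLiterals`, and `matchGroup` are hypothetical and not part of magic-regexp, and the sketch assumes a flat group body: no nested parentheses and no escaped `|`.

```typescript
// Hypothetical helper types for illustration; they are not part of magic-regexp.

// Split "A|B|C" into the union "A" | "B" | "C".
type SplitAlternatives<S extends string> =
  S extends `${infer Head}|${infer Tail}` ? Head | SplitAlternatives<Tail> : S;

// Locate the named group in the pattern source and return its alternatives.
// Falls back to plain `string` when the group is not found.
type GroupLiterals<Pattern extends string, Name extends string> =
  Pattern extends `${string}(?<${Name}>${infer Body})${string}`
    ? SplitAlternatives<Body>
    : string;

// Hypothetical runtime wrapper that attaches the inferred union to the result.
// The cast is an assumption: nothing checks at runtime that the type is right.
function matchGroup<P extends string, N extends string>(
  input: string,
  pattern: P,
  name: N,
): GroupLiterals<P, N> | undefined {
  const groups = new RegExp(pattern).exec(input)?.groups;
  return groups?.[name] as unknown as GroupLiterals<P, N> | undefined;
}

const patternSource = "(?<opponent>A|B|C) (?<self>X|Y|Z)";

// Typed as "A" | "B" | "C" | undefined rather than string | undefined.
const opponentMove = matchGroup("A Y", patternSource, "opponent");
const opponentCheck: "A" | "B" | "C" | undefined = opponentMove;

// Typed as "X" | "Y" | "Z" | undefined.
const ownMove = matchGroup("A Y", patternSource, "self");
```

A real implementation would need a proper parser for nesting, escapes, quantifiers, and character classes. This sketch only covers the simple `anyOf` case from the example above.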

🔍 Alternatives you've considered

No response

ℹ️ Additional info

My goal is to use this library in conjunction with colinhacks/zod. magic-regexp complements zod well when it comes to string validation and parsing while still being type-safe.

I'm open to working on this myself. Please let me know where I should look to get started on such a feature.

This is definitely a worthy enhancement. I think @didavid61202 might be working on this very thing (so let's wait to hear from him), but if not, a contribution would be very welcome ❤️

Hi @michaelschufi,

Yes, I've been working on features similar to this, although with a very different approach: a new type inference (fully parse, interpret, exact-match, and permute all possible values) for string.match, string.replace, string.matchAll and string.replaceAll.

Here's a little peek at what's possible (the blue text is type hints from vscode-twoslash-queries). I'm still testing with pure string literals:
[Image: type_level_regexp]

Still have to finish writing more robust test cases and testing for some edge cases.

Sounds awesome! 😀
Of course, if the other functions can also benefit from the inference, that's even better. I haven't worked with .replace in this library yet, but this sounds like a real game changer. It makes string manipulation of all kinds so much easier.

The inference looks great so far. Tell me if I can help with some testing or similar.

I'd definitely love some help with testing and improving it! ❤️

Will let you know as soon as I finish wrapping it up. 👍

Hi @michaelschufi, here's the repository for the type-level RegExp parser and matcher. Check out the examples in the 'tests/index.test-d.ts' file and others when you have some time. Please let me know if you encounter any issues or have any ideas. Thank you! 😄

👉 https://github.com/didavid61202/type-level-regexp

Hi @didavid61202,
Thank you. I've got a lot on my plate right now, so it might take me a moment. But I'll try to have a look at it this weekend.

Sorry for my late response, @didavid61202.
I tested the repo recently and couldn't find any breaking issues. I'm in awe of what this thing can do! 😍

The only negative thing I noticed: it seems to push either VS Code or the TypeScript language server to its limits. But this could be because of my PC, VS Code, TS, TLS, the repo itself with its many complex examples, or any combination of those things...

I have a few minor things that I would like to discuss. E.g. what the restInput key represents. Should I file an issue in your repo?

Hi @michaelschufi, thanks!
Yes, I think this is pushing the language server to its limit, through both a lot of recursion and a lot of test examples. 😂 There are some performance issues that can be improved with a better parser and matching algorithm. I've been testing some benchmark methods, and I will start by rewriting the ParseRegExp generic. Any input on performance improvements is welcome!

Do file issues if you have any questions 👍
The restInput generic argument in RegExpMatchResult represents the rest of the input string that is not matched by the provided regexp. It's used to resolve the string preceding the match, and to get that preceding string's length for the index value of the match result.
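As a rough illustration of the idea of deriving a match index from the length of the preceding string, here is a minimal sketch at the type level. It is not the actual type-level-regexp implementation: `StringLength`, `IndexOfMatch`, and `indexOfMatch` are hypothetical names, and the sketch assumes ASCII input and a non-empty, already-known matched substring.

```typescript
// Hypothetical helper types for illustration; not the type-level-regexp implementation.

// Count characters by growing a tuple by one element per character.
type StringLength<S extends string, Acc extends unknown[] = []> =
  S extends `${infer _First}${infer Rest}`
    ? StringLength<Rest, [...Acc, unknown]>
    : Acc["length"];

// The index of the match is the length of whatever precedes it in the input.
type IndexOfMatch<Input extends string, Matched extends string> =
  Input extends `${infer Preceding}${Matched}${string}`
    ? StringLength<Preceding>
    : -1;

// Hypothetical runtime counterpart; the cast ties the runtime value to the
// type-level result and is an assumption, not something verified at runtime.
function indexOfMatch<I extends string, M extends string>(
  input: I,
  matched: M,
): IndexOfMatch<I, M> {
  return input.indexOf(matched) as unknown as IndexOfMatch<I, M>;
}

// "xx" precedes "A Y", so the type resolves to the literal 2.
const matchIndex = indexOfMatch("xxA Y", "A Y");
const matchIndexCheck: 2 = matchIndex;
```

Counting with a tuple adds one level of recursion per character, which gives a feel for why long inputs and many test cases can put pressure on the language server.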

Hi @didavid61202
About performance:
I've just read the TypeScript 5.1 Beta release notes and came across the "Negative Case Checks for Union Literals" section. Maybe those optimizations also affect the regex type literals.

Hi @michaelschufi, thanks for the info! 👍
I'll have to check whether this affects the current implementation, and how we can use it to improve performance.