Infer the literals used in the type of the groups
michaelschufi opened this issue
Your use case
When I create a regexp group using anyOf and groupedAs, the type of the matched group should be inferred from the inputs to anyOf, similar to how the type of the main regexp is inferred.
E.g.
const regex = createRegExp(
  anyOf("A", "B", "C")
    .groupedAs("opponent")
    .and(" ")
    .and(anyOf("X", "Y", "Z").groupedAs("self")),
);
Here, the type of regex is
MagicRegExp<"/(?<opponent>A|B|C) (?<self>X|Y|Z)/", "opponent" | "self", ["(?<opponent>A|B|C)", ...any[]], never>
so the information that the groups contain the literals A, B, C and X, Y, Z respectively is already included in the type. However, the type of
"A Y".match(regex)?.groups.opponent
is string | undefined instead of "A" | "B" | "C".
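To illustrate the request, here is a minimal sketch of how the group literals could be recovered from the pattern string that is already present in the type. The names SplitAlternatives, NamedGroups and typedGroups are hypothetical and not part of magic-regexp. The sketch assumes every named group body is a flat alternation of plain literals, with no nesting, escapes, quantifiers or lookbehinds.

```typescript
// Hypothetical helpers for illustration only; not magic-regexp's API.

// Split "A|B|C" into the union "A" | "B" | "C".
type SplitAlternatives<S extends string> =
  S extends `${infer Head}|${infer Tail}` ? Head | SplitAlternatives<Tail> : S;

// Collect every `(?<name>body)` in the pattern into { name: union of literals }.
// Assumption: group bodies are flat alternations of plain literals.
type NamedGroups<P extends string> =
  P extends `${string}(?<${infer Name}>${infer Body})${infer Rest}`
    ? { [K in Name]: SplitAlternatives<Body> } & NamedGroups<Rest>
    : {};

// Run the pattern at runtime and expose the groups under the computed type.
// The cast is unchecked: the type is derived from the pattern, not from the match.
function typedGroups<P extends string>(
  pattern: P,
  input: string,
): NamedGroups<P> | undefined {
  const match = new RegExp(pattern).exec(input) as { groups?: unknown } | null;
  return match?.groups as NamedGroups<P> | undefined;
}

const groups = typedGroups("(?<opponent>A|B|C) (?<self>X|Y|Z)", "A Y");
// The annotation compiles only because the literals were inferred.
const opponent: "A" | "B" | "C" | undefined = groups?.opponent;
```

A real implementation would need a proper parser for the pattern, since template literal inference like this breaks as soon as a group contains nested parentheses or escaped characters.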
Alternatives you've considered
No response
ℹ️ Additional info
My goal is to use this library in conjunction with colinhacks/zod. magic-regexp complements zod well when it comes to string validation and parsing while still being typesafe.
I'm open to working on this myself. Please let me know where I should look to get started on such a feature.
This is definitely a worthy enhancement. I think @didavid61202 might be working on this very thing (so let's wait to hear from him), but if not then contribution would be very welcome ❤️
Hi @michaelschufi,
Yes, I've been working on a feature similar to this, although with a very different approach. It's a new type inference (full parsing, interpretation, exact matching, and permutation of all possible values) for string.match, string.replace, string.matchAll and string.replaceAll.
Here is a little peek at what's possible (the blue text shows type hints from vscode-twoslash-queries). I'm still testing with pure string literals:
I still have to finish writing more robust test cases and test some edge cases.
Yes, I've been working on a new type inferencing (fully parse, interpret, exact match, permutation all possible values) for string.match, string.replace, string.matchAll and string.replaceAll
Sounds awesome!
Of course, if the other functions can also benefit from the inference, even better. I haven't worked with .replace in this library yet, but this sounds like a real game changer. It makes string manipulation of all kinds so much easier.
The inference looks great so far. Let me know if I can help with testing or anything similar.
I'd definitely love some help with testing and improving it! ❤️
I'll let you know as soon as I finish wrapping it up.
Hi @michaelschufi, here's the repository for the type-level RegExp parser and matcher. Check out the examples in the 'tests/index.test-d.ts' file and others when you have some time. Please let me know if you encounter any issues or have any ideas. Thank you!
Hi @didavid61202
Thank you. I've got a lot on my plate right now, so it might take me a moment. But I'll try to have a look at it this weekend.
Sorry for my late response, @didavid61202.
I tested the repo recently and couldn't find any breaking issues. I'm in awe of what this thing can do!
The only negative thing I noticed: it seems to push either VS Code or the TypeScript language server to its limits. But this could be because of my PC, VS Code, TS, TLS, the repo itself with its many complex examples, or any combination of those things...
I have a few minor things that I would like to discuss, e.g. what the restInput key represents. Should I file an issue in your repo?
Hi @michaelschufi, thanks!
Yes, I think this is pushing the language server to its limit, both through the many recursions and the large number of test examples. There are some performance issues that could be improved with a better parser and matching algorithm. I've been testing some benchmark methods, and I will start by rewriting the ParseRegExp generic. Any input on performance improvements is welcome!
Do file issues if you have any questions.
The restInput generic argument in RegExpMatchResult represents the rest of the input string that is not matched by the provided regexp. It's used to resolve the string preceding the match, and to get that preceding string's length for the index value of the match result.
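The relationship between the preceding string and the index value can be sketched at the type level as follows. This is only an illustration of the idea, not the repository's actual generics: Length, Preceding, MatchIndex and matchIndex are hypothetical names, and the sketch assumes the matched string is already known as a literal and occurs in the input.

```typescript
// Hypothetical helpers for illustration only; not the repository's generics.

// Count characters by growing a tuple one element per character.
type Length<S extends string, Acc extends unknown[] = []> =
  S extends `${infer _Head}${infer Tail}`
    ? Length<Tail, [...Acc, unknown]>
    : Acc["length"];

// The part of Input that comes before the first occurrence of Matched.
type Preceding<Input extends string, Matched extends string> =
  Input extends `${infer Before}${Matched}${string}` ? Before : never;

// The match index is the length of the preceding string.
type MatchIndex<Input extends string, Matched extends string> =
  Length<Preceding<Input, Matched>>;

// Runtime counterpart, typed with the computed literal index.
// Assumption: `matched` occurs in `input`; otherwise the type is `never`.
function matchIndex<I extends string, M extends string>(
  input: I,
  matched: M,
): MatchIndex<I, M> {
  return input.indexOf(matched) as unknown as MatchIndex<I, M>;
}

// The literal annotation compiles because the index is computed as a type.
const index: 2 = matchIndex("x A Y", "A Y");
```

Counting with a tuple recurses once per character, which is one of the places where long inputs can put pressure on the type checker.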
Hi @didavid61202
About performance:
I've just read the TypeScript 5.1 Beta release notes and came across the "Negative Case Checks for Union Literals" section. Maybe those optimizations also affect the regex type literals.
Hi @michaelschufi, thanks for the info!
I'll have to check whether this affects the current implementation, or how we can use it to improve performance.