emojicode / emojicode

😀😜🔂 World’s only programming language that’s bursting with emojis

Home Page: https://emojicode.org

Regular Expressions in Emojicode

joeskeen opened this issue

⭐️ Proposed change

Most languages have some kind of built-in support or library for using regular expressions. It would be great to see this feature in Emojicode.

🤔 Rationale

A lot of the recreational coding I've tried to do in Emojicode has been programming puzzles, many of which are solved more easily with regular expressions than with other methods.

🕺 Example

I'm not sure what it would look like: it could be string-based, or it could have its own literal syntax as in JavaScript.

I've been thinking more about this lately, and wonder if we could implement a package for regular expressions rather than having it built into the language.

There are a few guides out there for writing a RegEx engine

We could also port an existing RegEx engine from another language.

Another (perhaps simpler) option would be to make a package that wraps/links to a C++ implementation.

commented

Sure, regular expressions don't require language support. Although a literal syntax for regular expressions is nice, it isn't necessary. Since the C++ standard library has regular expression support, wrapping that should be a straightforward way to implement this.

I'm currently playing around with implementing an EmojiCode-native regular expressions library. I'm following this guide: https://kean.blog/post/lets-build-regex and taking inspiration from emogex (from the comment above) and emojex, while trying to make it feel as EmojiCode-native and natural as possible. I'm in the very early stages of writing the parser in EmojiCode and will share more when I have it. I'm currently blocked by #204.

I've taken another stab at it, and I'm happy to say I have a very early working alpha of regular expressions for EmojiCode. You can check it out here: https://gist.github.com/joeskeen/98c9f0e9d04cd6f32d27015e1b88b589. Please note that its feature set is not complete when compared to some other languages, but it does work with a lot of the standard regex use cases.

All special characters in this implementation of regex are emoji, and each emoji was chosen to align with similar concepts in EmojiCode (there's a table explaining the syntax at the bottom of the Gist). Here's my sample usage test file for anyone's edification if they would like to try it out:

📜 🔤regex.🍇🔤

🏁🍇
    🔤🍇he🤜❌❌🔡👐❌❌🔢👐❌❌🍇🤛🔘o a🤜cat👐dog🤛🍺❌❌⚫🍬b🍺R🔘❌❌🔡❌❌🔢🍺❌❌🍉🍿0123🍆⚪🔘AAA🍉🔤 ➡️ pattern
    🔤he🍇0eo adogcatdogdog bbbbbbbbbbbbj009🍉2~~AAA🔤 ➡️ string

    🆕🔭❗ ➡️ regex
    😀 🔤Searching...
    string   '🧲string🧲' 
    pattern  '🧲pattern🧲'🔤 ❗
    👀regex pattern string❗ ➡️ result
    ↪️ 👌result❓ 🍇
        😀 🔤Success at index 🧲🍺🐽result❓🧲🔤 ❗
    🍉 🙅 🍇
        😀 🔤No match found🔤 ❗
    🍉
🍉

For you RegEx buffs out there, the pattern I'm using is equivalent to

^he(\w|\d|\🍇)*o a(cat|dog)+\s?b+R*\w\d+\🍉[0123].*AAA$

This outputs:

Searching...
    string   'he🍇0eo adogcatdogdog bbbbbbbbbbbbj009🍉2~~AAA' 
    pattern  '🍇he🤜❌🔡👐❌🔢👐❌🍇🤛🔘o a🤜cat👐dog🤛🍺❌⚫🍬b🍺R🔘❌🔡❌🔢🍺❌🍉🍿0123🍆⚪🔘AAA🍉'
Success at index 0
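
As a rough cross-check of the correspondence above, here is a minimal Python sketch (Python rather than EmojiCode, so the result can be compared against a conventional engine). The token tables and the `to_conventional` helper are hypothetical: the mapping is inferred only from the sample pattern and its stated conventional equivalent, not copied from the table in the gist. The translation is also deliberately naive and ignores context, so for example it would turn an unescaped 🍇 into `^` anywhere in the pattern.

```python
import re

# Assumed token mapping, inferred from the sample pattern in this thread.
# It is NOT the syntax table from the gist and may be incomplete.
PLAIN = {
    "🍇": "^", "🍉": "$",             # anchors
    "🤜": "(", "🤛": ")", "👐": "|",  # grouping and alternation
    "🔘": "*", "🍺": "+", "🍬": "?",  # quantifiers
    "⚪": ".",                         # any character
    "🍿": "[", "🍆": "]",             # character set
}
# Character classes that follow the ❌ escape character.
ESCAPED = {"🔡": r"\w", "🔢": r"\d", "⚫": r"\s"}


def to_conventional(emoji_pattern: str) -> str:
    """Translate an emoji pattern into conventional regex syntax (naively)."""
    out = []
    chars = iter(emoji_pattern)
    for ch in chars:
        if ch == "❌":
            nxt = next(chars, None)
            if nxt is None:
                raise ValueError("dangling ❌ escape at end of pattern")
            # Known class, or a backslash-escaped literal such as \🍇.
            out.append(ESCAPED.get(nxt, "\\" + nxt))
        else:
            out.append(PLAIN.get(ch, ch))
    return "".join(out)


# Pattern as printed by the sample program (single ❌, i.e. after the
# string-literal escaping has been resolved), and the sample input string.
emoji_pattern = (
    "🍇he🤜❌🔡👐❌🔢👐❌🍇🤛🔘o a🤜cat👐dog🤛🍺❌⚫🍬b🍺R🔘"
    "❌🔡❌🔢🍺❌🍉🍿0123🍆⚪🔘AAA🍉"
)
subject = "he🍇0eo adogcatdogdog bbbbbbbbbbbbj009🍉2~~AAA"

conventional = to_conventional(emoji_pattern)
match = re.search(conventional, subject)

print(conventional)
print(match.start() if match else "No match found")
```

Run as a script, this should print the same conventional pattern quoted above, followed by 0, in line with the "Success at index 0" output.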

I would like to write a ton of unit tests, then look at refactoring it a bit to allow features to be grouped together (rather than having one line in every method). Once I can get that working as intended, I'd like to add more features (like capturing groups) and propose it to be included in the package listing at https://www.emojicode.org/docs/packages/.

I would welcome any feedback anyone has!

A HUGE thanks to "clumsy computer" on YouTube for his live-stream implementation of a regex engine in Python. I gained the understanding, coded along with him in TypeScript, got it working the way I wanted to, then translated it into EmojiCode.

Wrote 271 unit tests, and fixed a handful of bugs! Feeling pretty good about the currently implemented functionality. (I updated the gist with the bug fixes, and the unit tests.)