roc-lang / roc

A fast, friendly, functional language.

Home Page: https://roc-lang.org

Print string literals with escape sequence

basile-henry opened this issue · comments

The current strategy for printing string literals (in the REPL, for example) seems to be: print the string's contents and wrap them in quotes (triple quotes when a multi-line string is detected).

The problem is that this can produce invalid string literals, or literals that are valid but don't correspond to the original string.

Concretely this is incorrect:

» "Problematic non-escaped prints: \r, \", \\, \$(not_interpolated)"

"""
, ", \, $(not_interpolated)nts:
""" : Str

I believe all escaped characters should be printed as they are written in source code when printing a string literal (not when printing the content of the string, of course).
There might be some exceptions when there are multiple ways to write a character:

  • \n: Since Roc has multi-line string literals, it is perfectly fine to default to using those instead of printing the escape sequence
  • \u(<num>): Since Roc source code allows UTF-8 characters in string literals, it should be fine to print them as is, with some caveats: for example, if they represent \, ", or ASCII control characters that wouldn't/shouldn't be allowed in string literals (like the raw \r character, as opposed to its escape sequence)

There is probably an open question for \t, which seems to be valid in string literals in its non-escaped form. I am personally not a fan of that, as it seems error-prone.
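To make the proposal concrete, here is a rough sketch of the escaping rule in Rust. It is not taken from the Roc code base: the function name `escape_str_literal` is hypothetical, and it makes several simplifying assumptions. It always emits a single-line literal (so \n is escaped rather than switching to triple quotes), it always escapes \t, it escapes $ only when it is directly followed by (, and it assumes \u(<num>) takes a hexadecimal code point.

```rust
/// Hypothetical helper (not part of the Roc compiler): renders `s` as a
/// single-line, double-quoted string literal with escape sequences.
fn escape_str_literal(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');

    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // Characters that would end or corrupt the literal if printed raw.
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            // Simplification: always escape newlines instead of emitting a
            // multi-line (triple-quoted) literal.
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            // Assumption: tabs are escaped even though raw tabs seem to be accepted.
            '\t' => out.push_str("\\t"),
            // Assumption: `$` only needs escaping when it would start an
            // interpolation, i.e. when followed by `(`.
            '$' if chars.peek() == Some(&'(') => out.push_str("\\$"),
            // Remaining control characters use the \u(<num>) form
            // (hexadecimal code point assumed).
            c if c.is_control() => out.push_str(&format!("\\u({:x})", c as u32)),
            // Everything else, including non-ASCII UTF-8, is printed as is.
            c => out.push(c),
        }
    }

    out.push('"');
    out
}
```

Under these assumptions, feeding it the contents of the problematic string above gives back the literal as it was typed, and a string made of three double quotes comes back as "\"\"\"" rather than an ambiguous triple-quoted block.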

Another fun one in the REPL:

» "\"\"\""

"""
"""
""" : Str