inverse function of unraw
milahu opened this issue · comments
what is a fast inverse function of unraw?
the String.raw tagged-template only works with literals, not with dynamic strings
do we have something better than this?
var backslashEncode = (function backslashEncodeFactory() {
  // lookup table for the ascii control characters 0..31
  const escapeCode = Array(32);
  for (let i = 0; i < 32; i++)
    escapeCode[i] = '\\x'+i.toString(16).padStart(2, '0');
  escapeCode[0] = '\\0';
  escapeCode[8] = '\\b';
  escapeCode[9] = '\\t';
  escapeCode[10] = '\\n';
  //escapeCode[10] = '\n'; // dont escape newline
  escapeCode[11] = '\\v';
  escapeCode[12] = '\\f';
  escapeCode[13] = '\\r';
  return function backslashEncode(str) {
    let res = '';
    for (let i = 0; i < str.length; i++) {
      const char16bit = str[i];
      const code = char16bit.charCodeAt(0);
      res += (
        (code < 32) ? escapeCode[code] : // ascii control
        (code == 92) ? '\\\\' : // backslash
        (code < 128) ? char16bit : // ascii printable
        '\\u'+code.toString(16).padStart(4, '0') // unicode
      );
    }
    return res;
  }
})();
fails
String.raw`\u5A5A` == '\\u5A5A'
String.raw({ raw: ['\u5A5A'] }) == '婚'
String.raw({ raw: ['', ''] }, '\u5A5A') == '婚'
JSON.stringify("\u5A5A") == '"婚"'
edit: here is another library with encode + decode fns: Shakeskeyboarde/slashes
but that one fails to encode unicode chars
A raw function that would work on dynamic strings would necessarily behave differently from the built-in String.raw, because there's no way for JS to distinguish between a string written with a literal tab character between the quotes and the string "\t" after the string is created. Both become the same one-character string, so comparing them with === gives true.
For example (with some theoretical raw function):
const a = "\t";
const b = " "; // here the quotes contain a literal tab character
raw(a); // "\t"
raw(b); // "\t" even though the string was not made using \t, because there's no way to tell the two apart
// On the other hand:
String.raw`\t`; // "\t"
String.raw` `; // " " (the template contains a literal tab character, which is kept as-is)
So this is not really in the scope of this particular project, because this project intends only to invert the String.raw function.
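To illustrate why the tagged-template form only works with literals: a tag function receives the cooked strings plus the raw source text in strings.raw, and that raw text only exists for template literals written in source code. A minimal sketch, where showRaw is a hypothetical name used only for this illustration and is not part of unraw:

```javascript
// Hypothetical tag function, for illustration only.
// It returns both views of the first template chunk.
function showRaw(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

const fromLiteral = showRaw`\t`;
// fromLiteral.cooked is a single tab character (length 1)
// fromLiteral.raw is a backslash followed by "t" (length 2)

// An ordinary string value carries no such raw view:
const escaped = "\t";
const viaCharCode = String.fromCharCode(9);
// escaped === viaCharCode, so how the tab was written is not recoverable
```

Once a value is an ordinary string, only the cooked form is left, which is why a raw function for dynamic strings cannot match String.raw exactly.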
I guess what you are really looking for is a function that will take a UTF-16 string and break it down into an ASCII representation using backslashes. Maybe the he library satisfies this? You can use the allowUnsafeSymbols option to disable encoding of basic symbols like &.
hmm. i thought unraw is just a backslashDecode function
identical to this stripSlashes function, or PHP's stripslashes function
he looks useful, but i dont see any backslashEncode function there ...?
i want to encode strings to their "ascii javascript literal string" representation, similar to JSON.stringify
JSON.stringify("\0 \b \t \n \v \f \r \ud83d\ude0a \" '")
== `"\\u0000 \\b \\t \\n \\u000b \\f \\r 😊 \\" '"`
... but json strings aint ascii, and doublequotes are escaped
so to answer my question
do we have something better than this?
nope
The problem with string encoding/decoding is that there are just so many different use cases and standards. I'm sure what you're looking for is out there, but knowing how to find it is really difficult.
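For the "ascii javascript literal string" use case mentioned earlier in the thread, one possible approach is to post-process the JSON.stringify output and escape every remaining non-ASCII UTF-16 code unit. This is only a sketch under assumptions: toAsciiLiteral is a hypothetical name, the result keeps the surrounding double quotes and the escaped inner double quotes, and it has not been benchmarked against backslashEncode above.

```javascript
// Hypothetical helper, not part of unraw or any library named in this thread.
// JSON.stringify already escapes control characters, backslashes and double
// quotes; the replace() then escapes DEL and every non-ASCII code unit.
function toAsciiLiteral(str) {
  // No "u" flag: the regex matches single UTF-16 code units, so an astral
  // character is written as two \uXXXX escapes (its surrogate pair).
  return JSON.stringify(str).replace(/[\u007f-\uffff]/g, (unit) =>
    "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

const encoded = toAsciiLiteral("婚 \t 😊");
// encoded consists of these characters: "\u5a5a \t \ud83d\ude0a"

// The output is still valid JSON, so JSON.parse restores the input:
const decoded = JSON.parse(encoded);
```

If the surrounding quotes are not wanted, they could be removed with encoded.slice(1, -1), but the result is then no longer parseable by JSON.parse on its own.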