ord returns different results in JS than PHP
ian opened this issue
Description
According to the PHP docs, the ord function returns an int between 0 and 255. However, with Locutus's ord in JS I'm seeing integer values far outside that range, up to 65536.
Here's PHP's output munging a binary string:
code: log_message('error', "'".$data[$i]."' -> '".ord($data[$i]));
ERROR - 2020-01-08 16:53:51 --> data 73="�˧ۥ���u
ERROR - 2020-01-08 16:53:51 --> '7' -> '55
ERROR - 2020-01-08 16:53:51 --> '3' -> '51
ERROR - 2020-01-08 16:53:51 --> '=' -> '61
ERROR - 2020-01-08 16:53:51 --> '' -> '15
ERROR - 2020-01-08 16:53:51 --> '"' -> '34
ERROR - 2020-01-08 16:53:51 --> '�' -> '156
ERROR - 2020-01-08 16:53:51 --> '�' -> '203
ERROR - 2020-01-08 16:53:51 --> '�' -> '167
ERROR - 2020-01-08 16:53:51 --> '�' -> '219
ERROR - 2020-01-08 16:53:51 --> '�' -> '165
ERROR - 2020-01-08 16:53:51 --> '�' -> '156
ERROR - 2020-01-08 16:53:51 --> '�' -> '179
ERROR - 2020-01-08 16:53:51 --> '�' -> '149
ERROR - 2020-01-08 16:53:51 --> 'u' -> '117
ERROR - 2020-01-08 16:53:51 --> '
' -> '10
ERROR - 2020-01-08 16:53:51 --> '�' -> '173
Here's the JS:
const strings = require("locutus/php/strings")
console.log(`'${data[i]}' -> ${strings.ord(data[i])}`)
console.log lib/crypto.js:122
data 73="�˧ۥ���u
console.log lib/crypto.js:149
'7' -> 55
console.log lib/crypto.js:149
'3' -> 51
console.log lib/crypto.js:149
'=' -> 61
console.log lib/crypto.js:149
'' -> 15
console.log lib/crypto.js:149
'"' -> 34
console.log lib/crypto.js:149
'�' -> 65533
console.log lib/crypto.js:149
'˧' -> 743
console.log lib/crypto.js:149
'ۥ' -> 1765
console.log lib/crypto.js:149
'�' -> 65533
console.log lib/crypto.js:149
'�' -> 65533
console.log lib/crypto.js:149
'�' -> 65533
console.log lib/crypto.js:149
'u' -> 117
console.log lib/crypto.js:149
'
' -> 10
@kvz any thoughts on what I might be doing wrong? Would appreciate any help or insight you might have.
Hi @ian! ord in JavaScript is an inherently flawed concept. PHP's strings are series of 8-bit bytes and are therefore suitable for both binary data and text (using encodings such as UTF-8). JavaScript's strings are based on 16-bit UTF-16 code units and were not designed for binary data, only for text.
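One plausible reading of the two logs above, offered as a sketch rather than a diagnosis: the issue does not show how `data` was produced, but if the raw bytes were decoded as UTF-8 text somewhere along the way, the JS numbers fall out exactly. The byte values below are copied from the PHP log; the use of `TextDecoder` is an assumption about what happened to them, not something the issue confirms.

```javascript
// Byte values copied from the PHP log in the issue (the ord() results, in order).
const phpBytes = [55, 51, 61, 15, 34, 156, 203, 167, 219, 165, 156, 179, 149, 117, 10];

// Assumption: at some point the bytes were decoded as UTF-8 text.
// Invalid sequences become U+FFFD (65533); valid two-byte sequences collapse
// into a single code point (203,167 -> 743 and 219,165 -> 1765).
const decoded = new TextDecoder("utf-8").decode(Uint8Array.from(phpBytes));

// charCodeAt returns the UTF-16 code unit at each index of the string.
const codeUnits = Array.from({ length: decoded.length }, (_, i) => decoded.charCodeAt(i));

console.log(codeUnits.join(" "));
// prints: 55 51 61 15 34 65533 743 1765 65533 65533 65533 117 10
```

Fifteen bytes become thirteen code units, and the lone continuation bytes (156, 179, 149) are replaced irreversibly, so the original byte values cannot be recovered from the string.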
There are multiple ways to store binary data in JS strings and none are good: you either waste memory or get an inconvenient way of accessing individual bytes. You can limit yourself to using only 8 bits per 16-bit element (so two bytes with values 1 and 2 would be "\u0001\u0002" - Locutus's ord handles this correctly) or you can pack two bytes together ("\u0102" or "\u0201" - Locutus's ord is not built for this). They're both valid choices.
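To make the two layouts concrete, here is a minimal sketch using plain `charCodeAt` rather than Locutus itself. The helper `unpackBigEndian` is hypothetical and exists only for illustration.

```javascript
const bytes = [1, 2];

// Layout 1: one byte per 16-bit code unit. Half of every element is unused,
// but each byte can be read back directly with charCodeAt.
const onePerUnit = String.fromCharCode(...bytes); // "\u0001\u0002"
const readBack = [onePerUnit.charCodeAt(0), onePerUnit.charCodeAt(1)]; // [1, 2]

// Layout 2: two bytes packed into one code unit. Compact, but the byte order
// has to be agreed on and each element must be split to get the bytes back.
const packedBE = String.fromCharCode((bytes[0] << 8) | bytes[1]); // "\u0102"
const packedLE = String.fromCharCode((bytes[1] << 8) | bytes[0]); // "\u0201"

// Hypothetical helper: split each code unit into high byte, then low byte.
function unpackBigEndian(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    out.push(unit >> 8, unit & 0xff);
  }
  return out;
}

console.log(readBack, unpackBigEndian(packedBE)); // [ 1, 2 ] [ 1, 2 ]
```

Reading `packedBE.charCodeAt(0)` directly gives 258, not a byte value, which is why a byte-oriented `ord` cannot be applied to the packed layout as-is.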
To answer your question of what you're doing wrong, it's probably that you're storing binary data in JavaScript strings. They're just not made for it. I recommend looking into using Uint8Array instead.
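As a rough sketch of that recommendation: indexing a `Uint8Array` yields values in the 0-255 range directly, which is what PHP's ord reports for each byte. The sample bytes are taken from the PHP log above. The `latin1` round trip is an assumption about a Node.js environment (it relies on Node's `Buffer`) and is only needed if some existing code insists on a string.

```javascript
// A few of the bytes from the PHP log, held as bytes rather than as text.
const data = Uint8Array.from([0x37, 0x33, 0x3d, 0x9c, 0xcb, 0xa7]);

// Each element is already an integer between 0 and 255; no ord needed.
const values = Array.from(data);
console.log(values.join(" ")); // prints: 55 51 61 156 203 167

// Node.js only (assumption): if a string is unavoidable, "latin1" maps every
// byte to the code unit with the same value, so charCodeAt stays in 0-255.
const binaryString = Buffer.from(data).toString("latin1");
const fromString = Array.from(
  { length: binaryString.length },
  (_, i) => binaryString.charCodeAt(i)
);
console.log(fromString.join(" ")); // prints: 55 51 61 156 203 167
```

In Node.js, reading a file or socket without specifying an encoding already returns a `Buffer`, which is a `Uint8Array` subclass, so the bytes never have to pass through a text decoder.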