locutusjs / locutus

Bringing stdlibs of other programming languages to JavaScript for educational purposes

Home Page: https://locutus.io

Unserialize encoding

gartisk opened this issue

PHP serialized data:
'"a:6:{s:8:"filename";s:61:"XPTO - xpto - 3�� etc (08-24-15-10-42-44).jpg";s:8:"mimetype";b:0;s:8:"contents";s:0:"";s:10:"upload_dir";b:1;s:10:"upload_url";s:51:"http://example.example/wp-content/uploads/2015/09/";s:4:"size";i:485758;}"'

When I pass this to the JavaScript unserialize function, it throws:
SyntaxError: Unknown / Unhandled data type(s)

At that point, dtype is "p".

I have the same problem: it has trouble processing UTF-8 strings.

I have a utf8_encoded serialized string, so I tried decoding it first, but unserialize still fails on the same string.

Thanks for reporting, and sorry for the long wait. This project and my motivation were in a bad spot. I recently did a lot of work to breathe new life into it (http://locutus.io/2016/05/announcing-locutus/).

It seems our unserialize function needs some love. @kukawski I noticed you self-assigned this one in #190, did you have plans to attack it?

I found the reason it doesn't work: the UTF-8 byte length is not computed correctly (I tested with 'ü').
I tried a few strings with this function that I found, and it worked:
http://stackoverflow.com/questions/5515869/string-length-in-bytes-in-javascript#answer-23329386

function utf8Overhead(str) {
  // Returns the UTF-8 byte length of str minus 1, i.e. the extra bytes
  // beyond one byte when str is a single character.
  var s = str.length;
  for (var i = str.length - 1; i >= 0; i--) {
    var code = str.charCodeAt(i);
    if (code > 0x7f && code <= 0x7ff) s++;
    else if (code > 0x7ff && code <= 0xffff) s += 2;
    if (code >= 0xDC00 && code <= 0xDFFF) i--; // trail surrogate: skip its lead surrogate
  }
  return s - 1;
}
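
For anyone following along, here is a minimal sketch of why the length matters. PHP writes the string length in bytes (s:2 for 'ü' in UTF-8), while JavaScript's length counts UTF-16 code units, so slicing by character count reads too far. The helpers utf8ByteLength and readByBytes are hypothetical names made up for this illustration; they are not the locutus implementation and not necessarily what the pull request does.

```javascript
// Hypothetical helper (not part of locutus): UTF-8 byte length of a JS string.
function utf8ByteLength(str) {
  let bytes = 0;
  // for...of iterates by code point, so surrogate pairs are counted once.
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp <= 0x7f) bytes += 1;
    else if (cp <= 0x7ff) bytes += 2;
    else if (cp <= 0xffff) bytes += 3;
    else bytes += 4;
  }
  return bytes;
}

// Hypothetical helper: collect characters from `data`, starting at `offset`,
// until `byteCount` UTF-8 bytes have been consumed.
function readByBytes(data, offset, byteCount) {
  let out = '';
  let consumed = 0;
  let i = offset;
  while (consumed < byteCount && i < data.length) {
    const ch = String.fromCodePoint(data.codePointAt(i));
    consumed += utf8ByteLength(ch);
    out += ch;
    i += ch.length; // 2 for a surrogate pair, otherwise 1
  }
  return out;
}

// What PHP's serialize('ü') produces when the source is UTF-8.
const serialized = 's:2:"ü";';
const start = serialized.indexOf('"') + 1;

// Slicing by the declared length as a character count overshoots by one.
const naive = serialized.slice(start, start + 2);
// Counting bytes stops after the two bytes of 'ü'.
const byteAware = readByBytes(serialized, start, 2);

console.log(JSON.stringify(naive), JSON.stringify(byteAware));
```

The naive slice swallows the closing quote, which leaves the parser misaligned for everything that follows. That is consistent with the parser later hitting an unexpected type character.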

I also tried the string from the original report, but even PHP couldn't parse it.

I will create a pull request now.

Here's my pull request: #310