Support Unicode JavaScript Source for JSExtension
What steps will reproduce the problem?
>>> import PyV8
>>> PyV8.JSExtension('test', u';')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Boost.Python.ArgumentError: Python argument types in
    JSExtension.__init__(JSExtension, str, unicode)
did not match C++ signature:
    __init__(_object*, std::string name, std::string source, boost::python::api::object callback=None, boost::python::list dependencies=[], bool register=True)
What is the expected output? What do you see instead?
Support similar to JSContext:
>>> with PyV8.JSContext() as ctx: ctx.eval(u";")
...
>>>
What version of the product are you using? On what operating system?
python-pyv8 (1.0-~svn470+13384)
Linux zunca 3.5.0-22-generic #34-Ubuntu SMP Tue Jan 8 21:47:00 UTC 2013 x86_64
x86_64 x86_64 GNU/Linux
Perhaps this is a duplicate of #75
Original issue reported on code.google.com by e.generalov
on 18 Jan 2013 at 3:59
[deleted comment]
Original comment by flier...@gmail.com
on 19 Jan 2013 at 1:25
- Changed state: Accepted
- Added labels: OpSys-All
I have added JSExtension support for a unicode name and source after SVN trunk
r472, but v8 itself only supports ASCII extension sources. So even though pyv8
now accepts a unicode name and source, v8 will fail if the source actually
contains non-ASCII characters :(
Please submit an issue to the v8 project if you really need Unicode support in
extensions.
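
Since the failure only shows up once the source actually contains non-ASCII characters, it may help to check a source string before registering it. The following is a minimal sketch in Python 3, not part of pyv8; the helper name `find_non_ascii` is hypothetical, and it only inspects the string without calling into pyv8 or v8.

```python
def find_non_ascii(source):
    """Return (index, character) pairs for every non-ASCII character in source.

    Hypothetical helper for illustration; it is not a pyv8 API.
    """
    return [(i, ch) for i, ch in enumerate(source) if ord(ch) > 0x7f]


# An empty result suggests the source is plain ASCII.
print(find_non_ascii(u';'))
print(find_non_ascii(u'var s = "\u00e9";'))
```
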
Thanks
Original comment by flier...@gmail.com
on 19 Jan 2013 at 2:51
- Changed state: Fixed
As a workaround, we could escape non-ASCII characters with JavaScript escape
sequences before passing the source to JSExtension:
ext = JSExtension(name, js_escape_unicode(jsource))

# Python 2 code: `str` is a byte string here and `unicode` is the text type.
import re

ESCAPABLE = re.compile(r'([^\x00-\x7f])')  # any non-ASCII character
HAS_UTF8 = re.compile(r'[\x80-\xff]')      # any high byte in a byte string

def _js_escape_unicode_re_callback(match):
    s = match.group(0)
    n = ord(s)
    if n < 0x10000:
        return r'\u%04x' % (n,)
    else:
        # surrogate pair
        n -= 0x10000
        s1 = 0xd800 | ((n >> 10) & 0x3ff)
        s2 = 0xdc00 | (n & 0x3ff)
        return r'\u%04x\u%04x' % (s1, s2)

def js_escape_unicode(s):
    """Return an ASCII-only representation of a JavaScript string"""
    if isinstance(s, str):
        if HAS_UTF8.search(s) is None:
            # pure ASCII byte string, nothing to escape
            return s
        s = s.decode('utf-8')
    return str(ESCAPABLE.sub(_js_escape_unicode_re_callback, s))
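
The helper above relies on Python 2 semantics (`str` being a byte string with a `decode` method). For anyone reading this on Python 3, here is a rough sketch of the same idea; the names `NON_ASCII`, `_escape_match` and `js_escape_unicode_py3` are made up for this example, and it assumes byte input is UTF-8 encoded.

```python
import re

# Matches any single character outside the 7-bit ASCII range.
NON_ASCII = re.compile(r'[^\x00-\x7f]')


def _escape_match(match):
    """Turn one non-ASCII character into JavaScript \\uXXXX escape(s)."""
    n = ord(match.group(0))
    if n < 0x10000:
        return '\\u%04x' % n
    # Characters beyond the BMP are written as a UTF-16 surrogate pair.
    n -= 0x10000
    high = 0xd800 | ((n >> 10) & 0x3ff)
    low = 0xdc00 | (n & 0x3ff)
    return '\\u%04x\\u%04x' % (high, low)


def js_escape_unicode_py3(source):
    """Return an ASCII-only version of JavaScript source text.

    Assumption: bytes input is UTF-8 encoded.
    """
    if isinstance(source, bytes):
        source = source.decode('utf-8')
    return NON_ASCII.sub(_escape_match, source)


print(js_escape_unicode_py3('var s = "\u00e9";'))
```

Note that the escaped form is only equivalent where JavaScript interprets `\uXXXX` sequences, such as string literals; non-ASCII text in other positions may behave differently after escaping.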
Original comment by e.generalov
on 21 Jan 2013 at 6:36
Thanks for your patch. I have moved the unicode support from the C++ side to
the Python side; please verify it with SVN trunk code after r477.
Original comment by flier...@gmail.com
on 2 Feb 2013 at 2:03