Parsing a JSON variable in <script> leaves $domReplaceHelper values
ruud-altenburg opened this issue
What is this feature about (expected vs actual behaviour)?
I'm parsing a <script> variable containing JSON. Characters listed in $domReplaceHelper are apparently replaced with placeholders when the page is parsed, but they are not restored when the data is returned.
How can I reproduce it?
See example.txt for a real-world script plus my (slightly crude) code to extract the data.
Does it take minutes, hours or days to fix?
I suppose minutes.
Any additional information?
Forgot to add that the example returns "RBR Holt 00626 SIMPLE_HTML_DOM__VOKU__AMP RBR Holt 00732".
I added a test case for your problem here: 2e65479#diff-f9e35e3ee28495a595a36e0f7a4ae154R1454
The main problem is that we need a special internal encoding to preserve the input encoding, and that internal encoding then has to be decoded again via HtmlDomParser->fixHtmlOutput().
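To illustrate the round trip described here, the following is a minimal, language-neutral sketch in Python, not the library's actual PHP code. It assumes the placeholder seen in the report stands for "&"; the table `PLACEHOLDERS` and the functions `encode_internal` and `decode_internal` are hypothetical names invented for this example, and the real tokens in $domReplaceHelper may differ.

```python
import json

# Hypothetical placeholder table. The token is the one quoted in this issue;
# its mapping to "&" is an assumption made for illustration.
PLACEHOLDERS = {"&": "SIMPLE_HTML_DOM__VOKU__AMP"}


def encode_internal(text):
    """Swap sensitive characters for placeholder tokens before parsing."""
    for original, token in PLACEHOLDERS.items():
        text = text.replace(original, token)
    return text


def decode_internal(text):
    """Restore the original characters; this is the step that was skipped."""
    for original, token in PLACEHOLDERS.items():
        text = text.replace(token, original)
    return text


# A made-up script payload shaped like the value quoted in the report.
raw = '{"name": "RBR Holt 00626 & RBR Holt 00732"}'
encoded = encode_internal(raw)

# Reading the data without decoding leaks the placeholder token.
leaked = json.loads(encoded)["name"]

# Decoding first returns the text the page author wrote.
restored = json.loads(decode_internal(encoded))["name"]

print(leaked)
print(restored)
```

The point of the sketch is only that every path returning data to the caller has to run the decode step; in the library that role is played by HtmlDomParser->fixHtmlOutput().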