mischov / meeseeks

An Elixir library for parsing and extracting data from HTML and XML with CSS or XPath selectors.

Attribute values that contain double quotes incorrectly render to HTML

mischov opened this issue

Problem

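As described in this issue, Meeseeks.html incorrectly renders single-quoted attribute values that contain double quotes. It wraps the value in double quotes without accounting for the double quotes inside it, so the rendered HTML no longer re-parses to the same attribute, as the session below shows.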

iex> import Meeseeks.CSS
Meeseeks.CSS

iex> html1 = "<tag attr='one \"two\" three'>"
"<tag attr='one \"two\" three'>"

iex> html2 = Meeseeks.one(html1, css("tag")) |> Meeseeks.html()
"<tag attr=\"one \"two\" three\"></tag>"

iex> Meeseeks.one(html2, css("tag")) |> Meeseeks.html()
"<tag attr=\"one \" two\"=\"\" three\"=\"\"></tag>"

Solution

Implement the fix proposed by @grych: when rendering to HTML, check whether an attribute value contains double quotes and, if it does, surround the value with single quotes instead of double quotes.
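
The quote selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from the actual fix: the module AttrQuoteSketch and the function render_attribute/2 are hypothetical names and are not part of Meeseeks. The final fallback, which escapes double quotes as &quot; when a value contains both kinds of quote, is an assumption added for completeness and is not something this issue specifies.

```elixir
defmodule AttrQuoteSketch do
  # Hypothetical helper, not part of Meeseeks: renders a single
  # name/value pair as an HTML attribute string.
  def render_attribute(name, value) when is_binary(name) and is_binary(value) do
    cond do
      # No double quotes in the value: double-quote it as usual.
      not String.contains?(value, "\"") ->
        ~s(#{name}="#{value}")

      # The value contains double quotes but no single quotes:
      # surround it with single quotes instead.
      not String.contains?(value, "'") ->
        ~s(#{name}='#{value}')

      # Assumption, not covered by the issue: the value contains both
      # kinds of quote, so escape the double quotes and double-quote it.
      true ->
        escaped = String.replace(value, "\"", "&quot;")
        ~s(#{name}="#{escaped}")
    end
  end
end
```

With the value from the example in the Problem section:

iex> AttrQuoteSketch.render_attribute("attr", "one \"two\" three")
"attr='one \"two\" three'"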

Fixed in 09a1cb5.

Released in v0.9.0.