Comments containing > character.
emkw opened this issue
Regarding interoperability of XML5 with XML 1.0 and 1.1: comments containing > chars in the middle are valid in XML 1.0 and 1.1, so the change introduced in 2caadd3 may break them.
This example is in both XML 1.0 and 1.1 spec, as a valid comment:
<!-- declarations for <head> & <body> -->
Was the error on the > character in comments introduced to overcome some class of bugs?
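For reference, the claim that the spec example is well-formed can be checked against an existing XML 1.0 parser. The following is a minimal sketch using Python's standard-library parser (backed by expat); the `<root>` wrapper element and the helper name `is_well_formed` are assumptions added for illustration, not part of the XML5 draft or its test suite.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the stdlib XML parser accepts `doc` without error."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The comment from the XML 1.0 / 1.1 specs, wrapped in a hypothetical root element.
spec_example = "<root><!-- declarations for <head> & <body> --></root>"

# A comment containing "--" in its body, which XML 1.0 does not allow.
double_dash = "<root><!-- a -- b --></root>"

print(is_well_formed(spec_example))
print(is_well_formed(double_dash))
```

If the parser accepts the first document and rejects the second, that matches the reading that > is harmless inside a comment while -- is the only forbidden sequence.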
Hm a valid point. I based XML 5.0 to be compatible with HTML spec. However this seems like a case where it might be better to be more compliant with XML spec.
Anyway, I will be off on vacation, so I'll look into updating this, the tests, and the parser.
@annevk got any ideas/comments about this issue?
I think we should not be incompatible with XML. We can only "innovate" when the XML would not be well-formed anyway.
Well, I looked into it, and while it's easy to remove that rule, it changes how we parse errors in comments. E.g.
<!--> becomes ParseErr, Comment(">")
<!-->test becomes ParseErr, Comment(">test")
while
<!> remains ParseErr, Comment("")
One solution to this "problem" is to recognize that after <!-- any > ends the comment, but then comments like <!-->>Text --> aren't valid.
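To make the trade-off concrete, here is a toy sketch of the two behaviours discussed above. It is not the XML5 tokenizer: the function name `tokenize_comment`, the flag `gt_after_open_ends`, and the token representation are all hypothetical, and the sketch ignores other comment errors (such as -- inside the body). It assumes the proposed rule means "a > immediately after <!-- closes the comment".

```python
def tokenize_comment(src, gt_after_open_ends=False):
    """Tokenize one comment-like construct at the start of `src`.

    Returns (tokens, rest), where tokens is a list containing "ParseErr"
    markers and ("Comment", data) tuples, and rest is the unconsumed input.
    """
    if src.startswith("<!>"):
        # Bogus empty comment: same result under either rule.
        return ["ParseErr", ("Comment", "")], src[3:]
    if not src.startswith("<!--"):
        raise ValueError("input must start with '<!--' or '<!>'")

    body = src[4:]
    if gt_after_open_ends and body.startswith(">"):
        # Proposed rule: '>' right after '<!--' closes an empty comment.
        return ["ParseErr", ("Comment", "")], body[1:]

    end = body.find("-->")
    if end == -1:
        # No closing delimiter: everything up to EOF becomes comment data.
        return ["ParseErr", ("Comment", body)], ""
    return [("Comment", body[:end])], body[end + 3:]


for text in ["<!-->", "<!-->test", "<!>", "<!-->>Text -->"]:
    print(text, tokenize_comment(text), tokenize_comment(text, True))
```

With the flag off, `<!-->>Text -->` is an ordinary comment; with the flag on, it is cut short after the first > and the remainder is left as text, which is the incompatibility described above.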
To get back to the OP’s question:
This example is in both XML 1.0 and 1.1 spec, as a valid comment:
<!-- declarations for <head> & <body> -->
Was the error on the > character in comments introduced to overcome some class of bugs?
About the following response:
Hm a valid point. I based XML 5.0 to be compatible with HTML spec. However this seems like a case where it might be better to be more compliant with XML spec.
The comment <!-- declarations for <head> & <body> -->
is actually also a conforming comment per the HTML spec. As far as I understand, it has always been a conforming HTML comment. So this isn’t a case where the requirements in XML are different from HTML, right?
As far as I understand, it has always been a conforming HTML comment. So this isn’t a case where the requirements in XML are different from HTML, right?
Unsure. It seems like the spec changed somewhat in the meantime (I remember CommentDash being in the spec and now it's not?!) or my mind is playing tricks on me. I'll add new comment states and see how it goes. This would be the ideal solution.
It seems like the spec changed somewhat in the meantime (I remember CommentDash being in the spec and now it's not?!) or my mind is playing tricks on me.
Yeah, two changes made recently:
But neither of those relates to the <!-- declarations for <head> & <body> --> case (and incidentally neither changes how those cases are actually parsed; instead both only affect the optional “parse error”-reporting behavior).
Yeah. This is a big error either way. It's both non-compliant with XML and different from the way HTML5 does things.
The patch is on my machine, and I'll update it later this evening.
Note that XML 1.0 allows: <!-- comment with > and ending with <!-->
It’s only -- that’s not allowed in XML 1.0 comments.
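The XML 1.0 grammar for comments (production [15]) is small enough to express as a regular expression, which makes the point above easy to test. This is a sketch under one stated simplification: any character other than - is treated as a legal Char, so control characters that XML 1.0 excludes are not rejected here. The names `XML10_COMMENT` and `is_xml10_comment` are made up for this example.

```python
import re

# XML 1.0 production [15]:
#   Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
# Simplification (assumption): every character other than '-' counts as Char.
XML10_COMMENT = re.compile(r"<!--(?:[^-]|-[^-])*-->")


def is_xml10_comment(text: str) -> bool:
    """Return True if `text` is exactly one comment per the production above."""
    return XML10_COMMENT.fullmatch(text) is not None


samples = [
    "<!-- comment with > and ending with <!-->",  # '>' and '<!' are fine
    "<!-- declarations for <head> & <body> -->",  # example from the XML spec
    "<!-- a -- b -->",                            # '--' inside the body
    "<!-- B+, B, and B+++ --->",                  # body ends with '-'
]
for sample in samples:
    print(sample, is_xml10_comment(sample))
```

The last sample is the ill-formed example from the XML 1.0 spec: a body ending in - puts -- directly before the closing >, so it falls under the same restriction.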