Ygg01 / xml5_draft

Draft for the XML5 proposal.

Comments containing > character.

emkw opened this issue

commented

Regarding interoperability of XML5 with XML 1.0 and 1.1: comments containing > characters in the middle are valid in XML 1.0 and 1.1, so the change introduced in 2caadd3 may break compatibility with them.

This example appears in both the XML 1.0 and 1.1 specs as a valid comment:

<!-- declarations for <head> & <body> -->

Was the error on the > character in comments introduced to overcome some class of bugs?
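
The claim that this comment is well-formed XML can be checked with any conforming XML 1.0 parser. Below is a minimal sketch using Python's standard `xml.dom.minidom` (backed by expat). The `<root>` wrapper element and the helper name `comment_texts` are assumptions added for illustration, since a document needs a root element to be well-formed.

```python
from xml.dom import minidom

# The spec's example comment, wrapped in a hypothetical <root> element
# because a well-formed document must have exactly one root element.
SAMPLE = "<root><!-- declarations for <head> & <body> --></root>"


def comment_texts(xml_source):
    """Return the text of every comment that is a direct child of the root."""
    doc = minidom.parseString(xml_source)
    return [
        node.data
        for node in doc.documentElement.childNodes
        if node.nodeType == node.COMMENT_NODE
    ]


# Parsing succeeds, and the <, > and & inside the comment are kept verbatim.
print(comment_texts(SAMPLE))
```

If the parser rejected > inside comments, `parseString` would raise an `ExpatError` instead of returning the comment text.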

commented

Hm, a valid point. I based XML 5.0 on the HTML spec for compatibility. However, this seems like a case where it might be better to comply with the XML spec.

Anyway, I will be off on vacation, so I'll see about updating this, the tests, and the parser afterwards.

@annevk got any ideas/comments about this issue?

I think we should not be incompatible with XML. We can only "innovate" when the XML would not be well-formed anyway.

commented

Well, I looked into it, and while it's easy to remove that rule, it changes how we handle parse errors in comments. E.g.

<!--> becomes ParseErr, Comment(">")
<!-->test becomes ParseErr, Comment(">test")

while

<!> remains ParseErr, Comment("")

One solution to this "problem" is to recognize that, after <!--, any > ends the comment, but then comments like <!-->>Text --> aren't valid.

To get back to the OP’s question:

This example appears in both the XML 1.0 and 1.1 specs as a valid comment:

<!-- declarations for <head> & <body> -->

Was the error on the > character in comments introduced to overcome some class of bugs?

About the following response:

Hm, a valid point. I based XML 5.0 on the HTML spec for compatibility. However, this seems like a case where it might be better to comply with the XML spec.

The comment <!-- declarations for <head> & <body> --> is actually also a conforming comment per the HTML spec. As far as I understand, it has always been a conforming HTML comment. So this isn’t a case where the requirements in XML are different from HTML, right?

commented

As far as I understand, it has always been a conforming HTML comment. So this isn’t a case where the requirements in XML are different from HTML, right?

Unsure. It seems like the spec changed somewhat in the meantime (I remember CommentDash being in the spec and now it's not?!) or my mind is playing tricks on me. I'll add new comment states and see how it goes. This would be the ideal solution.

It seems like the spec changed somewhat in the meantime (I remember CommentDash being in the spec and now it's not?!) or my mind is playing tricks on me.

Yeah, two changes made recently:

But neither of those relates to the <!-- declarations for <head> & <body> --> case (and incidentally neither changes how those cases are actually parsed; instead both only affect the optional “parse error”-reporting behavior).

commented

Yeah. This is a big error either way. It's both non-compliant with XML and different from the way HTML5 does things.

The patch is on my machine, and I'll update it later this evening.

Note that XML 1.0 allows: <!-- comment with > and ending with <!-->.

It’s only -- that’s not allowed in XML 1.0 comments.
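
The XML 1.0 rule referred to here is production [15], `Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'`. In practice this means the comment body may not contain `--` and may not end with `-`. A rough sketch of that check follows; the function name `is_xml10_comment` is hypothetical, and the sketch deliberately ignores the `Char` range restriction on control characters.

```python
def is_xml10_comment(text):
    """Check a complete comment string against XML 1.0 production [15].

    Simplification (assumption): the Char production's exclusion of
    certain control characters is not checked here.
    """
    # The shortest possible comment is "<!---->" (7 characters, empty body).
    if len(text) < 7 or not text.startswith("<!--") or not text.endswith("-->"):
        return False
    body = text[4:-3]
    # "--" is forbidden inside the body, and the body may not end with "-"
    # (otherwise the comment would close with "--->").
    return "--" not in body and not body.endswith("-")


# Both examples from this thread are accepted: > is fine anywhere in the body.
print(is_xml10_comment("<!-- declarations for <head> & <body> -->"))
print(is_xml10_comment("<!-- comment with > and ending with <!-->"))
# A double hyphen inside the body is the only thing rejected.
print(is_xml10_comment("<!-- a -- b -->"))
```

Note that under this rule `<!-->` on its own is not a complete comment, since the opening `<!--` and closing `-->` cannot share their hyphens.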