w3c / qt3tests

Tests for XPath and XQuery

fn/matches.re.xml: re00984 unicode-version

zadean opened this issue

Test re00984 checks a large number of code points against the \w escape.
The characters ⌈ (U+2308) and ⌉ (U+2309) are in this list. These code points were moved from \p{S} to \p{P} in Unicode version 6.3, so from that version on they no longer match \w.
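
To illustrate the version dependence, here is a minimal Python sketch, not part of the test suite. It assumes the XSD definition of \w as "any character except those in \p{P}, \p{Z} or \p{C}". The helper name `in_xsd_word_class` is made up for this example. Python's `unicodedata` module ships both its current tables and a frozen Unicode 3.2 copy (`ucd_3_2_0`), which stands in here for a pre-6.3 database.

```python
import unicodedata

def in_xsd_word_class(ch, db=unicodedata):
    # Hypothetical helper: approximates the XSD \w escape as
    # "general category is not P (punctuation), Z (separator) or C (other)".
    return db.category(ch)[0] not in "PZC"

old_db = unicodedata.ucd_3_2_0  # frozen Unicode 3.2 tables, i.e. before 6.3

for ch in ("\u2308", "\u2309"):  # LEFT CEILING, RIGHT CEILING
    print(
        f"U+{ord(ch):04X}",
        "| Unicode 3.2:", old_db.category(ch), in_xsd_word_class(ch, old_db),
        f"| Unicode {unicodedata.unidata_version}:",
        unicodedata.category(ch), in_xsd_word_class(ch),
    )
```

On a Python build whose Unicode tables are 6.3 or later, the two characters should show a symbol category under the old tables and a punctuation category under the current ones, which is exactly the difference that makes re00984 version-sensitive.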

Perhaps the test should include the "unicode-version" dependency flag for version "6.2"?

@michaelhkay You make a very good point, and a separate test for the reclassified characters is definitely the better answer.

I took a quick look through the release notes for the Unicode updates since 6.3 and found only a few more category changes, none of which seem to break anything in the test suite as it currently stands.

Just a side note:
It may also be of interest to "modernize" a bit by adding some of the newer emoji/emoticon code points to the \p{So} tests (re00169 & re00207). I imagine they are showing up in real data, so adding them would add value to the test cases. Not that this suite is a Unicode test suite, but just a few would show some level of compliance for the newer characters. But that is something for a different issue.
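
As a rough sketch of how candidates for such tests could be screened, the following Python snippet keeps only the code points that a given Unicode database classifies as \p{So}. The candidate list and the helper name `so_candidates` are purely illustrative assumptions, not a proposal for the actual test content.

```python
import unicodedata

# Illustrative candidates only: GRINNING FACE, PILE OF POO, ROCKET.
CANDIDATES = ["\U0001F600", "\U0001F4A9", "\U0001F680"]

def so_candidates(chars, db=unicodedata):
    # Hypothetical helper: keep the characters whose general category
    # in the given database is So (Symbol, other).
    return [c for c in chars if db.category(c) == "So"]

# Current tables versus the frozen Unicode 3.2 tables bundled with Python,
# where these code points are still unassigned.
print([f"U+{ord(c):04X}" for c in so_candidates(CANDIDATES)])
print([f"U+{ord(c):04X}" for c in so_candidates(CANDIDATES, unicodedata.ucd_3_2_0)])
```

Because newly assigned characters are unassigned in older databases, any test built from them would presumably need a "unicode-version" dependency as well.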