w3c / N3

W3C's Notation 3 (N3) Community Group

Be more specific about required regex functionality of string:matches, string:notMatches, and string:scrape

rybesh opened this issue · comments

The documentation for string:matches, string:notMatches, and string:scrape currently states that a regex argument is “a regular expression in the perl, python style.” However, this is not specific enough to ensure that one can write portable N3 rules that use these built-ins. There should be a specific list of regex features that implementations of these built-ins MUST support (even if some implementations support features beyond this guaranteed set).

(This issue is inspired by the recent changes to eye which broke uses of these built-ins that expected support of Perl Compatible Regular Expressions.)

@rybesh I think that "Perl-style" regular expressions are the closest thing to a regex standard that is used in practice.
But, as you point out, the issue is that N3 implementations will simply delegate to whatever regex engine is available in the host language (Prolog, a C library, Java, ...), since regex matching is beyond the scope of the reasoning task itself. E.g., Java's regex engine is missing many features, but jen3 simply delegates to it.
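To make the portability problem concrete, here is a minimal sketch in Python, chosen only for illustration since no N3 implementation in this thread is said to use it. The helper name `compiles_in_host_engine` is hypothetical. It shows that a pattern relying on the Perl/PCRE `\K` escape is rejected by Python's built-in `re` module, while a pattern limited to basic constructs is accepted.

```python
import re


def compiles_in_host_engine(pattern: str) -> bool:
    """Return True if Python's built-in `re` module accepts the pattern.

    Hypothetical helper: it stands in for whatever host-language engine
    an N3 implementation delegates to.
    """
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Uses only anchors, groups, a character-class escape, and counted quantifiers.
basic = r"^(\d{4})-(\d{2})$"

# `\K` (reset the start of the match) is a Perl/PCRE feature that
# Python's `re` module treats as an invalid escape.
pcre_only = r"foo\Kbar"

print(compiles_in_host_engine(basic))
print(compiles_in_host_engine(pcre_only))
```

A rule written against the second pattern would work only on implementations whose host engine happens to support `\K`, which is the kind of breakage a guaranteed feature list would prevent.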

What "core" regex features would you enforce (as a kind of lowest common denominator)?

I would suggest that it be specified to match the SPARQL REGEX function, i.e. to delegate to the XPath fn:matches function specification.

Presumably the languages in which N3 tools are being implemented also have SPARQL tools, so there ought to be opportunities for code reuse. And even if not, at least it gives a solid (though perhaps not lowest-common-denominator) definition of what to expect a regex function to handle.
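As a rough sketch of what aligning with fn:matches could look like, the Python snippet below approximates two aspects of its behaviour: the match is unanchored (any substring may match unless `^` or `$` is used), and flags are passed as a string of letters. The function name `approx_fn_matches` and the flag table are assumptions made for illustration. Python's `re` syntax is not identical to the XPath/XML Schema regex syntax, so this does not implement the specification.

```python
import re

# Assumed, approximate mapping from XPath flag letters to Python `re` flags.
# Only flags with a close Python counterpart are included.
_FLAG_MAP = {
    "s": re.DOTALL,      # "." also matches newline
    "m": re.MULTILINE,   # "^" and "$" match at line boundaries
    "i": re.IGNORECASE,  # case-insensitive matching
}


def approx_fn_matches(text: str, pattern: str, flags: str = "") -> bool:
    """Return True if any substring of `text` matches `pattern`.

    Hypothetical helper approximating fn:matches with Python's `re`;
    unknown flag letters are rejected rather than silently ignored.
    """
    py_flags = 0
    for letter in flags:
        if letter not in _FLAG_MAP:
            raise ValueError(f"unsupported flag: {letter!r}")
        py_flags |= _FLAG_MAP[letter]
    # re.search gives the unanchored, substring-style matching described above.
    return re.search(pattern, text, py_flags) is not None
```

For example, `approx_fn_matches("Hello World", "world", "i")` would be expected to succeed, while the same call without the `i` flag would not.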