Drop http://www.w3.org/1999/02/22-rdf-syntax-ns#namedNode ?
RubenVerborgh opened this issue · comments
The URI does not resolve, and it's a long string comparison to be performed. Suggestion: just leave empty.
And for blank nodes, use a single char like _ or so.
Yeah, this basically is a published version of an experimental internal model, I found it disappointing that I couldn't find them on LOV.
We want to have some kind of IRI though (which should indeed resolve), since that lets the parser simply switch on datatype unconditionally. AFAIK (most) JS engines use interned strings, so the compare should come down to a simple reference compare.
Well, an identical series of characters parsed n times will still occupy n times the space. Interning doesn’t reconcile those. So if it must be a URI (and I don’t see why that would provide an advantage, certainly not memory- or speed-wise), perhaps make it a short URN.
You can still switch unconditionally, just much faster.
I agree that the current URL should be dropped, as it does not resolve.
In most RDF serialization formats, the default datatype for literals is String. If HexTuples would use the NamedNode (or the URI) as the default datatype, it should always serialize strings with xsd:string. That way, NamedNode Tuple statements do not need an IRI in the datatype.
If it really needs an IRI, I suggest linking to a NamedNode concept in this very spec - it would be a sensible place to resolve to. Or maybe the URI spec?
xsd:anyURI is a good candidate replacement for rdf:namedNode
No, that’s a URI, not the node named by that URI.
Perhaps I'm misunderstanding, but given [s, p, v, dt, l, g]:
> that’s a URI
rdf:namedNode is currently in the dt position, which indicates the datatype of v.
> not the node named by that URI
No, that'd be the v position.
So together:
[s, p, "schema.org/name", "http://www.w3.org/2001/XMLSchema#anyURI", l, g]
Though rereading that part of a spec more closely, they allow them to be relative, which might pose a problem
There's a difference between
<http://example.org>
"http://example.org"^^<http://www.w3.org/2001/XMLSchema#anyURI>
Both exist and they are not the same.
Even "foo"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#namedNode> exists (despite the URI not resolving).
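The distinction can be sketched in JavaScript. This is only an illustration: the plain-object term shape and the sameTerm helper are assumptions made for this example, not part of HexTuples or of any particular library.

```javascript
const XSD_ANY_URI = 'http://www.w3.org/2001/XMLSchema#anyURI';

// Hypothetical plain-object terms (an assumed shape, for illustration only).
// The named node IS the resource identified by the IRI.
const node = { termType: 'NamedNode', value: 'http://example.org' };

// The literal is a value whose lexical form happens to look like an IRI.
const literal = {
  termType: 'Literal',
  value: 'http://example.org',
  datatype: XSD_ANY_URI,
};

// Hypothetical equality check: two terms are the same only if their kind,
// lexical value and datatype (if any) all match.
function sameTerm(a, b) {
  return a.termType === b.termType &&
    a.value === b.value &&
    (a.datatype || '') === (b.datatype || '');
}

console.log(sameTerm(node, literal));
```

Both terms carry the same string, yet they compare as different terms because their kinds differ.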
So I relaunch my suggestion of "" and "_", which will make a tremendous performance difference. Bonus: you can then even switch on dt.length and not on its contents, because no valid URIs of length 0 and 1 exist.
> There's a difference between <http://example.org> and "http://example.org"^^<http://www.w3.org/2001/XMLSchema#anyURI>
Hmm, seems bizarre to me. I intuitively figured that, since 'everything is a resource', literals are just a convenient way to point to certain resources which lack a uri space / are too complex for the uri spec. Thinking of the datatype iri as the 'scheme' and the value as the 'path'.
I'll go and rethink some things ;)
Here's a quick performance test: https://gist.github.com/RubenVerborgh/1b70a456230027468a715b54afb59242
On my machine:
- 2.5M triples with full URIs for dt: 4.6s
- 2.5M triples with single chars for dt: 2.8s
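For readers who want to try a comparison of this kind themselves, here is a rough harness. It is not the code from the linked gist: the line format (one JSON array per line), the helper names and the sample size are all assumptions, and timings will vary by machine and engine.

```javascript
// Hypothetical micro-benchmark; NOT the code from the linked gist.
const RDF_NAMED_NODE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#namedNode';

// Build n identical serialized tuples whose dt field is `marker`.
function makeLines(n, marker) {
  const line = JSON.stringify([
    'http://example.org/s', 'http://example.org/p', 'http://example.org/o',
    marker, '', '',
  ]);
  return new Array(n).fill(line);
}

// Parse every line and count how many tuples carry the named-node marker.
// Each JSON.parse call yields a freshly parsed dt string.
function countNamedNodes(lines, marker) {
  let count = 0;
  for (const line of lines) {
    const dt = JSON.parse(line)[3];
    if (dt === marker) count++;
  }
  return count;
}

// Run fn once and report elapsed wall-clock time in milliseconds.
function time(label, fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(0)} ms`);
  return result;
}

const N = 100000; // assumed sample size, much smaller than the 2.5M above
const fullCount = time('full URI for dt', () =>
  countNamedNodes(makeLines(N, RDF_NAMED_NODE), RDF_NAMED_NODE));
const shortCount = time('empty string for dt', () =>
  countNamedNodes(makeLines(N, ''), ''));
```

Both runs count the same number of tuples; only the length of the dt string that has to be parsed and compared differs.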
> literals are just a convenient way to point to certain resources which lack a uri space / are too complex for the uri spec
*Or vice versa: URIs are used to determine points in irregularly defined spaces
> Hmm, seems bizarre to me.
See "http://example.org"^^<http://www.w3.org/2001/XMLSchema#anyURI> as a shortcut for
_:x a Literal.
_:x _:value "http://example.org".
_:x _:dataType <http://www.w3.org/2001/XMLSchema#anyURI>.
The syntax in fact hints at this interpretation, with ^ representing reverse path traversal in N3.
Full answer in https://www.w3.org/TR/rdf11-mt/
Closed in https://github.com/ontola/hextuples-parser/releases/tag/v2.0.0
Has been replaced with globalId and localId respectively