as_rdf needs to escape certain characters
josephguillaume opened this issue
as_rdf.data.frame, write_nquads, normalize_table or poor_mans_nquads need to escape certain characters; otherwise rdf_parse, and therefore as_rdf, either returns an error or no content.
The characters to be escaped include at least double quotes in string literals and spaces in predicates.
Examples follow below. There are obvious solutions to these particular cases, but it's not clear to me what would be needed for the solutions to be generally applicable and not cause any regressions.
A multiple-word predicate silently fails to return any triples unless it is URL-encoded:
df <- data.frame(1,1)
names(df) <- c("id","multiple word predicate")
g <- as_rdf(df,prefix = "http://example.org#",key_column = "id")
g
# Total of 0 triples, stored in hashes
ntab<-rdflib:::normalize_table(df,key_column = "id")
ntab$predicate<-sapply(ntab$predicate,URLencode)
rdflib:::poor_mans_nquads(ntab,"temp.nquads",prefix="http://example.org#")
g<-rdf_parse("temp.nquads",format="nquads")
g
# Total of 1 triples, stored in hashes
#-------------------------------
# <http://example.org#1> <http://example.org#multiple%20word%20predicate> "1"^^<http://www.w3.org/2001/XMLSchema#decimal> .
# As a workaround, a user can URLencode the data.frame names
df <- data.frame(1,1)
names(df) <- c("id","multiple word predicate")
names(df) <- sapply(names(df),URLencode)
g <- as_rdf(df,prefix = "http://example.org#",key_column = "id")
g
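The workaround above could be wrapped in a small helper that leaves the key column alone and only encodes the names that become predicates. This is a minimal sketch, not part of rdflib: encode_predicate_names is a hypothetical name, and passing reserved = TRUE to utils::URLencode is an assumption about what a more general fix might want, since it also percent-encodes reserved characters such as # and /.

```r
# Hypothetical helper (not part of rdflib): percent-encode column names so
# that they form valid IRIs once the prefix is prepended.
encode_predicate_names <- function(df, key_column = NULL) {
  # Every column except the key column ends up as a predicate.
  is_predicate <- !(names(df) %in% key_column)
  names(df)[is_predicate] <- vapply(
    names(df)[is_predicate],
    utils::URLencode,
    character(1),
    reserved = TRUE,   # assumption: also encode reserved characters
    USE.NAMES = FALSE
  )
  df
}

df <- data.frame(1, 1)
names(df) <- c("id", "multiple word predicate")
names(encode_predicate_names(df, key_column = "id"))
# expected: "id" "multiple%20word%20predicate"
```

The result could then be passed to as_rdf as in the workaround. Whether encoding belongs in as_rdf itself or in normalize_table is left open here.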
A string containing double quotes silently fails to return any triples unless each quote is preceded by a backslash escape character (and the backslash itself needs to be escaped in R):
df <- data.frame(1,'string with "quotes"')
names(df) <- c("id","predicate")
g <- as_rdf(df,prefix = "http://example.org#",key_column = "id")
g
# Total of 0 triples, stored in hashes
ntab<-rdflib:::normalize_table(df,key_column = "id")
ntab$object<-gsub('"','\\"',ntab$object,fixed=TRUE)
rdflib:::poor_mans_nquads(ntab,"temp.nquads",prefix="http://example.org#")
g<-rdf_parse("temp.nquads",format="nquads")
g
# Total of 1 triples, stored in hashes
# -------------------------------
# <http://example.org#1> <http://example.org#predicate> "string with "quotes""^^<http://www.w3.org/2001/XMLSchema#string> .
# As a workaround, a user can replace quotes within the relevant columns of the data.frame
df <- data.frame(1,'string with "quotes"')
names(df) <- c("id","predicate")
df$predicate <- gsub('"','\\"',df$predicate,fixed=TRUE)
g <- as_rdf(df,prefix = "http://example.org#",key_column = "id")
g
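A somewhat more general version of this workaround might escape every character that the N-Quads grammar does not allow raw inside a quoted literal: the backslash, the double quote, line feed and carriage return. The sketch below is only an illustration under that assumption. escape_literal and escape_string_columns are hypothetical names, not rdflib functions, and only character columns are touched, so factor columns would need converting first.

```r
# Hypothetical helper (not part of rdflib): escape characters that may not
# appear raw inside a quoted N-Quads literal.
escape_literal <- function(x) {
  # Backslash goes first so the escapes added below are not doubled.
  x <- gsub("\\", "\\\\", x, fixed = TRUE)
  x <- gsub("\"", "\\\"", x, fixed = TRUE)  # double quote
  x <- gsub("\n", "\\n", x, fixed = TRUE)   # line feed
  x <- gsub("\r", "\\r", x, fixed = TRUE)   # carriage return
  x
}

# Apply the escaping to every character column except the key column.
escape_string_columns <- function(df, key_column = NULL) {
  for (col in setdiff(names(df), key_column)) {
    if (is.character(df[[col]])) {
      df[[col]] <- escape_literal(df[[col]])
    }
  }
  df
}

df <- data.frame(id = 1, predicate = 'string with "quotes"',
                 stringsAsFactors = FALSE)
df <- escape_string_columns(df, key_column = "id")
df$predicate
```

With fixed = TRUE the replacement string is inserted literally, which is why the backslashes only need R's own string escaping. If the escaping were later added inside rdflib, a user-side helper like this would have to be dropped to avoid escaping twice.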