CLARIAH / grlc

grlc builds Web APIs using shared SPARQL queries

Home Page: http://grlc.io

Support for multivalue parameters in request

jaw111 opened this issue

Some users are asking if it is possible to add support for looking up multiple values for a parameter in a single request.

For example, given the query:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX schema: <http://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?band ?album WHERE {
  ?band rdf:type dbo:Band .
  ?album rdf:type schema:MusicAlbum .
  ?band dbo:genre ?_genre_iri .
  ?album dbp:artist ?band .
} LIMIT 100

The user wants to be able to look up the bands and albums for multiple genres in one request, which is fairly easy to do in SPARQL with a VALUES clause:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX schema: <http://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?band ?album WHERE {
  ?band rdf:type dbo:Band .
  ?album rdf:type schema:MusicAlbum .
  ?band dbo:genre ?_genre_iri .
  ?album dbp:artist ?band .
} LIMIT 100
VALUES ?_genre_iri {
  <http://dbpedia.org/resource/Punk_music>
  <http://dbpedia.org/resource/Dance_music>
}

Would it be possible to extend grlc so that users can make such requests simply by adding more parameter-value pairs to the request URL? For example:

curl "http://grlc.io/api-git/CLARIAH/grlc-queries/defaults?genre=http%3A%2F%2Fdbpedia.org%2Fresource%2FPunk_music&genre=http%3A%2F%2Fdbpedia.org%2Fresource%2FRock_music" -H "accept: text/csv"
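
To illustrate what the server side would receive from such a URL, here is a minimal sketch using only the Python standard library. It is not grlc's actual request handling, just a demonstration that repeated `genre=` pairs can be read back as a list:

```python
from urllib.parse import urlsplit, parse_qs

# The request URL from the curl example above.
url = (
    "http://grlc.io/api-git/CLARIAH/grlc-queries/defaults"
    "?genre=http%3A%2F%2Fdbpedia.org%2Fresource%2FPunk_music"
    "&genre=http%3A%2F%2Fdbpedia.org%2Fresource%2FRock_music"
)

# parse_qs percent-decodes the values and collects every occurrence
# of a key into a list, so a repeated parameter is not overwritten.
params = parse_qs(urlsplit(url).query)
genres = params["genre"]

print(genres)
# ['http://dbpedia.org/resource/Punk_music', 'http://dbpedia.org/resource/Rock_music']
```

A parameter that appears only once still comes back as a one-element list, so single-value and multi-value requests could share one code path.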

Alternatively, using the POST method:

curl "http://grlc.io/api-git/CLARIAH/grlc-queries/defaults" \
  --data-urlencode "genre=http://dbpedia.org/resource/Punk_music" \
  --data-urlencode "genre=http://dbpedia.org/resource/Dance_music" \
  -H "accept: text/csv"

Sending a JSON body might be another approach:

curl "http://grlc.io/api-git/CLARIAH/grlc-queries/defaults" \
  --data-binary '{"genre":["http://dbpedia.org/resource/Punk_music","http://dbpedia.org/resource/Dance_music"]}'  \
  -H "content-type: application/json" \
  -H "accept: text/csv"
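
If a JSON body like this were accepted (this is only the proposal above, not something grlc is known to support), the values could be normalised so that a single string and an array are handled the same way. A rough sketch, where `normalise_params` is a hypothetical helper name:

```python
import json

def normalise_params(body):
    """Parse a JSON object and wrap every scalar value in a list.

    Hypothetical helper: the body format is the one proposed in this
    issue, not an existing grlc feature.
    """
    raw = json.loads(body)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object of parameter names to values")
    return {
        name: value if isinstance(value, list) else [value]
        for name, value in raw.items()
    }

body = '{"genre":["http://dbpedia.org/resource/Punk_music","http://dbpedia.org/resource/Dance_music"]}'
print(normalise_params(body)["genre"])
```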

Hi @jaw111 -- thanks for suggesting this feature. It seems like it could be useful, but I have some doubts about it.

Correct me if I'm wrong, but I think the VALUES clause needs to be part of the WHERE clause. If that is correct, that is where my doubt begins: adding the VALUES clause inside the WHERE clause would modify the query too much. grlc does not modify the queries when generating the API -- the query on your GitHub repo is the query that gets executed (except for the variable replacement part, of course). I see this as an important part of transparency. I am wondering if there is another way of achieving this without modifying the query beyond what is necessary.

@albertmeronyo -- what do you think?

@c-martinez I agree, VALUES should be in the WHERE clause.

I personally would find this feature very useful. (I use VALUES in my app to request a list of data from Wikidata.)

Here's one possible implementation. The user would be required to write:

SELECT ?band ?album WHERE {
  ?band rdf:type dbo:Band .
  ?album rdf:type schema:MusicAlbum .
  ?band dbo:genre ?_genre_iri .
  ?album dbp:artist ?band .
  VALUES ?_genre_iri {
    ?_genre_multival_iri
  }
} LIMIT 100

This indicates that they know it's a VALUES clause, so grlc doesn't have to add it or modify the query too much.
grlc just has to:

  1. Check that ?_genre_iri is in the query at least twice (the variable and the VALUES clause)
  2. Replace ?_genre_multival_iri with the multiple values, wrapping them in < > if they are IRIs and in quotes if they are string values.
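
The two steps above could be sketched roughly as follows. This is not grlc code: `format_term` and `expand_multival` are hypothetical names, the `_multival` placeholder is the naming proposed in this comment, and the escaping is deliberately minimal.

```python
import re

def format_term(value, is_iri):
    """Render one value as a SPARQL term: <iri> or "string literal"."""
    if is_iri:
        # Refuse characters that would break out of the <...> delimiters.
        if re.search(r'[<>"\s]', value):
            raise ValueError(f"not a safe IRI: {value!r}")
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'

def expand_multival(query, name, values, is_iri=True):
    """Replace the multi-value placeholder for `name` with the given values."""
    suffix = "_iri" if is_iri else ""
    variable = f"?_{name}{suffix}"
    placeholder = f"?_{name}_multival{suffix}"

    # Step 1: the variable must appear at least twice
    # (once in a triple pattern, once in the VALUES clause).
    if len(re.findall(re.escape(variable) + r"\b", query)) < 2:
        raise ValueError(f"{variable} must appear at least twice in the query")

    # Step 2: substitute the placeholder with the formatted terms.
    terms = " ".join(format_term(v, is_iri) for v in values)
    return query.replace(placeholder, terms)

query = """SELECT ?band ?album WHERE {
  ?band dbo:genre ?_genre_iri .
  ?album dbp:artist ?band .
  VALUES ?_genre_iri {
    ?_genre_multival_iri
  }
} LIMIT 100"""

print(expand_multival(query, "genre", [
    "http://dbpedia.org/resource/Punk_music",
    "http://dbpedia.org/resource/Dance_music",
]))
```

Everything outside the placeholder is left untouched, which keeps the executed query close to the one stored in the repository.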

Of course these could be named different things. Let me know what you think.
Thanks!

@jaw111 Hey John, my mistake. The SPARQL spec does have the following ABNF:

Query ::= Prologue ( SelectQuery | ConstructQuery | DescribeQuery | AskQuery ) ValuesClause

so your example is perfectly valid (I guess the examples I've seen just haven't used it there).

I found a hack that allows for multi-value placeholders. See here: https://github.com/knowledgepixels/grlc-test/blob/main/multi-value.rq

It allows for up to 100 values in the multi-value placeholder, but you might run into this issue if the query gets too large: #445

Hi @jaw111! I've written a short tutorial (feedback welcome) based on the hack described by @tkuhn.

I think this fixes your problem, so I am closing this issue (but please feel free to re-open it if this doesn't work).

(Side note -- this may not work straight away on the grlc.io server, until the next release)