More Flexible Queries
knapply opened this issue
We can do better than using a single JSON Pointer for query=.
json <- c(
json1 = '{"a":[1,2,3],"b":[4,5,6]}',
json2 = '{"c":[4,5,6],"d":[7,8,9]}'
)
If you know exactly what values you want to pull into R, you could do the following:
RcppSimdJson::fparse(
json,
query = "a/1",
error_ok = TRUE
)
#> $json1
#> [1] 2
#>
#> $json2
#> NULL
RcppSimdJson::fparse(
json,
query = "b/2",
error_ok = TRUE
)
#> $json1
#> [1] 6
#>
#> $json2
#> NULL
RcppSimdJson::fparse(
json,
query = "c/1",
error_ok = TRUE
)
#> $json1
#> NULL
#>
#> $json2
#> [1] 5
RcppSimdJson::fparse(
json,
query = "d/2",
error_ok = TRUE
)
#> $json1
#> NULL
#>
#> $json2
#> [1] 9
But that's more tedious than just parsing the whole thing and then doing some post-parse processing (which you'd have to do anyway to clean up the NULLs).
It's a total waste of potential.
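To make the cleanup cost concrete, here is a rough base-R sketch of the post-processing those four calls would need. The `results` list is copied by hand from the outputs above (it is not produced by running the parser), and `drop_nulls` and `by_doc` are names made up for this illustration.

```r
# Outputs of the four single-pointer calls above, written out by hand.
results <- list(
  "a/1" = list(json1 = 2, json2 = NULL),
  "b/2" = list(json1 = 6, json2 = NULL),
  "c/1" = list(json1 = NULL, json2 = 5),
  "d/2" = list(json1 = NULL, json2 = 9)
)

# Hypothetical helper: drop the NULL placeholders left by queries that
# did not match a given document.
drop_nulls <- function(x) Filter(Negate(is.null), x)
cleaned <- lapply(results, drop_nulls)

# Regroup by document so each input ends up with the values pulled from it.
by_doc <- lapply(c(json1 = "json1", json2 = "json2"), function(doc) {
  hits <- lapply(cleaned, `[[`, doc)  # NULL where the query missed this doc
  unname(drop_nulls(hits))
})

str(by_doc)
```

Every step here runs in R after the values have already been materialized, which is the overhead the proposal is trying to avoid.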
We can do this instead:
queries <- list(
query_name_for_json1 = c("a/1", "b/2"),
query_name_for_json2 = c("c/1", "d/2")
)
RcppSimdJson::fparse(json, queries)
#> $query_name_for_json1
#> $query_name_for_json1[[1]]
#> [1] 2
#>
#> $query_name_for_json1[[2]]
#> [1] 6
#>
#>
#> $query_name_for_json2
#> $query_name_for_json2[[1]]
#> [1] 5
#>
#> $query_name_for_json2[[2]]
#> [1] 9
This would dramatically increase the amount of work that can be done before any R objects materialize and minimize the amount of post-parse-processing a user might have to do in R (and potentially eliminate it for some sane and stable JSON schemata).
The code hygiene and performance benefits are absolutely worth it.
- Proposed API:
  - single queries "recycle" (as they do now, so this wouldn't break anyone's code)
  - the length of multiple queries must match the length of json=, and they are applied in a zip-like fashion
  - nesting queries inside a list (ListOf<CharacterVector>), like the example above, provides a way to apply multiple queries to each element
  - results of named queries carry the query names (also shown in the example)
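The dispatch rules above can be sketched in plain R. This is only a model of the proposed semantics, not how it would be implemented in the package: `parsed` stands in for already-parsed documents, and `resolve_pointer()` and `apply_queries()` are hypothetical helpers that handle just the simple `key/index` pointers used in this issue (no leading slash, no escaping).

```r
# Stand-in for the two documents from the example, already parsed into R lists.
parsed <- list(
  json1 = list(a = c(1, 2, 3), b = c(4, 5, 6)),
  json2 = list(c = c(4, 5, 6), d = c(7, 8, 9))
)

# Hypothetical helper: resolve a pointer such as "a/1" against one document.
# All-digit tokens are treated as 0-based array indices; a miss yields NULL.
resolve_pointer <- function(doc, pointer) {
  for (token in strsplit(pointer, "/", fixed = TRUE)[[1]]) {
    if (is.null(doc)) return(NULL)
    if (grepl("^[0-9]+$", token)) {
      i <- as.integer(token) + 1L
      doc <- if (i <= length(doc)) doc[[i]] else NULL
    } else {
      doc <- if (token %in% names(doc)) doc[[token]] else NULL
    }
  }
  doc
}

# Hypothetical model of the proposed dispatch on the shape of `query`.
apply_queries <- function(docs, query) {
  if (is.list(query)) {
    # Nested: one character vector of pointers per document.
    stopifnot(length(query) == length(docs))
    out <- Map(
      function(doc, ptrs) lapply(ptrs, resolve_pointer, doc = doc),
      docs, query
    )
    # Results of named queries carry the query names.
    if (!is.null(names(query))) names(out) <- names(query)
    return(out)
  }
  # A single query recycles across all documents.
  if (length(query) == 1L) query <- rep(query, length(docs))
  # Otherwise lengths must match; the i-th query goes to the i-th document.
  stopifnot(length(query) == length(docs))
  Map(resolve_pointer, docs, query)
}

recycled <- apply_queries(parsed, "a/1")
zipped   <- apply_queries(parsed, c("a/1", "c/1"))
nested   <- apply_queries(parsed, list(
  query_name_for_json1 = c("a/1", "b/2"),
  query_name_for_json2 = c("c/1", "d/2")
))
str(nested)
```

In this model, `recycled` reproduces the current behaviour (a value for `json1`, `NULL` for `json2`), while `nested` has the same shape as the proposed output shown earlier.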