Search API fails when using spaces in R > 4.0 and Windows 10
salvafern opened this issue
@brittlnv noticed that mr_geo_code() does not work when performing a search with spaces.
I dug into it a bit and this seems to be a feature, not a bug, of httr. If you use plus symbols (+) instead of spaces, the requests work fine, as explained in r-lib/httr#335.
library(mregions)
#> Warning: package 'mregions' was built under R version 4.0.5
cs <- mr_geo_code("Continental+Shelf")
head(cs)
#> MRGID gazetteerSource placeType latitude
#> 1 63050 <NA> Continental Shelf (CLCS Recommendation) -43.05362
#> 2 63044 <NA> Continental Shelf (CLCS Recommendation) -13.78329
#> 3 63045 <NA> Continental Shelf (CLCS Recommendation) -36.28064
#> 4 63047 <NA> Continental Shelf (CLCS Recommendation) -58.09272
#> 5 63043 <NA> Continental Shelf (CLCS Recommendation) -33.77674
#> 6 63046 <NA> Continental Shelf (CLCS Recommendation) -48.38320
#> longitude minLatitude minLongitude maxLatitude maxLongitude precision
#> 1 -56.14215 -47.66977 -60.89235 -37.85991 -50.59635 NA
#> 2 118.06045 -14.68730 117.06350 -13.09524 118.41225 NA
#> 3 129.26419 -37.75001 125.92733 -34.99314 132.74821 NA
#> 4 76.88420 -63.65003 61.80664 -49.50886 89.16208 NA
#> 5 109.76407 -36.84570 107.83853 -30.75226 112.36359 NA
#> 6 149.11460 -50.88755 142.73669 -42.21693 153.71630 NA
#> preferredGazetteerName
#> 1 Argentinean Continental Shelf
#> 2 Australian Continental Shelf (Argo region)
#> 3 Australian Continental Shelf (Great Australian Bight region)
#> 4 Australian Continental Shelf (Kerguelen Plateau region)
#> 5 Australian Continental Shelf (Naturaliste Plateau region)
#> 6 Australian Continental Shelf (South Tasman Rise region)
#> preferredGazetteerNameLang status accepted
#> 1 English standard 63050
#> 2 English standard 63044
#> 3 English standard 63045
#> 4 English standard 63047
#> 5 English standard 63043
#> 6 English standard 63046
url <- httr::GET("http://marineregions.org/rest/getGazetteerRecordsByName.json/Continental+Shelf/true/false")
httr::status_code(url)
#> [1] 200
cs <- mr_geo_code("Continental Shelf")
#> Error in mr_geo_code("Continental Shelf"): Bad Request (HTTP 400).
url <- httr::GET("http://marineregions.org/rest/getGazetteerRecordsByName.json/Continental Shelf/true/false")
httr::status_code(url)
#> [1] 400
Created on 2022-03-03 by the reprex package (v2.0.0)
We should consider making this easier for our users by adding an exception for spaces.
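A minimal sketch of what such an exception could look like, assuming the plus-symbol workaround shown above is acceptable. `plus_for_spaces` is a hypothetical helper name and is not part of mregions; it only rewrites the search term and makes no request itself.

```r
# Hypothetical helper (not part of mregions): swap spaces for "+" so the
# search term matches the form that returned HTTP 200 earlier in this issue.
plus_for_spaces <- function(term) {
  stopifnot(is.character(term), length(term) == 1L)
  gsub(" ", "+", term, fixed = TRUE)
}

plus_for_spaces("Continental Shelf")
#> [1] "Continental+Shelf"

# Intended use, assuming mregions is installed and the service is reachable:
# cs <- mregions::mr_geo_code(plus_for_spaces("Continental Shelf"))
```

Whether the substitution belongs inside mr_geo_code() itself or in user code is a design choice left open here.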
It's not only spaces.
A URL can only contain certain characters, so you need to URLencode() the search string
https://www.rdocumentation.org/packages/utils/versions/3.6.2/topics/URLencode
So you would do
term <- URLencode("Continental Shelf")
and then concatenate term into the URL.
Yes, I'm aware of that. But for some reason the encoded term is still not accepted when the request goes through httr.
term <- URLencode("Continental Shelf")
cs <- mr_geo_code(term)
#> Error in mr_geo_code(term): Bad Request (HTTP 400).
Created on 2022-03-04 by the reprex package (v2.0.0)
I don't think they will change this, as you can see in the httr issue I linked earlier.
Try
term <- URLencode("Continental Shelf", reserved = TRUE)
Nope :(
term <- URLencode("Continental Shelf")
term_reserved <- URLencode("Continental Shelf", reserved = TRUE)
cs <- mr_geo_code(term_reserved)
#> Error in mr_geo_code(term_reserved): Bad Request (HTTP 400).
term == term_reserved
#> [1] TRUE
Created on 2022-03-04 by the reprex package (v2.0.0)
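For context on why the two terms above compare equal: a space is not a reserved URL character, so URLencode() turns it into %20 with or without reserved = TRUE. The flag only changes the result when the string contains reserved characters such as "/" or "&". The sketch below illustrates this with a made-up search string, which is an assumption and not taken from this issue.

```r
# Illustrative string containing a reserved character ("/"); not from the thread.
term <- "North Sea/Channel"

# A space is encoded either way, so the flag makes no difference here.
utils::URLencode("Continental Shelf")
#> [1] "Continental%20Shelf"
utils::URLencode("Continental Shelf", reserved = TRUE)
#> [1] "Continental%20Shelf"

# With a reserved character present, the two calls diverge.
utils::URLencode(term)
#> [1] "North%20Sea/Channel"
utils::URLencode(term, reserved = TRUE)
#> [1] "North%20Sea%2FChannel"
```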
You must be doing something wrong
> term <- URLencode("Continental Shelf")
> term
[1] "Continental%20Shelf"
I know, but it still doesn't work in R.
term <- URLencode("Continental Shelf")
term
#> [1] "Continental%20Shelf"
url <- paste0("http://marineregions.org/rest/getGazetteerRecordsByName.json/", term, "/true/false")
url
#> [1] "http://marineregions.org/rest/getGazetteerRecordsByName.json/Continental%20Shelf/true/false"
request <- httr::GET(url)
httr::status_code(request)
#> [1] 400
Created on 2022-03-04 by the reprex package (v2.0.0)
It works fine everywhere else, both in a browser and with curl:
$ curl -I 'https://marineregions.org/rest/getGazetteerRecordsByName.json/Continental%20Shelf/?like=true&fuzzy=false&offset=0&count=100'
#> HTTP/2 200
#> date: Fri, 04 Mar 2022 11:42:04 GMT
#> server: Apache/2.4.52 (Win64)
#> access-control-allow-origin: *
#> access-control-allow-headers: X-Requested-With, Content-Type, Accept, Origin, Authorization
#> access-control-allow-methods: GET, POST, OPTIONS
#> content-type: text/html; charset=UTF-8;
#> set-cookie: vliz_webc=vliz_webc1; path=/
Again, you must be doing something wrong :)
> term <- URLencode("Continental Shelf")
> url <- paste0("http://marineregions.org/rest/getGazetteerRecordsByName.json/", term, "/true/false")
> request <- httr::GET(url)
> httr::status_code(request)
[1] 200
OK, I was working locally on a Windows laptop with R > 4.0. I have now run this on Ubuntu with R 3.6 and it works fine, with the same versions of mregions and httr.
I'm not sure what is causing this: the R version, the operating system, or a system dependency missing from my laptop that I'm unaware of.
Windows
library(mregions)
#> Warning: package 'mregions' was built under R version 4.0.5
sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] httr_1.4.2 mregions_0.1.6
#>
#> loaded via a namespace (and not attached):
#> [1] knitr_1.31 magrittr_2.0.1 lattice_0.20-41 rlang_0.4.10
#> [5] fastmap_1.1.0 fansi_0.4.2 stringr_1.4.0 styler_1.4.1
#> [9] highr_0.8 tools_4.0.2 grid_4.0.2 xfun_0.21
#> [13] utf8_1.1.4 withr_2.4.1 htmltools_0.5.2 ellipsis_0.3.1
#> [17] yaml_2.2.1 digest_0.6.27 tibble_3.1.0 lifecycle_1.0.0
#> [21] crayon_1.4.1 purrr_0.3.4 vctrs_0.3.6 fs_1.5.0
#> [25] glue_1.6.1 evaluate_0.14 rmarkdown_2.11 sp_1.4-5
#> [29] reprex_2.0.0 stringi_1.5.3 compiler_4.0.2 pillar_1.6.0
#> [33] backports_1.2.1 pkgconfig_2.0.3
term <- URLencode("Continental Shelf")
url <- paste0("http://marineregions.org/rest/getGazetteerRecordsByName.json/", term, "/true/false")
request <- httr::GET(url)
httr::status_code(request)
#> [1] 400
Created on 2022-03-04 by the reprex package (v2.0.0)
Ubuntu
library(mregions)
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 14.04.6 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/libblas/libblas.so.3.0
#> LAPACK: /usr/lib/lapack/liblapack.so.3.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] httr_1.4.2 mregions_0.1.6
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_3.6.3 pillar_1.6.4 bslib_0.2.4 jquerylib_0.1.3 highr_0.9 tools_3.6.3
#> [7] digest_0.6.27 jsonlite_1.7.2 evaluate_0.14 lifecycle_1.0.1 tibble_3.1.6 lattice_0.20-41
#> [13] pkgconfig_2.0.3 rlang_0.4.11 reprex_1.0.0 cli_3.1.0 rstudioapi_0.13 curl_4.3.2
#> [19] yaml_2.2.1 xfun_0.26 fastmap_1.1.0 httr_1.4.2 knitr_1.33 fs_1.5.1
#> [25] vctrs_0.3.8 sass_0.3.1 grid_3.6.3 glue_1.4.2 R6_2.5.1 processx_3.5.2
#> [31] fansi_0.5.0 rmarkdown_2.10 sp_1.4-5 callr_3.7.0 clipr_0.6.0 magrittr_2.0.1
#> [37] ps_1.5.0 ellipsis_0.3.2 htmltools_0.5.2 utf8_1.2.2 crayon_1.4.2
term <- URLencode("Continental Shelf")
url <- paste0("http://marineregions.org/rest/getGazetteerRecordsByName.json/", term, "/true/false")
request <- httr::GET(url)
httr::status_code(request)
#> [1] 200
I noticed this also happens in worrms: ropensci/worrms#24
Indeed, it seems there are inconsistencies in how curl behaves across operating systems.
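One way to narrow this down would be to record the R version, operating system, and libcurl build on each machine and compare them side by side. The sketch below is only a diagnostic aid under that assumption; `describe_http_stack` is a hypothetical helper name, and whether libcurl is actually the cause was not established in this issue.

```r
# Hypothetical helper (not part of httr or mregions): gather the details that
# differed between the two machines compared in this issue.
describe_http_stack <- function() {
  info <- list(
    r_version = R.version.string,
    os = Sys.info()[["sysname"]],
    libcurl = NA_character_
  )
  # httr is built on the curl package; curl_version() reports its libcurl build.
  if (requireNamespace("curl", quietly = TRUE)) {
    info$libcurl <- curl::curl_version()$version
  }
  info
}

str(describe_http_stack())
```

Running it on both the Windows and the Ubuntu machine and comparing the results would show whether the libcurl builds differ.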
Hi, I will close this issue following rOpenSci guidance on archiving packages.
This repository has been archived and is replaced by mregions2. For further details, please see "Why mregions and mregions2?".