ropensci-archive / mregions

MarineRegions R client

Home Page:https://docs.ropensci.org/mregions

Search API fails when using spaces in R > 4.0 and Windows 10

salvafern opened this issue

@brittlnv noticed that mr_geo_code() does not work when performing a search with spaces.

I dug into this a bit, and it seems to be a feature, not a bug, of httr. If you use plus symbols (+) instead of spaces, the requests work fine, as explained in r-lib/httr#335

library(mregions)
#> Warning: package 'mregions' was built under R version 4.0.5

cs <- mr_geo_code("Continental+Shelf")
head(cs)
#>   MRGID gazetteerSource                               placeType  latitude
#> 1 63050            <NA> Continental Shelf (CLCS Recommendation) -43.05362
#> 2 63044            <NA> Continental Shelf (CLCS Recommendation) -13.78329
#> 3 63045            <NA> Continental Shelf (CLCS Recommendation) -36.28064
#> 4 63047            <NA> Continental Shelf (CLCS Recommendation) -58.09272
#> 5 63043            <NA> Continental Shelf (CLCS Recommendation) -33.77674
#> 6 63046            <NA> Continental Shelf (CLCS Recommendation) -48.38320
#>   longitude minLatitude minLongitude maxLatitude maxLongitude precision
#> 1 -56.14215   -47.66977    -60.89235   -37.85991    -50.59635        NA
#> 2 118.06045   -14.68730    117.06350   -13.09524    118.41225        NA
#> 3 129.26419   -37.75001    125.92733   -34.99314    132.74821        NA
#> 4  76.88420   -63.65003     61.80664   -49.50886     89.16208        NA
#> 5 109.76407   -36.84570    107.83853   -30.75226    112.36359        NA
#> 6 149.11460   -50.88755    142.73669   -42.21693    153.71630        NA
#>                                         preferredGazetteerName
#> 1                                Argentinean Continental Shelf
#> 2                   Australian Continental Shelf (Argo region)
#> 3 Australian Continental Shelf (Great Australian Bight region)
#> 4      Australian Continental Shelf (Kerguelen Plateau region)
#> 5    Australian Continental Shelf (Naturaliste Plateau region)
#> 6      Australian Continental Shelf (South Tasman Rise region)
#>   preferredGazetteerNameLang   status accepted
#> 1                    English standard    63050
#> 2                    English standard    63044
#> 3                    English standard    63045
#> 4                    English standard    63047
#> 5                    English standard    63043
#> 6                    English standard    63046

url <- httr::GET("http://marineregions.org/rest/getGazetteerRecordsByName.json/Continental+Shelf/true/false")
httr::status_code(url)
#> [1] 200

cs <- mr_geo_code("Continental Shelf")
#> Error in mr_geo_code("Continental Shelf"): Bad Request (HTTP 400).

url <- httr::GET("http://marineregions.org/rest/getGazetteerRecordsByName.json/Continental Shelf/true/false")
httr::status_code(url)
#> [1] 400

Created on 2022-03-03 by the reprex package (v2.0.0)

We should consider making this easier for our users by adding an exception.
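One way such an exception could look is sketched below. This is only an illustration: `plus_encode()` is a hypothetical helper, not a function in mregions or httr, and it relies on the observation above that the API accepts `+` in place of spaces.

```r
# Hypothetical helper (not part of mregions): replace whitespace in a
# search term with "+" so the request URL contains no literal spaces.
plus_encode <- function(term) {
  stopifnot(is.character(term), length(term) == 1)
  # Trim the ends, then collapse each run of whitespace into a single "+"
  gsub("[[:space:]]+", "+", trimws(term))
}

plus_encode("Continental Shelf")
#> [1] "Continental+Shelf"

# Possible usage (needs network access and the mregions package):
# cs <- mregions::mr_geo_code(plus_encode("Continental Shelf"))
```

The helper only deals with whitespace; other characters that are not valid in a URL would still need separate handling.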

It's not only spaces.
A URL can only contain certain characters, so you need to URLencode() the search string:
https://www.rdocumentation.org/packages/utils/versions/3.6.2/topics/URLencode
So you would do
term <- URLencode("Continental Shelf")
and then concatenate term into the URL.

Yes, I'm aware of that, but for some reason the encoded term is still not accepted by httr.

term <- URLencode("Continental Shelf")
cs <- mr_geo_code(term)
#> Error in mr_geo_code(term): Bad Request (HTTP 400).

Created on 2022-03-04 by the reprex package (v2.0.0)

I don't think they will change this, as you can see in the httr issue I linked above.

Try

term <- URLencode("Continental Shelf", reserved = TRUE)

Nope :(

term <- URLencode("Continental Shelf")
term_reserved <- URLencode("Continental Shelf", reserved = TRUE)

cs <- mr_geo_code(term_reserved)
#> Error in mr_geo_code(term_reserved): Bad Request (HTTP 400).

term == term_reserved
#> [1] TRUE

Created on 2022-03-04 by the reprex package (v2.0.0)
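The comparison above returns TRUE because "Continental Shelf" contains no reserved characters, so `reserved = TRUE` has nothing extra to encode. A small sketch of the difference, using a made-up search term (`with_reserved` is purely illustrative and not a real gazetteer name):

```r
plain         <- "Continental Shelf"   # only the space needs escaping
with_reserved <- "Shelf/Slope & Rise"  # made-up term containing "/" and "&"

# No reserved characters: both calls give the same result
URLencode(plain) == URLencode(plain, reserved = TRUE)
#> [1] TRUE

# Reserved characters are left alone by default ...
URLencode(with_reserved)
#> [1] "Shelf/Slope%20&%20Rise"

# ... and percent-encoded when reserved = TRUE
URLencode(with_reserved, reserved = TRUE)
#> [1] "Shelf%2FSlope%20%26%20Rise"
```

For a term that is inserted into a URL path segment, `reserved = TRUE` is probably the safer choice, since a literal "/" would otherwise be read as a path separator.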

You must be doing something wrong

> term <- URLencode("Continental Shelf")
> term
[1] "Continental%20Shelf"

I know, but it still doesn't work in R

term <- URLencode("Continental Shelf")
term
#> [1] "Continental%20Shelf"

url <- paste0("http://marineregions.org/rest/getGazetteerRecordsByName.json/", term, "/true/false")
url
#> [1] "http://marineregions.org/rest/getGazetteerRecordsByName.json/Continental%20Shelf/true/false"

request <- httr::GET(url)
httr::status_code(request)
#> [1] 400

Created on 2022-03-04 by the reprex package (v2.0.0)

It works fine everywhere else, both in a browser and with curl

$ curl -I 'https://marineregions.org/rest/getGazetteerRecordsByName.json/Continental%20Shelf/?like=true&fuzzy=false&offset=0&count=100'
#>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
#>                                  Dload  Upload   Total   Spent    Left  Speed
#>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
#> HTTP/2 200
#> date: Fri, 04 Mar 2022 11:42:04 GMT
#> server: Apache/2.4.52 (Win64)
#> access-control-allow-origin: *
#> access-control-allow-headers: X-Requested-With, Content-Type, Accept, Origin, Authorization
#> access-control-allow-methods: GET, POST, OPTIONS
#> content-type: text/html; charset=UTF-8;
#> set-cookie: vliz_webc=vliz_webc1; path=/

Again, you must be doing something wrong :)

> term <- URLencode("Continental Shelf")
> url <- paste0("http://marineregions.org/rest/getGazetteerRecordsByName.json/", term, "/true/false")
> request <- httr::GET(url)
> httr::status_code(request)
[1] 200

OK, I was working locally on a Windows laptop with R > 4.0. I have now run this on Ubuntu with R 3.6 and it works fine. Both machines have the same versions of mregions and httr.

I'm not sure what is causing this: the R version, the operating system, or some system dependency missing from my laptop that I'm unaware of.

Windows

library(mregions)
#> Warning: package 'mregions' was built under R version 4.0.5

sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] httr_1.4.2     mregions_0.1.6
#> 
#> loaded via a namespace (and not attached):
#>  [1] knitr_1.31      magrittr_2.0.1  lattice_0.20-41 rlang_0.4.10   
#>  [5] fastmap_1.1.0   fansi_0.4.2     stringr_1.4.0   styler_1.4.1   
#>  [9] highr_0.8       tools_4.0.2     grid_4.0.2      xfun_0.21      
#> [13] utf8_1.1.4      withr_2.4.1     htmltools_0.5.2 ellipsis_0.3.1 
#> [17] yaml_2.2.1      digest_0.6.27   tibble_3.1.0    lifecycle_1.0.0
#> [21] crayon_1.4.1    purrr_0.3.4     vctrs_0.3.6     fs_1.5.0       
#> [25] glue_1.6.1      evaluate_0.14   rmarkdown_2.11  sp_1.4-5       
#> [29] reprex_2.0.0    stringi_1.5.3   compiler_4.0.2  pillar_1.6.0   
#> [33] backports_1.2.1 pkgconfig_2.0.3

term <- URLencode("Continental Shelf")
url <- paste0("http://marineregions.org/rest/getGazetteerRecordsByName.json/", term, "/true/false")
request <- httr::GET(url)
httr::status_code(request)
#> [1] 400

Created on 2022-03-04 by the reprex package (v2.0.0)

Ubuntu

library(mregions)
  
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 14.04.6 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/libblas/libblas.so.3.0
#> LAPACK: /usr/lib/lapack/liblapack.so.3.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#> 
#> other attached packages:
#> [1] httr_1.4.2     mregions_0.1.6
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.6.3  pillar_1.6.4    bslib_0.2.4     jquerylib_0.1.3 highr_0.9       tools_3.6.3
#>  [7] digest_0.6.27   jsonlite_1.7.2  evaluate_0.14   lifecycle_1.0.1 tibble_3.1.6    lattice_0.20-41
#> [13] pkgconfig_2.0.3 rlang_0.4.11    reprex_1.0.0    cli_3.1.0       rstudioapi_0.13 curl_4.3.2
#> [19] yaml_2.2.1      xfun_0.26       fastmap_1.1.0   httr_1.4.2      knitr_1.33      fs_1.5.1
#> [25] vctrs_0.3.8     sass_0.3.1      grid_3.6.3      glue_1.4.2      R6_2.5.1        processx_3.5.2
#> [31] fansi_0.5.0     rmarkdown_2.10  sp_1.4-5        callr_3.7.0     clipr_0.6.0     magrittr_2.0.1
#> [37] ps_1.5.0        ellipsis_0.3.2  htmltools_0.5.2 utf8_1.2.2      crayon_1.4.2

term <- URLencode("Continental Shelf")
url <- paste0("http://marineregions.org/rest/getGazetteerRecordsByName.json/", term, "/true/false")
request <- httr::GET(url)
httr::status_code(request)
#> [1] 200

I noticed this also happens in worrms; see ropensci/worrms#24.
It does indeed seem that curl behaves inconsistently across operating systems.
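To narrow down where two machines differ, it may help to compare the libcurl builds alongside the R version and OS. The snippet below is a rough diagnostic sketch, not a confirmed explanation of the failure; `env_info` is just an illustrative name, and the curl package is only queried if it happens to be installed.

```r
# Collect the details that differed between the two machines in this thread.
env_info <- list(
  r_version = R.version.string,
  os        = Sys.info()[["sysname"]],
  # libcurl that base R itself was built against
  r_libcurl = as.character(libcurlVersion()),
  # libcurl used by the curl package (which httr relies on), if installed
  pkg_libcurl = if (requireNamespace("curl", quietly = TRUE)) {
    curl::curl_version()$version
  } else {
    NA_character_
  }
)
str(env_info)

# To inspect the request that is actually sent (needs network access):
# httr::GET(url, httr::verbose())
```

Running this on both the Windows and the Ubuntu machine and comparing the output would show whether the libcurl versions differ, which is one possible place to look.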

Hi, I will close this issue following rOpenSci guidance on archiving packages.

This repository has been archived and is replaced by mregions2. For further details, please see "Why mregions and mregions2?".