jessecambon / tidygeocoder

Geocoding Made Easy

Home Page: https://jessecambon.github.io/tidygeocoder

Incompatible datatype error with OpenCage query - 2

jonathancallahan opened this issue · comments

Same issue as before, but now with components.building:

> openCageTbl <-
+   tidygeocoder::reverse_geocode(
+     .tbl = dplyr::tibble(
+       lat = c(36.609014, 34.735150),
+       long = c(-121.829122, -120.532900)
+     ),
+     lat = "lat",
+     long = "long",
+     full_results = TRUE,
+     method = "opencage"
+   )
...
Error: Can't combine `..1$components.building` <character> and `..2$components.building` <integer>.
Run `rlang::last_error()` to see where the error occurred.

Below is my session info. Is it possible that the issue is related to using R 4.1.1 rather than 4.1.2?

> library(dplyr)
...
> library(httr)
> library(jsonlite)
Warning message:
package ‘jsonlite’ was built under R version 4.1.2
> devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.1.1 (2021-08-10)
 os       macOS Monterey 12.1         
 system   x86_64, darwin17.0          
 ui       RStudio                     
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       America/Los_Angeles         
 date     2022-02-13                  

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package      * version date       lib source        
 assertthat     0.2.1   2019-03-21 [1] CRAN (R 4.1.0)
 cachem         1.0.6   2021-08-19 [1] CRAN (R 4.1.0)
 callr          3.7.0   2021-04-20 [1] CRAN (R 4.1.0)
 cli            3.0.1   2021-07-17 [1] CRAN (R 4.1.0)
 crayon         1.4.1   2021-02-08 [1] CRAN (R 4.1.0)
 DBI            1.1.1   2021-01-15 [1] CRAN (R 4.1.0)
 desc           1.3.0   2021-03-05 [1] CRAN (R 4.1.0)
 devtools       2.4.2   2021-06-07 [1] CRAN (R 4.1.0)
 dplyr        * 1.0.7   2021-06-18 [1] CRAN (R 4.1.0)
 ellipsis       0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
 fansi          0.5.0   2021-05-25 [1] CRAN (R 4.1.0)
 fastmap        1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
 fs             1.5.0   2020-07-31 [1] CRAN (R 4.1.0)
 generics       0.1.0   2020-10-31 [1] CRAN (R 4.1.0)
 glue           1.4.2   2020-08-27 [1] CRAN (R 4.1.0)
 hms            1.1.0   2021-05-17 [1] CRAN (R 4.1.0)
 httr         * 1.4.2   2020-07-20 [1] CRAN (R 4.1.0)
 jsonlite     * 1.7.3   2022-01-17 [1] CRAN (R 4.1.2)
 lifecycle      1.0.1   2021-09-24 [1] CRAN (R 4.1.0)
 magrittr       2.0.1   2020-11-17 [1] CRAN (R 4.1.0)
 memoise        2.0.0   2021-01-26 [1] CRAN (R 4.1.0)
 pillar         1.6.2   2021-07-29 [1] CRAN (R 4.1.0)
 pkgbuild       1.2.0   2020-12-15 [1] CRAN (R 4.1.0)
 pkgconfig      2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
 pkgload        1.2.2   2021-09-11 [1] CRAN (R 4.1.1)
 prettyunits    1.1.1   2020-01-24 [1] CRAN (R 4.1.0)
 processx       3.5.2   2021-04-30 [1] CRAN (R 4.1.0)
 progress       1.2.2   2019-05-16 [1] CRAN (R 4.1.0)
 ps             1.6.0   2021-02-28 [1] CRAN (R 4.1.0)
 purrr          0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
 R6             2.5.1   2021-08-19 [1] CRAN (R 4.1.0)
 remotes        2.4.0   2021-06-02 [1] CRAN (R 4.1.0)
 rlang          0.4.11  2021-04-30 [1] CRAN (R 4.1.0)
 rprojroot      2.0.2   2020-11-15 [1] CRAN (R 4.1.0)
 rstudioapi     0.13    2020-11-12 [1] CRAN (R 4.1.0)
 sessioninfo    1.1.1   2018-11-05 [1] CRAN (R 4.1.0)
 testthat       3.0.4   2021-07-01 [1] CRAN (R 4.1.0)
 tibble         3.1.4   2021-08-25 [1] CRAN (R 4.1.0)
 tidygeocoder * 1.0.5   2021-11-02 [1] CRAN (R 4.1.0)
 tidyselect     1.1.1   2021-04-30 [1] CRAN (R 4.1.0)
 usethis        2.0.1   2021-02-10 [1] CRAN (R 4.1.0)
 utf8           1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
 vctrs          0.3.8   2021-04-29 [1] CRAN (R 4.1.0)
 withr          2.4.2   2021-04-18 [1] CRAN (R 4.1.0)

Another similar error:

> openCageTbl <-
+   tidygeocoder::reverse_geocode(
+     .tbl = dplyr::tibble(
+       lat = c(36.609014, 39.76213),
+       long = c(-121.829122, -10.00000)
+     ),
+     lat = "lat",
+     long = "long",
+     full_results = TRUE,
+     method = "opencage"
+   )
...
Error: Can't combine `..1$annotations.timezone.offset_string` <character> and `..2$annotations.timezone.offset_string` <integer>.

No solution yet, but after updating my packages I'm now able to reproduce this error with R 4.1.0.

library(tidygeocoder)
openCageTbl <-
  tidygeocoder::reverse_geocode(
    .tbl = dplyr::tibble(
      lat = c(36.609014, 34.735150),
      long = c(-121.829122, -120.532900)
    ),
    lat = "lat",
    long = "long",
    full_results = TRUE,
    method = "opencage"
  )
#> Passing 2 coordinates to the OpenCage single coordinate geocoder
#> Query completed in: 2 seconds
#> Error in `dplyr::bind_rows()` at tidygeocoder/R/reverse_geo.R:269:4:
#> ! Can't combine `..1$components.building` <character> and `..2$components.building` <integer>.

Created on 2022-02-15 by the reprex package (v2.0.1)

─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.1.0 (2021-05-18)
 os       Ubuntu 21.10
 system   x86_64, linux-gnu
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/New_York
 date     2022-02-15
 rstudio  1.4.1717 Juliet Rose (desktop)
 pandoc   2.9.2.1 @ /usr/bin/pandoc

─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────────────
 package      * version date (UTC) lib source
 assertthat     0.2.1   2019-03-21 [1] CRAN (R 4.1.0)
 brio           1.1.3   2021-11-30 [1] CRAN (R 4.1.0)
 cachem         1.0.6   2021-08-19 [1] CRAN (R 4.1.0)
 callr          3.7.0   2021-04-20 [1] CRAN (R 4.1.0)
 cli            3.2.0   2022-02-14 [1] CRAN (R 4.1.0)
 crayon         1.5.0   2022-02-14 [1] CRAN (R 4.1.0)
 curl           4.3.2   2021-06-23 [1] CRAN (R 4.1.0)
 DBI            1.1.2   2021-12-20 [1] CRAN (R 4.1.0)
 desc           1.4.0   2021-09-28 [1] CRAN (R 4.1.0)
 devtools       2.4.3   2021-11-30 [1] CRAN (R 4.1.0)
 dplyr          1.0.8   2022-02-08 [1] CRAN (R 4.1.0)
 ellipsis       0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
 fansi          1.0.2   2022-01-14 [1] CRAN (R 4.1.0)
 fastmap        1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
 fs             1.5.2   2021-12-08 [1] CRAN (R 4.1.0)
 generics       0.1.2   2022-01-31 [1] CRAN (R 4.1.0)
 glue           1.6.1   2022-01-22 [1] CRAN (R 4.1.0)
 hms            1.1.1   2021-09-26 [1] CRAN (R 4.1.0)
 httr           1.4.2   2020-07-20 [1] CRAN (R 4.1.0)
 jsonlite       1.7.3   2022-01-17 [1] CRAN (R 4.1.0)
 lifecycle      1.0.1   2021-09-24 [1] CRAN (R 4.1.0)
 magrittr       2.0.2   2022-01-26 [1] CRAN (R 4.1.0)
 memoise        2.0.1   2021-11-26 [1] CRAN (R 4.1.0)
 pillar         1.7.0   2022-02-01 [1] CRAN (R 4.1.0)
 pkgbuild       1.3.1   2021-12-20 [1] CRAN (R 4.1.0)
 pkgconfig      2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
 pkgload        1.2.4   2021-11-30 [1] CRAN (R 4.1.0)
 prettyunits    1.1.1   2020-01-24 [1] CRAN (R 4.1.0)
 processx       3.5.2   2021-04-30 [1] CRAN (R 4.1.0)
 progress       1.2.2   2019-05-16 [1] CRAN (R 4.1.0)
 ps             1.6.0   2021-02-28 [1] CRAN (R 4.1.0)
 purrr          0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
 R6             2.5.1   2021-08-19 [1] CRAN (R 4.1.0)
 remotes        2.4.2   2021-11-30 [1] CRAN (R 4.1.0)
 rlang          1.0.1   2022-02-03 [1] CRAN (R 4.1.0)
 rprojroot      2.0.2   2020-11-15 [1] CRAN (R 4.1.0)
 rstudioapi     0.13    2020-11-12 [1] CRAN (R 4.1.0)
 sessioninfo    1.2.2   2021-12-06 [1] CRAN (R 4.1.0)
 testthat       3.1.2   2022-01-20 [1] CRAN (R 4.1.0)
 tibble         3.1.6   2021-11-07 [1] CRAN (R 4.1.0)
 tidygeocoder * 1.0.5   2022-01-29 [1] local
 tidyselect     1.1.1   2021-04-30 [1] CRAN (R 4.1.0)
 usethis        2.1.5   2021-12-09 [1] CRAN (R 4.1.0)
 utf8           1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
 vctrs          0.3.8   2021-04-29 [1] CRAN (R 4.1.0)
 withr          2.4.3   2021-11-30 [1] CRAN (R 4.1.0)

 [1] /home/cambonator/R/x86_64-pc-linux-gnu-library/4.1
 [2] /usr/local/lib/R/site-library
 [3] /usr/lib/R/site-library
 [4] /usr/lib/R/library

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────

@jonathancallahan I have a fix for you to test out on the fix-opencage-dtype branch. You can install it with:

devtools::install_github("jessecambon/tidygeocoder", ref = "fix-opencage-dtype")

The issue is that the geo() and reverse_geo() functions pass a list of dataframes (one dataframe per query) to dplyr::bind_rows() to combine. An error occurs there if columns with the same name don't have compatible datatypes (i.e. character and integer/numeric don't mix). Here's where this happens in the geo() function.
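For illustration, the conflict can be reproduced without calling any geocoding service. This is a minimal sketch, not code from tidygeocoder: the two one-row tibbles and their values are made up to stand in for two query results.

```r
library(dplyr)

# Two one-row results, as if returned by two separate queries.
# The values are invented for this example.
res1 <- tibble(components.building = "Fire Station")
res2 <- tibble(components.building = 11193L)

# bind_rows() refuses to combine a character column with an integer column.
# tryCatch() captures the error message instead of stopping the script.
combined <- tryCatch(
  bind_rows(list(res1, res2)),
  error = function(e) conditionMessage(e)
)
print(combined)
```

The captured message should name the conflicting column and the two types, much like the errors reported above.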

The solution I implemented goes through this list of dataframes and converts some columns to character (if necessary) to avoid the datatype conflict. Using this solution, I'm able to run both of the queries above without errors:

      tidygeocoder::reverse_geocode(
             .tbl = dplyr::tibble(
                   lat = c(36.609014, 39.76213),
                   long = c(-121.829122, -10.00000)
                 ),
             lat = "lat",
             long = "long",
             full_results = TRUE,
             method = "opencage"
           )
#> Passing 2 coordinates to the OpenCage single coordinate geocoder
#> Query completed in: 7.8 seconds
#> # A tibble: 2 × 78
#>     lat  long address               confidence annotations.MGRS annotations.Mai…
#>   <dbl> <dbl> <chr>                      <int> <chr>            <chr>           
#> 1  36.6 -122. Seaside Fire Departm…         10 10SFF0472152167  CM96co06mf      
#> 2  39.8  -10  North Atlantic Ocean           1 29SME1434401834  IM59as02av      
#> # … with 72 more variables: annotations.callingcode <int>,
#> #   annotations.flag <chr>, annotations.geohash <chr>, annotations.qibla <dbl>,
#> #   annotations.DMS.lat <chr>, annotations.DMS.lng <chr>,
#> #   annotations.FIPS.county <chr>, annotations.FIPS.state <chr>,
#> #   annotations.Mercator.x <dbl>, annotations.Mercator.y <dbl>,
#> #   annotations.OSM.edit_url <chr>, annotations.OSM.note_url <chr>,
#> #   annotations.OSM.url <chr>, …


   tidygeocoder::reverse_geocode(
     .tbl = dplyr::tibble(
       lat = c(36.609014, 34.735150),
       long = c(-121.829122, -120.532900)
     ),
     lat = "lat",
     long = "long",
     full_results = TRUE,
     method = "opencage"
   )
#> Passing 2 coordinates to the OpenCage single coordinate geocoder
#> Query completed in: 2 seconds
#> # A tibble: 2 × 77
#>     lat  long address               confidence annotations.MGRS annotations.Mai…
#>   <dbl> <dbl> <chr>                      <int> <chr>            <chr>           
#> 1  36.6 -122. Seaside Fire Departm…         10 10SFF0472152167  CM96co06mf      
#> 2  34.7 -121. 11193, New Mexico Av…         10 10SGD2585046434  CM94rr66aj      
#> # … with 71 more variables: annotations.callingcode <int>,
#> #   annotations.flag <chr>, annotations.geohash <chr>, annotations.qibla <dbl>,
#> #   annotations.DMS.lat <chr>, annotations.DMS.lng <chr>,
#> #   annotations.FIPS.county <chr>, annotations.FIPS.state <chr>,
#> #   annotations.Mercator.x <dbl>, annotations.Mercator.y <dbl>,
#> #   annotations.OSM.edit_url <chr>, annotations.OSM.note_url <chr>,
#> #   annotations.OSM.url <chr>, …

Created on 2022-02-25 by the reprex package (v2.0.1)
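The idea behind the fix can be sketched roughly as follows. This is not the code on the fix-opencage-dtype branch: harmonize_column_types() is a hypothetical helper written for this example, and it only looks at top-level columns, so it does not descend into nested dataframe columns.

```r
library(dplyr)

# Hypothetical helper (not part of tidygeocoder): given a list of data frames,
# convert to character any column whose type differs between data frames.
harmonize_column_types <- function(dfs) {
  # Record the types seen for each column name across all data frames.
  # Integer and double are both recorded as "numeric" because bind_rows()
  # can already combine them.
  col_classes <- list()
  for (df in dfs) {
    for (nm in names(df)) {
      cls <- if (is.numeric(df[[nm]])) "numeric" else class(df[[nm]])[1]
      col_classes[[nm]] <- union(col_classes[[nm]], cls)
    }
  }
  # Columns that appear with more than one type.
  conflicted <- names(col_classes)[lengths(col_classes) > 1]
  # Coerce only the conflicting columns to character.
  lapply(dfs, function(df) {
    for (nm in intersect(conflicted, names(df))) {
      df[[nm]] <- as.character(df[[nm]])
    }
    df
  })
}

# Made-up stand-ins for two query results.
res1 <- tibble(offset_string = "-0800", confidence = 10L)
res2 <- tibble(offset_string = -100L, confidence = 1L)

combined <- bind_rows(harmonize_column_types(list(res1, res2)))
print(combined)
```

Only offset_string is converted here; confidence has the same type in both inputs and stays an integer.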

This fix does still have an issue with nested dataframes (i.e. flatten = FALSE):

openCageTbl <-
      tidygeocoder::reverse_geocode(
             .tbl = dplyr::tibble(
                   lat = c(36.609014, 39.76213),
                   long = c(-121.829122, -10.00000)
                 ),
             lat = "lat",
             long = "long",
             full_results = TRUE,
             method = "opencage", flatten = FALSE
           )
#> Passing 2 coordinates to the OpenCage single coordinate geocoder
#> Query completed in: 10.5 seconds
#> Error in `dplyr::bind_rows()` at tidygeocoder/R/reverse_geo.R:266:4:
#> ! Can't combine `..1$annotations$timezone$offset_string` <character> and `..2$annotations$timezone$offset_string` <integer>.

Created on 2022-02-25 by the reprex package (v2.0.1)

The first example works for me now (with tidygeocoder v1.0.5 from CRAN), i.e. OpenCage returns components.building as a string for both coordinates.

annotations:timezone:offset_string (sic!) sometimes returns a number though.

library(tidygeocoder)

reverse_geo(0, 0, method = "opencage", full_results = TRUE)["annotations.timezone.offset_string"]
#> Passing 1 coordinate to the OpenCage single coordinate geocoder
#> Query completed in: 1.3 seconds
#> # A tibble: 1 × 1
#>   annotations.timezone.offset_string
#>                                <int>
#> 1                                  0
reverse_geo(39.76213, -10, method = "opencage", full_results = TRUE)["annotations.timezone.offset_string"]
#> Passing 1 coordinate to the OpenCage single coordinate geocoder
#> Query completed in: 1 seconds
#> # A tibble: 1 × 1
#>   annotations.timezone.offset_string
#>                                <int>
#> 1                               -100

The query URL is https://api.opencagedata.com/geocode/v1/json?q=0,0&limit=1&key=MY_OC_KEY

@freyfogle, should this be fixed on your end?

BTW I found an old note where I had a similar problem with components.neighbourhood being returned as a number, but I didn't write down the specific example (probably a German address) 🤷🏻‍♂️

thanks, yes, this should be fixed on our end. Looking now

ok fix is now live, offset_string should now always be a string

        "timezone": {
          "name": "UTC",
          "now_in_dst": 0,
          "offset_sec": 0,
          "offset_string": "+0000",
          "short_name": "UTC"
        },
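Why the quoting matters on the R side can be checked with jsonlite directly. This is a small sketch using hand-written JSON fragments rather than real OpenCage responses.

```r
library(jsonlite)

# Hand-written fragments: one with a quoted offset, one with a bare number.
quoted   <- fromJSON('{"timezone": {"offset_string": "+0000"}}')
unquoted <- fromJSON('{"timezone": {"offset_string": -100}}')

# A quoted value is parsed as character; a bare number is parsed as numeric,
# which is what produced the type conflict in bind_rows().
quoted_class   <- class(quoted$timezone$offset_string)
unquoted_class <- class(unquoted$timezone$offset_string)
print(quoted_class)
print(unquoted_class)
```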