jessecambon / tidygeocoder

Geocoding Made Easy

Home Page: https://jessecambon.github.io/tidygeocoder

Geocodio Batch Geocoding: postal_code parameter is not properly provided

ottothecow opened this issue

Description

When using the Geocodio batch geocoding API, postal codes are not used to generate the result because the server ignores the parameter as it is sent.

This is caused by tidygeocoder passing the postal code parameter as postalcode rather than postal_code, which is the name the API expects. If you go into a debug session and trace it to the actual response object, you'll find that the API does respond with this warning (but it's not passed on to the end user of the library):

"Ignoring parameter "postalcode" as it was not expected. Did you mean "postal_code"? See full list of valid parameters here: https://www.geocod.io/docs/"

The correct parameter name is present in api_parameter_reference, but it appears that the reference is not used when the query is constructed during batch geocoding.

The solution seems to be simply changing the parameter name to match. A possible feature enhancement would be to implement a way to pass warnings such as the one above back to the end user.
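To illustrate the kind of renaming I have in mind, here is a minimal sketch in base R. The lookup vector `name_map` and the helper `rename_for_api()` are hypothetical names made up for this example; they are not the package's actual `api_parameter_reference` object or internal code, and the real fix may look different.

```r
# Hypothetical lookup (illustration only): generic parameter name -> name
# the batch API expects. Not tidygeocoder's real api_parameter_reference.
name_map <- c(
  street     = "street",
  city       = "city",
  state      = "state",
  postalcode = "postal_code"
)

# Rename the columns of an address data frame from generic names to
# API-specific names. Columns without a mapping are left unchanged.
rename_for_api <- function(addresses, name_map) {
  known <- names(addresses) %in% names(name_map)
  names(addresses)[known] <- unname(name_map[names(addresses)[known]])
  addresses
}

addresses <- data.frame(
  street     = "18 HIGHWAY 433",
  city       = "MACKVILLE",
  state      = "KY",
  postalcode = "40040"
)

# The postal code column is now named the way the API expects.
names(rename_for_api(addresses, name_map))
```

Doing the rename once, just before the batch request is built, would keep the generic names everywhere else in the code.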

Steps to Reproduce

Geocode any address and inspect the query to see that the parameter postal_code was not sent to the API.

As an example, I've included a poor-quality address that geocodes differently depending on whether or not a zip code is supplied. When it is run in batch mode (together with the White House address at 1600 Pennsylvania), it finds a location with a different zip code. When it is run as a single observation, it finds a different match that has the same zip code.

library(dplyr)
library(tidygeocoder)

test = data.frame(
  ADDRESS = c('1600 Pennsylvania Avenue NW', '18 HIGHWAY 433'),
  CITY = c('WASHINGTON', 'MACKVILLE'),
  STATE = c('DC','KY'),
  ZIP5 = c('20500','40040')
)

geocodio = function(x){
  x %>% geocode(street=ADDRESS,
                city=CITY,
                state=STATE,
                postalcode=ZIP5,
                verbose=TRUE,
                method='geocodio',
                full_results=TRUE)
}

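# Two rows, so the batch endpoint is used; keep only the Kentucky address.
batch = test %>%
  geocodio() %>%
  slice(2) %>%
  select(ADDRESS:accuracy_type)

# One row, so the address is geocoded as a single observation.
individual = test %>%
  slice(2) %>%
  geocodio() %>%
  select(ADDRESS:accuracy_type)

# Reports the differences between the batch and single results.
all.equal(batch,individual)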

Your Contribution

I believe this is a simple fix that requires renaming the parameters prior to submitting the batch request. I will attempt to put together a sensible pull request that does this.

Thanks! I haven't gotten a chance to test this yet, but it makes sense and the PR looks good to me. Looks like for Geocodio batch geocoding we were just passing a dataframe with the 'generic' API parameter names as column names instead of the API-specific names.

I also agree it would be ideal to pass warning messages like that along. It looks like currently we only do that in single-address geocoding when the HTTP response is not 200 (code link), and in batch geocoding it varies by which service you use.
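As a rough sketch of what passing warnings along could look like, the helper below re-emits any warning strings found in an already-parsed response as R warnings. Everything here is an assumption for illustration: `surface_api_warnings()` is a hypothetical function, not something in the package, and the field name `_warnings` is a guess at where a service might put such messages; each service would need its own handling.

```r
# Hypothetical helper (not part of tidygeocoder): re-emit warning messages
# found in a parsed API response as R warnings.
# `field` is an assumed field name and would differ between services.
surface_api_warnings <- function(parsed_response, field = "_warnings") {
  msgs <- unlist(parsed_response[[field]], use.names = FALSE)
  # If the field is absent, msgs is NULL and the loop does nothing.
  for (msg in msgs) {
    warning("Geocoding service warning: ", msg, call. = FALSE)
  }
  # Return the response unchanged so the helper can sit in the middle
  # of existing processing code.
  invisible(parsed_response)
}

# Example with a made-up response list shaped like the assumption above.
fake_response <- list(
  results = list(),
  `_warnings` = list('Ignoring parameter "postalcode" as it was not expected.')
)
surface_api_warnings(fake_response)
```

Using `warning()` rather than `message()` means users could still silence or capture the output with `suppressWarnings()` or `withCallingHandlers()`.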