jessecambon / tidygeocoder

Geocoding Made Easy

Home Page: https://jessecambon.github.io/tidygeocoder

cascade method and custom api url?

vinhdizzo opened this issue

Thanks for such a wonderful package. I am interested in using the geocode() function with method = 'cascade' and cascade_order = c('osm', 'census'), but with a local Nominatim server. The latter would require the api_url parameter. What is the best way to accomplish this? I don't believe the api_url parameter would work with the cascade method.

Thanks!

Hi @vinhdizzo, I think the best way to go about this would be to call geocode() separately and combine the results. That way you can set the API URL in the first query properly without affecting the second query. You could do something like this:

library(tidyverse)
library(tidygeocoder)

# osm query against the custom API URL (e.g. a local Nominatim server)
results1 <- your_input_dataframe %>%
  geocode(address = address_column_name, method = 'osm', api_url = "YourURLHere") %>%
  # create a column to indicate if a result was found
  mutate(not_found = is.na(lat) | is.na(long))

# census query, only for the addresses the osm query did not find
results2 <- results1 %>%
  filter(not_found) %>%
  # drop the all-NA lat/long columns (and the helper flag) before re-geocoding
  select(-lat, -long, -not_found) %>%
  geocode(address = address_column_name, method = 'census')

# Combine the results from the first and second query
combine_results <- bind_rows(
  results1 %>% filter(!not_found) %>% select(-not_found),
  results2
)

One idea for a future version of {tidygeocoder} is to add a convenience function to streamline the above process (in place of the current cascade method). This could work by passing the arguments for the two queries above as a named list like this:

your_input_dataframe %>%
  proposed_cascade_function(
    address = address_col_name,
    query1 = list(method = 'osm', api_url = "YourURLHere"),
    query2 = list(method = 'census')
  )
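
To make the idea concrete, here is a rough sketch of how such a cascade could work, written in base R so it runs without a network connection. It is only an illustration, not how {tidygeocoder} implements anything: cascade_geocode(), stub_local() and stub_fallback() are hypothetical names, the coordinates are made up, and the stubs stand in for real geocoder calls (in practice each one would presumably wrap a real query such as the osm and census calls shown earlier). Each geocoder only receives the rows that are still missing coordinates, and a query column records which one supplied the result.

```r
# Hypothetical helper: run the geocoders in order, passing only the rows
# that still lack coordinates on to the next one.
# Each geocoder takes a character vector of addresses and returns a
# data.frame with numeric `lat` and `long` columns (NA when not found),
# one row per input address, in the same order.
cascade_geocode <- function(df, address_col, geocoders) {
  df$lat <- NA_real_
  df$long <- NA_real_
  df$query <- NA_character_
  for (name in names(geocoders)) {
    todo <- which(is.na(df$lat) | is.na(df$long))
    if (length(todo) == 0) break  # everything already geocoded
    res <- geocoders[[name]](df[[address_col]][todo])
    found <- !is.na(res$lat) & !is.na(res$long)
    df$lat[todo[found]] <- res$lat[found]
    df$long[todo[found]] <- res$long[found]
    df$query[todo[found]] <- name
  }
  df
}

# Stub geocoders standing in for real services (made-up coordinates).
stub_local <- function(addresses) {
  known <- addresses == "1 Main St"
  data.frame(lat = ifelse(known, 10, NA_real_),
             long = ifelse(known, 20, NA_real_))
}
stub_fallback <- function(addresses) {
  known <- addresses %in% c("1 Main St", "2 Oak Ave")
  data.frame(lat = ifelse(known, 30, NA_real_),
             long = ifelse(known, 40, NA_real_))
}

addresses <- data.frame(
  addr = c("1 Main St", "2 Oak Ave", "nowhere"),
  stringsAsFactors = FALSE
)

result <- cascade_geocode(
  addresses, "addr",
  list(local = stub_local, fallback = stub_fallback)
)
print(result)
```

With these stubs, the first address is resolved by the "local" geocoder, the second falls through to "fallback", and the third stays NA because neither stub knows it.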

Thank you for the suggested code. I had a feeling I had to go down a manual path. All good :)

@vinhdizzo FYI the development version (main branch) of the package now has a geocode_combine() function which would fit your use case:

library(tidygeocoder)
geocode_combine(
  sample_addresses, # input dataframe
  # query parameters are provided as a list of lists
  queries = list(
    list(method = 'osm', api_url = "InsertYourAPIURLHERE"),
    list(method = 'census')
  ),
  # address column name is the same for both queries
  global_params = list(address = 'addr')
)

@jessecambon sweet, thank you for being so responsive!