ropensci/CoordinateCleaner

Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for use in conservation, ecology and palaeontology.

Home Page: https://docs.ropensci.org/CoordinateCleaner/

cc_coun() used to work. After update it broke

zapataf opened this issue

I have not been able to figure out what happened with cc_coun() after I updated to CoordinateCleaner 3.0. I renamed the column from countryCode to countrycode and converted all codes to iso3c, but I am getting this error:

Error in `x[out, ]`:
! Can't subset rows with `out`.
✖ Logical subscript `out` must be size 1 or 4635708, not 4635709.
Run `rlang::last_trace()` to see where the error occurred.
Warning message:
In as.character(country) == count_dat :
  longer object length is not a multiple of shorter object length
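The size mismatch in the error (4635709 vs. 4635708) and the recycling warning both suggest that an internal logical flag vector ends up one element longer than the data. A minimal sketch of the same failure mode, assuming nothing about cc_coun()'s internals (the tibble x and vector out below are hypothetical):

x <- tibble::tibble(a = 1:4)              # 4 rows
out <- c(TRUE, FALSE, TRUE, FALSE, TRUE)  # length 5: one element too many
x[out, ]
#> Error in `x[out, ]`:
#> ! Can't subset rows with `out`.
#> ✖ Logical subscript `out` must be size 1 or 4, not 5.

# The accompanying warning comes from base R's recycling rules when two
# vectors of unequal length are compared:
c("COL", "PER", "BRA") == c("COL", "PER")
#> Warning message:
#> longer object length is not a multiple of shorter object length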

I have not been able to figure out where or what the problem is. I cannot share a reproducible example because I have about 5 million records and I cannot tell where the issue lies. When I run rlang::last_trace() I get this (which did not help me):

Backtrace:
     ▆
  1. ├─indata_angiosperms %>% cc_coun(iso3 = "countrycode")
  2. ├─CoordinateCleaner::cc_coun(., iso3 = "countrycode")
  3. │ ├─x[out, ]
  4. │ └─tibble:::`[.tbl_df`(x, out, )
  5. │   └─tibble:::vectbl_as_row_index(i, x, i_arg)
  6. │     └─tibble:::vectbl_as_row_location(i, nr, i_arg, assign, call)
  7. │       ├─tibble:::subclass_row_index_errors(...)
  8. │       │ └─base::withCallingHandlers(...)
  9. │       └─vctrs::vec_as_location(...)
 10. └─vctrs (local) `<fn>`()
 11.   └─vctrs:::stop_indicator_size(...)
Run rlang::last_trace(drop = FALSE) to see 1 hidden frame.
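The backtrace shows the failure happens when cc_coun() subsets the input with its internal logical flag vector out via x[out, ], inside vctrs' row-index validation. A few quick diagnostic checks on the input might narrow down what makes out come out one element too long; this is only a sketch, where indata_angiosperms is the tibble named in the call above and decimallongitude/decimallatitude are assumed from the package's default column names:

# Purely diagnostic; none of these modify the data.
sum(is.na(indata_angiosperms$countrycode))            # missing ISO3 codes
table(nchar(indata_angiosperms$countrycode),          # any codes that are
      useNA = "ifany")                                # not 3 characters long?
sum(is.na(indata_angiosperms$decimallongitude) |      # records with missing
    is.na(indata_angiosperms$decimallatitude))        # coordinates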

Any help/guidance would be appreciated.

Hi @zapataf, could you send the data you are using, so I can replicate the error?

Hi! Thanks for getting back. Unfortunately, the error only occurs when I use my full dataset, a tibble of ~90 MB. I have tried random samples of the tibble to generate something easier to share, but curiously, with these smaller tibbles I don't get the error. So now I wonder if the issue could be related to the removal of rows when the tibble is really large (about 4 million rows).
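Random samples can easily miss a single problematic row in ~4 million records, which would explain why the subsamples run clean. A binary-search sketch along these lines could isolate the first row whose inclusion triggers the error, without sharing the full file; it assumes the failure is caused by a specific row (so that any prefix containing it fails and any prefix before it passes), and fails() is a hypothetical helper:

library(CoordinateCleaner)

# Hypothetical helper: TRUE if cc_coun() errors on a given slice,
# using the same call as above.
fails <- function(d) {
  inherits(tryCatch(cc_coun(d, iso3 = "countrycode"),
                    error = function(e) e),
           "error")
}

lo <- 0
hi <- nrow(indata_angiosperms)
stopifnot(fails(indata_angiosperms))  # the full data must fail to begin with
while (hi - lo > 1) {                 # invariant: prefix lo passes, prefix hi fails
  mid <- (lo + hi) %/% 2
  if (fails(indata_angiosperms[seq_len(mid), ])) hi <- mid else lo <- mid
}
indata_angiosperms[hi, ]              # first row whose inclusion triggers the error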

For now, I have decided not to use this filter and to keep my analyses moving.