ropensci/CoordinateCleaner

Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for use in conservation, ecology and palaeontology.

Home Page: https://docs.ropensci.org/CoordinateCleaner/

cc_coun() used to work. After update it broke

zapataf opened this issue

I have not been able to figure out what happened with cc_coun() after I updated to CoordinateCleaner 3.0. I renamed the column from countryCode to countrycode and converted all codes to iso3c, but I am getting this error:

Error in `x[out, ]`:
! Can't subset rows with `out`.
✖ Logical subscript `out` must be size 1 or 4635708, not 4635709.
Run `rlang::last_trace()` to see where the error occurred.
Warning message:
In as.character(country) == count_dat :
  longer object length is not a multiple of shorter object length
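The size mismatch in the error (4635709 vs. 4635708) and the recycling warning both suggest that an internal logical flag vector ends up one element longer than the data. A minimal sketch of the same failure mode, assuming nothing about cc_coun()'s internals (the tibble x and vector out below are hypothetical):

x <- tibble::tibble(a = 1:4)              # 4 rows
out <- c(TRUE, FALSE, TRUE, FALSE, TRUE)  # length 5: one element too many
x[out, ]
#> Error in `x[out, ]`:
#> ! Can't subset rows with `out`.
#> ✖ Logical subscript `out` must be size 1 or 4, not 5.

# The accompanying warning comes from base R's recycling rules when two
# vectors of unequal length are compared:
c("COL", "PER", "BRA") == c("COL", "PER")
#> Warning message:
#> longer object length is not a multiple of shorter object length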

I have not been able to figure out where or what the problem is. I cannot share a reproducible example because I have about 5 million records and I cannot tell where the issue lies. When I run rlang::last_trace() I get this (which did not help me):

Backtrace:
     ▆
  1. ├─indata_angiosperms %>% cc_coun(iso3 = "countrycode")
  2. ├─CoordinateCleaner::cc_coun(., iso3 = "countrycode")
  3. │ ├─x[out, ]
  4. │ └─tibble:::`[.tbl_df`(x, out, )
  5. │   └─tibble:::vectbl_as_row_index(i, x, i_arg)
  6. │     └─tibble:::vectbl_as_row_location(i, nr, i_arg, assign, call)
  7. │       ├─tibble:::subclass_row_index_errors(...)
  8. │       │ └─base::withCallingHandlers(...)
  9. │       └─vctrs::vec_as_location(...)
 10. └─vctrs (local) `<fn>`()
 11.   └─vctrs:::stop_indicator_size(...)
Run rlang::last_trace(drop = FALSE) to see 1 hidden frame.
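The backtrace shows the failure happens when cc_coun() subsets the input with its internal logical flag vector out via x[out, ], inside vctrs' row-index validation. A few quick diagnostic checks on the input might narrow down what makes out come out one element too long; this is only a sketch, where indata_angiosperms is the tibble named in the call above and decimallongitude/decimallatitude are assumed from the package's default column names:

# Purely diagnostic; none of these modify the data.
sum(is.na(indata_angiosperms$countrycode))            # missing ISO3 codes
table(nchar(indata_angiosperms$countrycode),          # any codes that are
      useNA = "ifany")                                # not 3 characters long?
sum(is.na(indata_angiosperms$decimallongitude) |      # records with missing
    is.na(indata_angiosperms$decimallatitude))        # coordinates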

Any help/guidance would be appreciated.

Hi @zapataf, could you send the data you are using, so I can replicate the error?

Hi! Thanks for getting back. Unfortunately, the error only occurs when I use my full dataset, a tibble of ~90 MB. I have tried random samples of the tibble to generate something easier to share, but curiously, with these smaller tibbles I don't get the error. So now I wonder if the issue could be related to the removal of rows when the tibble is really large (about 4 million rows).
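Random samples can easily miss a single problematic row in ~4 million records, which would explain why the subsamples run clean. A binary-search sketch along these lines could isolate the first row whose inclusion triggers the error, without sharing the full file; it assumes the failure is caused by a specific row (so that any prefix containing it fails and any prefix before it passes), and fails() is a hypothetical helper:

library(CoordinateCleaner)

# Hypothetical helper: TRUE if cc_coun() errors on a given slice,
# using the same call as above.
fails <- function(d) {
  inherits(tryCatch(cc_coun(d, iso3 = "countrycode"),
                    error = function(e) e),
           "error")
}

lo <- 0
hi <- nrow(indata_angiosperms)
stopifnot(fails(indata_angiosperms))  # the full data must fail to begin with
while (hi - lo > 1) {                 # invariant: prefix lo passes, prefix hi fails
  mid <- (lo + hi) %/% 2
  if (fails(indata_angiosperms[seq_len(mid), ])) hi <- mid else lo <- mid
}
indata_angiosperms[hi, ]              # first row whose inclusion triggers the error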

For now, I have decided not to use this filter and to keep my analyses moving.