ropensci / CoordinateCleaner

Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for the use in conservation, ecology and palaeontology.

Home Page: https://docs.ropensci.org/CoordinateCleaner/

`clean_coordinates()` occasional error with outliers test

AMBarbosa opened this issue

With some datasets, I'm getting an error when using the outliers test. Here's an example where this happens:

library(CoordinateCleaner)

dat <- read.csv("https://raw.githubusercontent.com/AMBarbosa/files/main/elephant.csv")

dat_cc <- clean_coordinates(dat, lon = "decimalLongitude", lat = "decimalLatitude", 
          species = "species", tests = "outliers")

# Testing coordinate validity
# Flagged 0 records.
# Testing geographic outliers
# Error in h(simpleError(msg, call)) : 
#   error in evaluating the argument 'x' in selecting a method for function 'ext': [vect] the variable name(s) in argument `geom` are not in `x`
# In addition: Warning message:
#   In cc_outl(otl_test, lon = lon, lat = lat, species = species, method = outliers_method,  :
#   Using raster approximation.

packageVersion("CoordinateCleaner")
# [1] ‘3.0’

I am running into the same issue. My first guess is that it comes from ras_create(), and more specifically from its call to terra::vect(). Currently the function doesn't take the names of the lon/lat columns into account.

ex <- terra::ext(terra::vect(x[, c(lon, lat)], geom=c(lon, lat))) + thinning_res * 2

Also, I think the default layer name of terra objects is now lyr.1, not layer.

I think my PR #92 might fix all this, but please double check :)

The terra::vect() function takes "lon" and "lat" as the default coordinate column names, but in CoordinateCleaner::cc_outl() the default values for lon and lat are "decimalLongitude" and "decimalLatitude" respectively, so they need to be passed explicitly to terra::vect(). @mhesselbarth's PR should fix it. I'm just a bit puzzled as to why the error doesn't occur with all datasets that have outliers.
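
For illustration, here is a minimal sketch of the column-name mismatch. It assumes the terra package is installed and uses a made-up toy data frame (not values from elephant.csv); the padding of 1 is an arbitrary stand-in for thinning_res * 2.

```r
library(terra)

# Hypothetical toy data using CoordinateCleaner's default column names
x <- data.frame(
  decimalLongitude = c(10, 12, 15),
  decimalLatitude  = c(-5, 0, 3)
)
lon <- "decimalLongitude"
lat <- "decimalLatitude"

# Relying on vect()'s default geom = c("lon", "lat") is expected to fail,
# because x has no columns with those names; the error is captured, not thrown
res <- tryCatch(terra::vect(x[, c(lon, lat)]), error = function(e) e)
inherits(res, "error")

# Passing the column names explicitly builds the point geometry
v <- terra::vect(x[, c(lon, lat)], geom = c(lon, lat))

# Pad the extent on all sides (1 is an assumed value, not the package's)
ex <- terra::ext(v) + 1
ex
```

If this behaves as described, only the second call succeeds, which matches the "variable name(s) in argument `geom` are not in `x`" message in the original report.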

I think the issue only occurs if a raster is used to detect outliers, which is not the case for all species? But I haven't looked into that too much.