ropensci / CoordinateCleaner

Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for use in conservation, ecology and palaeontology.

Home Page: https://docs.ropensci.org/CoordinateCleaner/

Number of records detected: difference between the number for each test and the summary

luroy opened this issue

Dear all,

I’ve just started using CoordinateCleaner to flag and remove problematic records obtained from GBIF, so I apologize in advance if my question stems from my unfamiliarity with the package.

Using GBIF species occurrence records as input, I ran the clean_coordinates function as follows:

library(CoordinateCleaner)

# gbif_data_df: data frame of GBIF occurrence records
flags <- clean_coordinates(gbif_data_df,
                           lon = "decimalLongitude",
                           lat = "decimalLatitude",
                           countries = "countryCode2",
                           species = "species",
                           tests = c("capitals", "centroids", "duplicated", "equal",
                                     "gbif", "institutions", "outliers", "zeros"))

This produced the following output in my R console:

Testing coordinate validity
Flagged 0 records.
Testing equal lat/lon
Flagged 0 records.
Testing zero coordinates
Flagged 0 records.
Testing country capitals
Flagged 10 records.
Testing country centroids
Flagged 3 records.
Testing geographic outliers
Flagged 44 records.
Testing GBIF headquarters, flagging records around Copenhagen
Flagged 0 records.
Testing biodiversity institutions
Flagged 0 records.
Flagged 10 of 12187 records, EQ = 0

summary(flags)
    .val     .equ     .zer     .cap     .cen     .otl     .gbf    .inst
       0        0        0       10        3        0        0        0
.summary
      10

The 3 records flagged by the "centroids" test seem to be included among the 10 records flagged by the "country capitals" test, which would explain a total of 10. However, I don't understand why the 44 records flagged by the "geographic outliers" test appear neither in the summary (.otl = 0) nor in the "flags" object.
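
A minimal sketch of how the per-test counts could be cross-checked against the summary. It assumes that the flag columns are logical, with FALSE marking a flagged record, and it uses a small made-up data frame (flags_toy, hypothetical values) in place of the real "flags" object so that it runs on its own:

```r
# Toy stand-in for the object returned by clean_coordinates().
# Values are hypothetical; FALSE is assumed to mean "flagged by this test".
flags_toy <- data.frame(
  .cap = c(TRUE, FALSE, FALSE, TRUE,  TRUE,  TRUE),
  .cen = c(TRUE, FALSE, TRUE,  TRUE,  TRUE,  TRUE),
  .otl = c(TRUE, TRUE,  TRUE,  FALSE, FALSE, TRUE)
)
test_cols <- c(".cap", ".cen", ".otl")

# A record passes overall only if no individual test flagged it.
flags_toy$.summary <- rowSums(!as.matrix(flags_toy[test_cols])) == 0

# Number of flagged records per column (count of FALSE values).
flagged_per_test <- vapply(flags_toy, function(col) sum(!col), integer(1))
print(flagged_per_test)

# Overlap between two tests: records flagged by both capitals and centroids.
both_cap_cen <- sum(!flags_toy$.cap & !flags_toy$.cen)
print(both_cap_cen)
```

In this toy case the overall count (4) is smaller than the sum of the per-test counts (5) because one record is flagged by two tests, and the outlier flags do contribute to the overall count. On the real object, sum(!flags$.otl) would be expected to return 44 if the console message and the stored column agreed.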

Thank you for your attention to this matter,

Léa