Filter for known defaults of coordinate uncertainty in meters
jhnwllr opened this issue · comments
There are several known default values for coordinate uncertainty in meters.
301 : Geolocate Default (often a country centroid)
3036 : Geolocate Default
999 : Default found in a few datasets (observations.org)
9999 : Large default
occurrence counts
630 353 -- 3036m
401 507 -- 301m
370 553 -- 999m
14 242 -- 9999m
I think CoordinateCleaner could have a function for filtering these known defaults. I would be happy to make a PR for such a function...
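As a rough illustration of the idea, here is a minimal sketch of such a filter. It is written in Python only to show the logic; CoordinateCleaner itself is an R package, and this is not its API. The function name `flag_default_uncertainty` and its parameters are hypothetical, and the column name `coordinateUncertaintyInMeters` is assumed from the Darwin Core term. The sketch returns `True` for records to keep and `False` for records whose uncertainty matches one of the defaults listed above.

```python
# Known default values for coordinate uncertainty (meters), taken from the list above.
KNOWN_DEFAULTS = {301, 3036, 999, 9999}


def flag_default_uncertainty(records,
                             column="coordinateUncertaintyInMeters",
                             defaults=KNOWN_DEFAULTS):
    """Hypothetical helper: return one boolean per record.

    True  -> the uncertainty is missing, unparseable, or not a known default
    False -> the uncertainty equals one of the known default values
    """
    flags = []
    for rec in records:
        value = rec.get(column)
        try:
            is_default = value is not None and float(value) in defaults
        except (TypeError, ValueError):
            # Values such as "n/a" are not treated as defaults in this sketch.
            is_default = False
        flags.append(not is_default)
    return flags


# Example usage with made-up records (not real GBIF data).
records = [
    {"coordinateUncertaintyInMeters": 301},
    {"coordinateUncertaintyInMeters": 250.0},
    {"coordinateUncertaintyInMeters": "3036"},
    {},
    {"coordinateUncertaintyInMeters": "n/a"},
]
flags = flag_default_uncertainty(records)
kept = [rec for rec, ok in zip(records, flags) if ok]
```

Making the set of defaults a parameter would let users add dataset-specific values without changing the function.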
Hi John,
thanks for the excellent suggestion. I'll implement this for the next version. Two questions:
- What do you suggest as the default name for the column with the uncertainty in meters, since this will be user-provided?
- My impression is that default values may also cause problems in other entry fields, for instance individualCount. What do you think about an option to flag those as well?
Thanks!!
I don't have any opinions about individualCount right now.
My assumption would be that there might be some default values there. GBIF has recently done a good job of trying to clean up that column, now that it has the occurrence_status field: https://www.gbif.org/occurrence/search?taxon_key=4689&occurrence_status=present
> What do you suggest as default name for the column with the uncertainty in meters, since this will be user provided
I would name the issue or column something like "known_default_coordinate_uncertainty"