plantnet / GeoLifeCLEF

GeoLifeCLEF challenge toolkit and starter code

Error extracting raster data corresponding to US observations from the Florida Keys

elijahcole opened this issue

(We've discussed this issue a bit internally, but I thought it would be good to create an issue to keep track of it.)

It seems that the rasters for the US don't extend quite far enough south, so some observations produce an error when their coordinates are run through environmental_raster_glc.py. More precisely, I found 796 US train patches and 98 US test patches that result in an "index out of bounds" error.

Here are two possible solutions:

  • Replace the US rasters with new ones that extend further south.
  • Return appropriate "no data" values for locations outside the raster.
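To make the second option concrete, here is a minimal sketch of a patch extraction that fills with a "no data" value wherever the window leaves the raster. It works on a plain NumPy array rather than the toolkit's own raster classes; the function name `extract_patch_with_nodata`, its parameters, and the choice of NaN as the fill value are assumptions made for illustration, not part of environmental_raster_glc.py.

```python
import numpy as np


def extract_patch_with_nodata(raster, row, col, size, nodata=np.nan):
    """Return a size x size patch around (row, col), filled with `nodata`
    wherever the window falls outside `raster` (hypothetical helper)."""
    half = size // 2
    patch = np.full((size, size), nodata, dtype=float)

    # Requested window in raster coordinates; it may extend past the array.
    r0, c0 = row - half, col - half
    r1, c1 = r0 + size, c0 + size

    # Clip the window to the part that actually overlaps the raster.
    rr0, cc0 = max(r0, 0), max(c0, 0)
    rr1, cc1 = min(r1, raster.shape[0]), min(c1, raster.shape[1])

    # Copy the overlapping region, if any, into the matching patch position.
    if rr0 < rr1 and cc0 < cc1:
        patch[rr0 - r0:rr1 - r0, cc0 - c0:cc1 - c0] = raster[rr0:rr1, cc0:cc1]
    return patch


raster = np.arange(16, dtype=float).reshape(4, 4)
partial = extract_patch_with_nodata(raster, row=3, col=1, size=3)
outside = extract_patch_with_nodata(raster, row=10, col=1, size=3)
```

Here `partial` overlaps the southern edge of the raster, so only its last row is filled with NaN, while `outside` lies entirely beyond the raster and comes back as all NaN.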

(Good idea to handle this through an issue here!)

I think updating the US rasters would be the most appropriate solution.
I'm not sure that returning missing values is the best solution: in most cases we would want extraction to fail when a location falls outside the rasters, since this is likely due to a bug elsewhere in the code, such as using a wrong location.
We could add a parameter to PatchExtractor to control the behavior in that scenario, but I don't think returning missing values should be the default.
What do you think?
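As a rough sketch of the parameter idea, the lookup could fail by default and return a "no data" value only when explicitly asked to. This is not the actual PatchExtractor API: the function `read_value`, the `out_of_bounds` parameter, and its `"error"` / `"nodata"` values are hypothetical names, and the raster is assumed to be a 2D NumPy array indexed by pixel row and column.

```python
import numpy as np


def read_value(raster, row, col, out_of_bounds="error", nodata=np.nan):
    """Read one raster cell, with an explicit policy for locations
    outside the raster (hypothetical helper, not the toolkit's API)."""
    n_rows, n_cols = raster.shape
    # Explicit check: NumPy would silently wrap negative indices otherwise.
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return raster[row, col]

    if out_of_bounds == "error":
        # Default: fail loudly, since this usually points to a bug upstream.
        raise IndexError(
            f"location (row={row}, col={col}) is outside the raster "
            f"of shape {raster.shape}"
        )
    if out_of_bounds == "nodata":
        # Opt-in: return the fill value instead of failing.
        return nodata
    raise ValueError(f"unknown out_of_bounds policy: {out_of_bounds!r}")


raster = np.arange(16, dtype=float).reshape(4, 4)
inside_value = read_value(raster, 2, 3)
filled_value = read_value(raster, 7, 1, out_of_bounds="nodata")
```

Keeping `"error"` as the default preserves the current behavior for callers that pass a wrong location by mistake, while the opt-in `"nodata"` mode covers cases such as the Florida Keys observations.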