weecology / DeepForest

Python Package for Airborne RGB machine learning

Home Page:https://deepforest.readthedocs.io/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

CRS not matching in utilities.shapefile_to_annotations

yby026 opened this issue · comments

Hi, when I try use shapefile_to_annotations to convert my labels from a shapefile, an error raises as the crs does not match between the image and shapefile. It seems the crs of the shapefile containing an unknown vertical crs but not in the image, and the vertical crs cannot be removed from shapefile.

Codes:

ann = utilities.shapefile_to_annotations(shapefile = annotation + "bounding_box.shp",
    rgb = img)

ann.shape

The error message is here:

ValueError Traceback (most recent call last)
in <cell line: 1>()
----> 1 ann = utilities.shapefile_to_annotations(shapefile = annotation +"bounding_box.shp",
2 rgb = img)
3
4 ann.shape

/usr/local/lib/python3.10/dist-packages/deepforest/utilities.py in shapefile_to_annotations(shapefile, rgb, buffer_size, geometry_type, savedir)
298 # Check matching the crs
299 if not gdf.crs.to_string() == raster_crs.to_string():
--> 300 raise ValueError("The shapefile crs {} does not match the image crs {}".format(
301 gdf.crs, src.crs))
302

ValueError: The shapefile crs COMPD_CS["NZGD2000 / New Zealand Transverse Mercator 2000 + unknown",PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",GEOGCS["NZGD2000",DATUM["New_Zealand_Geodetic_Datum_2000",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6167"]],PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",173],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",1600000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]],VERT_CS["unknown",VERT_DATUM["unknown",2005],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Gravity-related height",UP]]] does not match the image crs PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",GEOGCS["NZGD2000",DATUM["New_Zealand_Geodetic_Datum_2000",SPHEROID["GRS 1980",6378137,298.257222101004,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6167"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4167"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",173],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",1600000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","2193"]]

The shapefile was created in arcgis pro 3.1 based on the projection of image, and both shapefile and image has been reprojected use each other's coordination system before training.

Codes are running using google colab.

Not sure what would be the problem here.

This error suggests that DeepForest thinks the CRS's don't match, so let's get a better look at how they are stored. Can you run the following code replacing shapefile with the path to your shapefile and rgb with the path to your raster and then post the output.

import geopandas as gpd

gdf = gpd.read_file(shapefile)
print(f"Shapefile CRS is:\n{gdf.crs}")

with rasterio.open(rgb) as src:
    raster_crs = src.crs
print(f"Raster CRS is:\n{raster_crs}

Here's the output:

Shapefile CRS is:
COMPD_CS["NZGD2000 / New Zealand Transverse Mercator 2000 + unknown",PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",GEOGCS["NZGD2000",DATUM["New_Zealand_Geodetic_Datum_2000",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6167"]],PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",173],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",1600000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]],VERT_CS["unknown",VERT_DATUM["unknown",2005],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Gravity-related height",UP]]]
Raster CRS is:
PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",GEOGCS["NZGD2000",DATUM["New_Zealand_Geodetic_Datum_2000",SPHEROID["GRS 1980",6378137,298.257222101004,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6167"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4167"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",173],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",1600000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","2193"]]

OK, so ArcGIS is storing two things with the same project using different text representations, which is a little weird. Can you run this code to see if we can get a match this way:

import geopandas as gpd

gdf = gpd.read_file(shapefile)
print(f"Shapefile CRS is:\n{gdf.crs.to_epsg()}")

with rasterio.open(rgb) as src:
    raster_crs = src.crs.to_epsg()
print(f"Raster CRS is:\n{raster_crs}

Hi there,

After running the code it returned as:
image

OK, thanks. It looks like something about the way the projections were generated in ArcGIS is non-standard causing Python's spatial tools to have trouble working with them. For example, with a normal shapefile this is what happens with the first 3 lines of that code:

>>> import geopandas as gpd
>>> shapefile = "/path/to/file.shp"
>>> gdf = gpd.read_file(shapefile)
>>> gdf.crs.to_epsg()
32617

The number produced will differ depending on the projection.

I'd recommend trying to reproject both spatial layers into a specific EPSG code, which looks like it's probably 2193 for you (New Zealand Transverse Mercator 2000). I'd probably try to do this in ArcGIS to get out files with standard projection information since I'm not sure whether other tools will be able to work with this projection metadata.

You could also try changing the projection (without reprojecting) once you're in Python (e.g., gdf.set_crs(epsg=2193, inplace=True). As long as this is just a metadata problem this would probably fix your issue (if you also make the equivalent change for the raster), but personally I'd be more comfortable getting the files our of the tool I created them in with a clean chunk of projection metadata.

Closing this since we haven't heard back in a couple of months. @yby026 - definitely feel free to reopen if you come back to this.