GeoNode / geonode-importer

Dataset overwriting

sysnux opened this issue

Hi,

I have installed the app on a test instance of GeoNode: I have successfully imported GeoPackage and GeoJSON datasets through the UI, thanks for that!

I also imported successfully via the API (/api/v2/uploads/upload/) with a small test script.

Now I'd like to update a dataset through the API, with no success. The README says overwrite_existing_layer should be supported through the API, but the link to the docs is broken. I tried a new upload via the API (/api/v2/uploads/upload/) with the overwrite_existing_layer=True parameter, but this creates a new dataset instead of updating the existing one.

I also tried PATCH /api/v2/datasets/xx/replace and got:
<Response [500]> {"success":false,"errors":["Not a valid string."],"code":"invalid_dataset_exception"}

What did I do wrong?

So I have been able to update a dataset by changing:

diff --git a/importer/handlers/common/vector.py b/importer/handlers/common/vector.py
index 3a56e16..a26a2b3 100644
--- a/importer/handlers/common/vector.py
+++ b/importer/handlers/common/vector.py
@@ -356,9 +356,7 @@ class BaseVectorFileHandler(BaseHandler):
 
         dataset_exists = dataset_available.exists()
 
-        if dataset_exists and should_be_overwritten:
-            alternate = dataset_available.first().alternate
-        elif not dataset_exists:
+        if not dataset_exists or dataset_exists and should_be_overwritten:
             alternate = layer_name
         else:
             alternate = create_alternate(layer_name, str(_exec_obj.exec_id))

This works for me, but I'm new to GeoNode development, and I'd appreciate a comment from someone who knows the code better.

Hi @sysnux
To overwrite an existing layer, it is enough to call the upload API with overwrite_existing_layer=True as an additional parameter.
NOTE: make sure that the layer name you provide matches the alternate that was generated. If it is the only dataset with that name, the alternate equals the layer name; if several layers share the same name, the system appends an MD5 hash to make each alternate unique.

For example:

  • layer_name = stations
  • No dataset exists with the alternate workspace:layer_name -> layer_name = stations, alternate = stations
  • A dataset already exists with the alternate workspace:layer_name -> layer_name = stations, alternate = stations_018735210827d5063ed0a1

So if you want to overwrite the second layer, the layer name you provide must match stations_018735210827d5063ed0a1.
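A minimal sketch of such a call, for illustration only. The endpoint and the overwrite_existing_layer flag (README spelling) are taken from this thread; the "base_file" multipart field, the basic-auth credentials and the helper names are assumptions, and how the layer name is derived from the uploaded file should be checked against your importer version.

```python
from pathlib import Path

UPLOAD_ENDPOINT = "/api/v2/uploads/upload/"  # endpoint quoted in this thread


def build_overwrite_upload(base_url):
    """Return (url, form_fields) for an upload that asks for an overwrite."""
    url = base_url.rstrip("/") + UPLOAD_ENDPOINT
    # Flag name as spelled in the README; sent as a plain form value.
    form_fields = {"overwrite_existing_layer": "True"}
    return url, form_fields


def send_upload(base_url, file_path, user, password):
    """Post the file. Needs the third-party `requests` package."""
    import requests  # imported lazily so the helper above stays stdlib-only

    url, form_fields = build_overwrite_upload(base_url)
    with open(file_path, "rb") as handle:
        # "base_file" is an ASSUMED multipart field name, not confirmed here.
        files = {"base_file": (Path(file_path).name, handle)}
        return requests.post(
            url, data=form_fields, files=files, auth=(user, password)
        )
```

The layer carried by the uploaded file has to resolve to the alternate of the dataset you want to replace (stations_018735210827d5063ed0a1 in the example above), otherwise a new dataset is created.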

With your change, you risk overwriting an unwanted layer: two layers can have the same name but contain different data.
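To make the difference concrete, here is a rough sketch of the two branchings from the diff, written as plain functions. It is not the importer's code: the dataset lookup is replaced by a list of alternates passed in by the caller, and make_unique_alternate is a hypothetical stand-in for create_alternate whose hashing scheme is assumed.

```python
import hashlib


def make_unique_alternate(layer_name, exec_id):
    # Hypothetical stand-in for create_alternate; the real scheme may differ.
    digest = hashlib.md5(f"{layer_name}_{exec_id}".encode("utf-8")).hexdigest()
    return f"{layer_name}_{digest}"


def original_alternate(layer_name, found_alternates, overwrite, exec_id):
    """Branching of the unpatched code: reuse the alternate that was found."""
    if found_alternates and overwrite:
        return found_alternates[0]
    if not found_alternates:
        return layer_name
    return make_unique_alternate(layer_name, exec_id)


def patched_alternate(layer_name, found_alternates, overwrite, exec_id):
    """Branching of the proposed patch: always fall back to the plain name."""
    if not found_alternates or overwrite:
        return layer_name
    return make_unique_alternate(layer_name, exec_id)


found = ["stations_018735210827d5063ed0a1"]
print(original_alternate("stations", found, True, "exec-1"))
print(patched_alternate("stations", found, True, "exec-1"))
```

With overwrite enabled, the first call targets the dataset that the lookup actually found, while the second targets whatever dataset happens to be called stations.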

Could you please roll back the changes and follow the steps above to see if it works?
Thanks

I'll close this due to the lack of feedback.

Hello,
I am reopening this ticket because I think I am facing the same problem with a GeoTIFF file (I see that there is no end-to-end test case for that).
Overwriting seems to work for GeoPackage, SHP and GeoJSON (tests OK), but not for raster files.
The error I get is:
"Failed to save to Geoserver catalog: 500, Store 'geonode:sdfoklfqls' already exists in workspace 'geonode'"

I tested with a new GeoTIFF name to make sure that there are not multiple layers with the same name.

So the first import works fine.
The second import fails.
The rollback task then removes the raster from GeoServer, and I can publish again on the now-empty GeoServer.

While debugging, "_publisher.get_resource(alternate)" seems to always return None for the GeoTIFF.
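One way to cross-check this outside the importer is to ask GeoServer directly whether the coverage store from the error message is still there. The sketch below uses the GeoServer REST coveragestores endpoint; the GeoServer base URL, the credentials and the helper names are assumptions for illustration.

```python
import base64
import urllib.error
import urllib.request
from urllib.parse import quote


def coveragestore_url(geoserver_url, workspace, store):
    """Build the REST URL of a raster (coverage) store."""
    base = geoserver_url.rstrip("/")
    return (
        f"{base}/rest/workspaces/{quote(workspace)}"
        f"/coveragestores/{quote(store)}.json"
    )


def coveragestore_exists(geoserver_url, workspace, store, user, password):
    """Return True if the store exists, False on a 404 response."""
    request = urllib.request.Request(
        coveragestore_url(geoserver_url, workspace, store)
    )
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # anything else (401, 500, ...) is a real problem
```

For the error above, this would be called with workspace "geonode" and store "sdfoklfqls". If the store exists while get_resource returns None, the lookup is probably not finding raster resources.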