eblondel / ows4R

R Interface for OGC Web-Services (OWS)

Home Page:https://eblondel.github.io/ows4R/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

WFS problems with geometry column and text conversion

jannes-m opened this issue · comments

Hi Emmanuel,
first, thank you again for developing OGC web services for R! I have played around a bit with your package, and stumbled across two potential bugs.

  1. When loading data into R, you seem to look for a geometry column (or tag), however, sometimes the column (tag) containing the geometry is unfortunately not named geometry, as is the case with this Dutch example where it is called "geometrie". The solution might be to use sf to read the downloaded GML file into R (see below, and FYI @Robinlovelace and @Nowosad):
library(ows4R)
#> Loading required package: geometa
#> Loading ISO 19139 XML schemas...
#> Loading ISO 19115 codelists...
library(sf)
#> Linking to GEOS 3.6.2, GDAL 2.3.2, proj.4 5.1.0
base_url = "http://geodata.nationaalgeoregister.nl/bag/wfs"
wfs = WFSClient$new(base_url, "1.0.0", logger = "INFO")
#> [ows4R][INFO] OWSGetCapabilities - Fetching http://geodata.nationaalgeoregister.nl/bag/wfs?service=WFS&version=1.0.0&request=GetCapabilities
caps = wfs$getCapabilities()
tmp = caps$findFeatureTypeByName("")
# find out about the available feature types
sapply(tmp, function(x) x$getName())
#> [1] "bag:ligplaats"       "bag:pand"            "bag:standplaats"    
#> [4] "bag:verblijfsobject" "bag:woonplaats"
# ok, let's download the national parcs of Bavaria
# ft = caps$findFeatureTypeByName("lfu:nationalpark")
ft = caps$findFeatureTypeByName("bag:ligplaats")
# download features
feats = ft$getFeatures()  
#> [ows4R][INFO] WFSGetFeature - Fetching http://geodata.nationaalgeoregister.nl/bag/wfs?service=WFS&version=1.0.0&typeName=bag:ligplaats&logger=INFO&request=GetFeature 
#> [ows4R][INFO] WFSDescribeFeatureType - Fetching http://geodata.nationaalgeoregister.nl/bag/wfs?service=WFS&version=1.0.0&typeName=bag:ligplaats&request=DescribeFeatureType
#> Error in if (element$getType() == "geometry") {: argument is of length zero
# obviously getFeatures() is looking for a geometry column, however, here it
# is named "geometrie", that's why it doesn't work this does not work
# However, it downloads the gml file to the temporary directory, hence,
# we can load the data into R ourselves
file = grep("gml", dir(tempdir()), value = TRUE)
file = file.path(tempdir(), file)
# assuming there is only one file
layer = read_sf(file)
plot(layer$geometrie)

Created on 2018-10-19 by the reprex package (v0.2.1)

  1. I also tried a German WFS. Here, it seems, there is some problem when using switch():
library(ows4R)
#> Loading required package: geometa
#> Loading ISO 19139 XML schemas...
#> Loading ISO 19115 codelists...
library(sf)
#> Linking to GEOS 3.6.2, GDAL 2.3.2, proj.4 5.1.0
base_url = "http://www.lfu.bayern.de/gdi/wfs/naturschutz/schutzgebiete?"
wfs = WFSClient$new(base_url, "1.0.0", logger = "INFO")
#> [ows4R][INFO] OWSGetCapabilities - Fetching http://www.lfu.bayern.de/gdi/wfs/naturschutz/schutzgebiete?service=WFS&version=1.0.0&request=GetCapabilities
caps = wfs$getCapabilities()
tmp = caps$findFeatureTypeByName("")
# find out about the available feature types
sapply(tmp, function(x) x$getName())
#> [1] "lfu:vogelschutzgebiet"          "lfu:naturpark"                 
#> [3] "lfu:naturschutzgebiet"          "lfu:landschaftsschutzgebiet"   
#> [5] "lfu:fauna_flora_habitat_gebiet" "lfu:biosphaerenreservat"       
#> [7] "lfu:nationalpark"
ft = caps$findFeatureTypeByName("lfu:nationalpark")
feats = ft$getFeatures()  
#> [ows4R][INFO] WFSGetFeature - Fetching http://www.lfu.bayern.de/gdi/wfs/naturschutz/schutzgebiete?service=WFS&version=1.0.0&typeName=lfu:nationalpark&logger=INFO&request=GetFeature 
#> [ows4R][INFO] WFSDescribeFeatureType - Fetching http://www.lfu.bayern.de/gdi/wfs/naturschutz/schutzgebiete?service=WFS&version=1.0.0&typeName=lfu:nationalpark&request=DescribeFeatureType
#> Error in switch(type, `xsd:string` = "character", `xsd:long` = "numeric", : EXPR must be a length 1 vector
# here, there seems to be a conversion error...
# still, we can load the data into R ourselves
file = grep("gml", dir(tempdir()), value = TRUE)
file = file.path(tempdir(), file)
layer = read_sf(file)
#> Warning in CPL_read_ogr(dsn, layer, as.character(options), quiet, type, :
#> GDAL Error 1: HTTP error code : 404
plot(layer$geometry)

Created on 2018-10-19 by the reprex package (v0.2.1)

Thanks, I will have a look. I already use "sf" as main binding for simple features. The problem is not really about the binding but the fact that WFS allows to return geometryless data. I will look to a better (more generic) way to handle this. Some of the recent GML improvements in geometa might help. I will let you know. 2d point seems to be related to a missing data type.

I'm also aware of some issues when dealing with national sources, there might be issues with parsing properly national CRS not parsable from their EPSG code.

The source you point out will be a good test case.

Wow great to see upstream contributions on this. Many thanks for quick-fire response @eblondel !

@jannes-m i'm going to have a look to it asap, i'm going to split your report into 2 separate tickets, to better track bugs/enhancements.

btw, looking at the code you run above, if you want to list the layers, there is a method "getFeatureTypes" that you can use:

sapply(caps$getFeatureTypes(), function(x){x$getName()})

@jannes-m your point 1 above is solved with #19 . For the 2d one, this requires an enhancement as it deals with feature type with type elements restrictions that i never had to deal with in WFS. I will look at time once i have free time.

Ok it should be ok now the both sources you've shared.

In case you get other issues, you can raise another ticket. You should not need to read through sf directly, ows4R is doing it for you but also checks your data against its schema, handled by the FeatureType description). So the natural binding for this WFS client is "sf" (being an implementation of OGC standard), and you can then run as(yourobject, "Spatial"). Note that in case you would have a geometryless dataset, the binding will be a simple data.frame.

Excellent, fantastic work!! Only this warning message might cause confusion when getting the Bavarian National Parcs:

Warning messages:
1: In CPL_read_ogr(dsn, layer, query, as.character(options), quiet,  :
  GDAL Error 1: HTTP error code : 404
2: st_crs<- : replacing crs does not reproject data; use st_transform for that 

I would probably suppress it since you are only using the getDefaulCRS() I suppose.

@jannes-m yes i've seen this warning, I will try to find sometimes to better understand why we get this, and suppress it if it's ok.

let me have a quick look at the source code but I guess you are using st_crs to set the CRS, and apparently there is already a crs but yes, let me take a look

in your code I found:

ows4R/R/WFSFeatureType.R

Lines 247 to 249 in c1557c2

if(hasGeometry){
ftFeatures <- sf::st_read(destfile, quiet = TRUE)
st_crs(ftFeatures) <- self$getDefaultCRS()

The warning message is issued since st_read() has already found a CRS in the Bavarian National Parc case. I don't know if this is usually the case with .gml files but I would guess so, hence you could just delete the st_crs() line. Just to be on the safe side you could also ask if the sf-object already has a CRS, and only if this is not the case, you assign the CRS. What do you think @Robinlovelace @Nowosad?

I've checked and the warning is not raised by this piece of code, I will investigate further and let you know. overwriting the CRS has no effect because it will be the same (normally) as the default CRS specified as per the service capabilities featuretype metadata.

Are you sure? Have a look at this reproducible example:

library(ows4R)
#> Loading required package: geometa
#> Loading ISO 19139 XML schemas...
#> Loading ISO 19115 codelists...
library(sf)
#> Linking to GEOS 3.7.0, GDAL 2.3.2, PROJ 5.2.0

base_url = "http://www.lfu.bayern.de/gdi/wfs/naturschutz/schutzgebiete?"
wfs = WFSClient$new(base_url, "1.0.0", logger = "INFO")
#> [ows4R][INFO] OWSGetCapabilities - Fetching http://www.lfu.bayern.de/gdi/wfs/naturschutz/schutzgebiete?service=WFS&version=1.0.0&request=GetCapabilities
caps = wfs$getCapabilities()
tmp = caps$findFeatureTypeByName("")
# find out about the available feature types
# sapply(tmp, function(x) x$getName())
ft = caps$findFeatureTypeByName("lfu:nationalpark")
feats = ft$getFeatures()  
#> [ows4R][INFO] WFSGetFeature - Fetching http://www.lfu.bayern.de/gdi/wfs/naturschutz/schutzgebiete?service=WFS&version=1.0.0&typeName=lfu:nationalpark&logger=INFO&request=GetFeature 
#> [ows4R][INFO] WFSDescribeFeatureType - Fetching http://www.lfu.bayern.de/gdi/wfs/naturschutz/schutzgebiete?service=WFS&version=1.0.0&typeName=lfu:nationalpark&request=DescribeFeatureType
#> Warning in CPL_read_ogr(dsn, layer, query, as.character(options), quiet, :
#> GDAL Error 1: HTTP error code : 404
#> Warning: st_crs<- : replacing crs does not reproject data; use st_transform
#> for that

# now reading the features manually (using more or less your code)
destfile = grep(".gml", dir(tempdir()), value = TRUE)
destfile = file.path(tempdir(), destfile)
ftFeatures = sf::st_read(destfile, quiet = TRUE)
st_crs(ftFeatures) = ft$getDefaultCRS()
#> Warning: st_crs<- : replacing crs does not reproject data; use st_transform
#> for that

# geographic vs. projected CRS
ft_wgs84 = st_transform(ftFeatures, 4326)
# now assigning the same CRS again, does not lead to a warning message
st_crs(ft_wgs84) = 4326

Created on 2018-10-22 by the reprex package (v0.2.1)

Could be that sf does not complain if you are "overwriting" a geographic CRS with the same geographic CRS but does complain if you do so with a projected CRS but this is just educated guessing (see also last three lines of the reproducible example) but @edzer will surely know!

I don't get the second warning you get: st_crs<- : replacing crs does not reproject data; use st_transform for that

only the one dealing with GDAL. For safety, i've added a simple control on crs ab5e1f2, in case, and overwrite it only if no crs is found. Can you please reinstall, test and let me know, normally you should not get the 2d warning, but the 1st warning will probably remain...

st_crs<- will emit a warning if the crs assigned is different from the one present. Equality between two CRS is determined by GDAL, and does not always what we think it should do.

Hi,
I tried to use this package in different cases and compliments is a very good work.
At the moment I have a problem with the connection of all the layers in https://data.lter-europe.net/geoserver/deims/wfs

wfseLTER <- "https://data.lter-europe.net/geoserver/deims/wfs"
wfs <- ows4R::WFSClient$new(
  url = wfseLTER, 
  serviceVersion = "1.0.0",
  logger = "INFO"#,
  # user = "",
  # pwd = ""
)
capseLTER <- wfs$getCapabilities()
centroidsDEIMSAllSites <- capseLTER$findFeatureTypeByName("deims:deims_all_sites", exact = TRUE)
centroidsFeature <- centroidsDEIMSAllSites$getFeatures()
Error in if (element$getType() == "geometry") { : 
  argument is of length zero

any solution of that?

Hi @oggioniale I've fixed the issue through #37
Can you reinstall the package from Github and retest your code? It should work now

thanks! It work now.