edzer / sp

Classes and methods for spatial data

Home Page: http://edzer.github.io/sp/

disaggregate() does not seem to add the necessary "comment" attribute to identify holes if Polygons does not contain exactly 1 polygon with 0 or 1 holes

thebat137 opened this issue

I am working with a geographic dataset with complex borders. The data is stored as a SpatialPolygonsDataFrame. Each Polygons object in the data frame may be composed of multiple disjoint polygons, and each polygon may have multiple holes. I want to disaggregate each of the Polygons objects into its individual disjoint polygons and construct an adjacency matrix for all the disjoint components, and I was using disaggregate() to do this. However, when I run gTouches() on the disaggregated data, in order to compute the adjacencies, I get a number of warnings like this:

Warning in RGEOSBinPredFunc(spgeom1, spgeom2, byid, "rgeos_touches") : Polygons object missing comment attribute ignoring hole(s). See function createSPComment.

Looking at the Polygons "comment" attributes in the SpatialPolygonsDataFrame output by disaggregate(), I see that the only comment values are "0" (indicating a single polygon with no holes), "0 1" (indicating a single polygon with a single hole), and NULL (apparently no comment was written). Since I know my dataset contains several Polygons objects composed of multiple disjoint regions, as well as several Polygons that contain more than one hole, this is not the expected result. In reading the disaggregate() code in the repository here (i.e. explodePolygons()), I also can't see where the comment is being added for the cases where a Polygons object has more than one part or more than one hole. It actually seems like the comment is getting carried along almost accidentally in the few cases that do get comments, and neglected otherwise.
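For readers unfamiliar with the comment format described above, here is a minimal base-R sketch of how such a string is laid out: one entry per ring, "0" for an exterior ring, and for a hole the 1-based index of the exterior ring it belongs to. The helper name comment_from_hole_flags is hypothetical (it is not part of sp or rgeos), and it assumes that every hole ring directly follows the exterior ring that contains it, which real data need not satisfy.

```r
# Hypothetical helper, not part of sp or rgeos.
# Assumption: each hole ring comes right after the exterior ring containing it.
comment_from_hole_flags <- function(is_hole) {
  stopifnot(is.logical(is_hole), length(is_hole) > 0, !is_hole[1])
  owner <- integer(length(is_hole))
  last_exterior <- 0L
  for (i in seq_along(is_hole)) {
    if (is_hole[i]) {
      # A hole records the index of the exterior ring it belongs to
      owner[i] <- last_exterior
    } else {
      # An exterior ring is recorded as 0 and becomes the current owner
      last_exterior <- i
      owner[i] <- 0L
    }
  }
  paste(owner, collapse = " ")
}

comment_from_hole_flags(FALSE)                               # "0"
comment_from_hole_flags(c(FALSE, TRUE))                      # "0 1"
comment_from_hole_flags(c(FALSE, TRUE, TRUE, FALSE, TRUE))   # "0 1 1 0 4"
```

The last call shows a case the issue says goes missing: two disjoint parts, the first with two holes and the second with one.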

Assuming I'm not failing to understand the code and the desired behavior (entirely possible, as I am new at working with this software!), this seems suboptimal to me. My dataset is pretty well-behaved (despite its complexity), so I should be able to fix my issues with judicious application of createPolygonsComment. But I had a heck of a time figuring out what was going wrong with gTouches, since Polygons comment management appears to be a pretty obscure field (and createSPComment wasn't working for me, for whatever reason). So it seems like it might be better if disaggregate() just parses and passes along the comments from its input correctly, or, if it's absolutely necessary to not create comments, passes nothing and warns clearly in the manual that comments and associated hole information are being lost. Passing along comments in some cases but not in others seems like kind of the worst of both worlds.
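As a rough illustration of the manual workaround mentioned above, the sketch below builds a toy Polygons object with sp and sets its comment by hand using base R's comment(). It is not the poster's test file: the square() helper and the coordinates are made up for illustration, and the comment string is written out directly on the assumption that hole ownership is already known. rgeos::createPolygonsComment, which the thread refers to, would derive that string from the geometry instead.

```r
library(sp)

# Hypothetical helper: a closed square ring with its lower-left corner at (x0, y0)
square <- function(x0, y0, size, hole = FALSE) {
  Polygon(cbind(c(x0, x0 + size, x0 + size, x0, x0),
                c(y0, y0, y0 + size, y0 + size, y0)),
          hole = hole)
}

# Ring order: exterior A, a hole inside A, then a disjoint exterior B
pls <- Polygons(list(square(0, 0, 10),
                     square(2, 2, 2, hole = TRUE),
                     square(20, 0, 5)),
                ID = "a")
toy <- SpatialPolygons(list(pls))

# Assumption: ownership is known, so ring 2 is recorded as a hole of ring 1
comment(toy@polygons[[1]]) <- "0 1 0"

# Read the comments back; a missing comment is reported as NA instead of NULL
comments <- vapply(toy@polygons, function(p) {
  cm <- comment(p)
  if (is.null(cm)) NA_character_ else cm
}, character(1))
comments
```

Running the same vapply() call on the output of disaggregate() is one way to list which Polygons objects ended up without a comment.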

I've attached a set of tests I wrote to demonstrate desired/undesired behavior: test.txt. My R version is 3.4.0, my sp version is 1.2-4, my rgeos version is 0.3-23 (SVN revision 546), and my GEOS runtime version is 3.5.1-CAPI-1.9.1 r4246. I am using Debian Release 9.0 with kernel version 4.9.0-2-amd64. Hope this helps; please let me know if you need more info.

Thanks; the hole and comment hell of sp was one of the motivations to start from scratch in package sf. There, you can do

> nc = read_sf(system.file("gpkg/nc.gpkg", package="sf"))
> nc %>% nrow
[1] 100
> nc %>% st_cast("POLYGON") %>% nrow
[1] 108
Warning message:
In st_cast.sf(., "POLYGON") :
  repeating attributes for all sub-geometries for which they may not be constant

disaggregate was contributed by Robert Hijmans; you may want to contact him directly, or send your issue to r-sig-geo.

Thanks, @edzer. I have posted this issue to the r-sig-geo mailing list here. In the meantime, sounds like I should check out sf. :)

See PR #28

PR #29 works for both the toy test cases and the actual data. Thanks, @rsbivand and @edzer!