Aggregating gridded data returns multiple results per country
MariusBug opened this issue · comments
Hello everyone,
I am trying to aggregate fine-graned global climate grid data to multipolygons (first countries, later on the subnational level).
To accomplish this, I now have two shapefiles at hand, one containing the disaggregated climate data, the other the country geometries.
A glimpse at the climate data:
> head(mst_1950_sf)
Simple feature collection with 6 features and 2 fields
geometry type: POINT
dimension: XY
bbox: xmin: -179.75 ymin: 66.75 xmax: -179.75 ymax: 71.25
CRS: +proj=longlat +ellps=WGS84 +no_defs
A tibble: 6 x 3
year moisture geometry
<dbl> <dbl> <POINT [°]>
1 1950 0.689 (-179.75 71.25)
2 1950 0.752 (-179.75 68.75)
3 1950 0.765 (-179.75 68.25)
4 1950 0.638 (-179.75 67.75)
5 1950 0.811 (-179.75 67.25)
6 1950 0.728 (-179.75 66.75)
And the second shapefile:
> world2015 %>% dplyr::select(ISONAME, geometry) %>% head()
Simple feature collection with 6 features and 1 field
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -175.36 ymin: -21.26806 xmax: -53.98612 ymax: 12.1975
CRS: +proj=longlat +ellps=WGS84 +no_defs
ISONAME geometry
0 Guyana MULTIPOLYGON (((-58.17262 6...
1 Suriname MULTIPOLYGON (((-55.12796 5...
2 Trinidad and Tobago MULTIPOLYGON (((-61.07945 1...
3 Venezuela MULTIPOLYGON (((-66.31029 1...
4 Samoa MULTIPOLYGON (((-172.5965 -...
5 Tonga MULTIPOLYGON (((-175.1453 -...
Using sp's aggregate, I wanted to take the mean moisture of each country which seemed to work as expected:
statesAg <- aggregate(mst_1950_sf["moisture"], by = world2015$geometry, mean)
plot(statesAg)
Looked good to me - but aggregating this way drops all other variables and I don't see a way to retain other variables of interest.
> names(statesAg)
[1] "moisture" "geometry"
Thus, I merged the new data with my previous dataset to get the other variables of interest back. When checking the results, however, I noted that for many countries several "means" had been returned.
> statesAg <- st_join(world2015, statesAg)
although coordinates are longitude/latitude, st_intersects assumes that they are planar
> statesAg %>% dplyr::select(ISONAME, moisture) %>% head(n=20)
Simple feature collection with 20 features and 2 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -175.36 ymin: -55.05167 xmax: -53.65001 ymax: 12.1975
CRS: +proj=longlat +ellps=WGS84 +no_defs
First 10 features:
ISONAME moisture geometry
0 Guyana 0.129646465 MULTIPOLYGON (((-58.17262 6...
0.1 Guyana 0.207329932 MULTIPOLYGON (((-58.17262 6...
0.2 Guyana 0.130700565 MULTIPOLYGON (((-58.17262 6...
0.3 Guyana -0.009078202 MULTIPOLYGON (((-58.17262 6...
1 Suriname 0.129646465 MULTIPOLYGON (((-55.12796 5...
1.1 Suriname 0.207329932 MULTIPOLYGON (((-55.12796 5...
1.2 Suriname -0.009078202 MULTIPOLYGON (((-55.12796 5...
2 Trinidad and Tobago 0.231250000 MULTIPOLYGON (((-61.07945 1...
3 Venezuela 0.129646465 MULTIPOLYGON (((-66.31029 1...
3.1 Venezuela 0.130700565 MULTIPOLYGON (((-66.31029 1...
What has happened here? Which mean is the actual mean of the respective countries, and which ones was the one plotted above? I would be very glad about any hint about what's been going on.
Thanks!
Are you confusing sp
and sf
here?
Yes, sorry! Looks like I mistook the syntax for sp's.