geoxarray / geoxarray

Geolocation utilities for xarray

Home Page: https://geoxarray.github.io/

Project Status

djhoese opened this issue

A lot of things have been changing in the scientific python ecosystem regarding geolocated data. Based on some of these changes, I consider this project to be in limbo until other projects settle on what they support and can do. Basically, I'm not sure I can dedicate time to this project when I know that large chunks of it may change in the near future (2019). Below I'll describe things I'm aware of. If you know of any other activities by other related projects that may affect this project, please comment below. Feel free to ask questions here as well.

Pangeo

@rabernat of the Pangeo community has been gathering information from various "gridding" packages in the Python community. The beginning of this discussion is in pangeo-data/pangeo#356 and has since continued over video conference meetings and the Pangeo Gitter. The goal of these conversations has been to collect all of the various uses of "grids": what they are and how they are defined. If efforts from the various available packages can be combined, then they should be. @rabernat is planning a blog post on the topic. Hopefully that can be linked here when it is complete.

IMO geoxarray is one avenue that could be taken to have one package that assists all of these efforts. However, my original goals for geoxarray and the type of data it was meant to support are simpler than the use cases I've heard of so far through these discussions.

PROJ versus WKT versus Other

The PROJ C library, pyproj, rasterio, libgeotiff, and GDAL have been seeing a lot of changes regarding how a coordinate reference system (CRS) can be represented, where these representations get analyzed, and what the best defaults for libraries and users are. If I recall correctly, @snowman2 was a big proponent of using WKT (Well Known Text) over the PROJ.4 string (or I guess it should just be "PROJ" now). If PROJ strings do not properly/completely describe a CRS, then it may be best to use something else that does. If the community (see libraries above) is leaning towards using WKT over PROJ, then perhaps geoxarray should use that internally.

These libraries, or maybe just PROJ, have transitioned to different types of objects (CRS maybe?) to represent a CRS. If this can serve every purpose that I wanted the pycrs project to serve then I don't see a reason to use pycrs except for the "pure python" aspect. If everyone else is using PROJ or some other low-level library then so should geoxarray. I've been using PROJ strings for the last 8 or so years so I'm open to suggestions from others and feedback on anything I'm misunderstanding.

2D dimensions in xarray

There have been some efforts to better support 2D dimensions/coordinates in xarray. This would be a big deal for geoxarray for ungridded/non-uniform data (2D longitude/latitude arrays). From what I've been pointed to through the pangeo discussions I see:

Now while this doesn't affect gridded data, which should be using Affine-like descriptions or 1D x/y dimensions, this still has a big impact on what people can do with xarray and what I would want geoxarray to be able to do.
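
As a rough illustration of the distinction above (plain Python, not geoxarray or xarray API), here is a minimal sketch of how an Affine-like description with no rotation expands into 1D x/y coordinates. The function name `axis_coords` and the grid values are made up for the example. Non-uniform data has no compact description like this, which is why it ends up needing full 2D longitude/latitude arrays.

```python
def axis_coords(origin, pixel_size, count):
    """Return pixel-center coordinates along one axis of a regular grid.

    origin is the outer edge of the first pixel and pixel_size is the
    signed step between pixels (negative for a north-up y axis).
    """
    return [origin + (i + 0.5) * pixel_size for i in range(count)]


# Hypothetical 4x3 grid: upper-left corner at (100.0, 50.0), 10-unit pixels.
x = axis_coords(100.0, 10.0, 4)   # one value per column
y = axis_coords(50.0, -10.0, 3)   # one value per row, decreasing down the image

# A gridded dataset only needs len(x) + len(y) coordinate values...
gridded_values = len(x) + len(y)
# ...while 2D lon/lat arrays need two coordinate values for every pixel.
ungridded_values = 2 * len(x) * len(y)
```

For this small grid that is 7 coordinate values versus 24, and the gap grows with the product of the dimensions.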

</braindump>

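The session below shows both the format conversion capabilities of pyproj.CRS and an example of the information lost when a CRS is converted to a PROJ string instead of WKT. After the PROJ string round trip the name becomes "unknown", the area of use is undefined, and the axis order changes from latitude/longitude to longitude/latitude, while the WKT round trip preserves all of these: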

>>> from pyproj import CRS
>>> crs = CRS("epsg:4326")
>>> crs
<CRS: epsg:4326>
Name: WGS 84
Ellipsoid:
- semi_major_metre: 6378137.00
- semi_minor_metre: 6356752.31
- inverse_flattening: 298.26
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Prime Meridian:
- longitude: 0.0000
- unit_name: degree
- unit_conversion_factor: 0.01745329
Axis Info:
- Geodetic latitude[Lat] (north) EPSG:9122 (degree)
- Geodetic longitude[Lon] (east) EPSG:9122 (degree)

>>> CRS(crs.to_proj4())
<CRS: +proj=longlat +datum=WGS84 +no_defs +type=crs>
Name: unknown
Ellipsoid:
- semi_major_metre: 6378137.00
- semi_minor_metre: 6356752.31
- inverse_flattening: 298.26
Area of Use:
- UNDEFINED
Prime Meridian:
- longitude: 0.0000
- unit_name: degree
- unit_conversion_factor: 0.01745329
Axis Info:
- Longitude[lon] (east) EPSG:9122 (degree)
- Latitude[lat] (north) EPSG:9122 (degree)

>>> CRS(crs.to_wkt())
<CRS: GEOGCRS["WGS 84",DATUM["World Geodetic System 1984 ...>
Name: WGS 84
Ellipsoid:
- semi_major_metre: 6378137.00
- semi_minor_metre: 6356752.31
- inverse_flattening: 298.26
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Prime Meridian:
- longitude: 0.0000
- unit_name: degree
- unit_conversion_factor: 0.01745329
Axis Info:
- Geodetic latitude[Lat] (north) EPSG:9122 (degree)
- Geodetic longitude[Lon] (east) EPSG:9122 (degree)

More information about pyproj.CRS is at https://pyproj4.github.io/pyproj/html/api/crs.html

Hopefully helpful.

I'm going to close this issue. The original purpose was to define the initial design of geoxarray, but I've now released the 0.1.0 version so the initial design is at least a little complete. No need for this to hang around in my opinion.