georust / wkt

Rust read/write support for well-known text (WKT)

Home Page:https://crates.io/crates/wkt

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Reconsider design of core structures

frewsxcv opened this issue · comments

Right now, it's possible to model this with the current design:

LineString(vec![
    Coord { x: 1, y: 1, None, None },
    Coord { x: 1, y: 1, Some(1), None },
])

...which translates to something like:

LINESTRING ( 1 1, 1 1 1 )

...which doesn't make any sense since the first item has two dimensions but the second item has three dimensions.

We can change the core structures to a different design to enforce all items having the same dimensions:

(Keep in mind, this design is not final and is just a rought draft)

pub trait Dimension {
    fn from_quad(x: f64, y: f64, z: Option<f64>, m: Option<f64>) -> Self;

    fn to_quad(&self) -> (f64, f64, Option<f64>, Option<f64>);
}

pub struct Xy(f64, f64);
impl Dimension for Xy {
    fn from_quad(x: f64, y: f64, _: Option<f64>, _: Option<f64>) -> Self {
        Xy(x, y)
    }

    fn to_quad(&self) -> (f64, f64, Option<f64>, Option<f64>) {
        (self.0, self.1, None, None)
    }
}

pub struct Xyz(f64, f64, f64);
impl Dimension for Xyz {
    fn from_quad(x: f64, y: f64, z: Option<f64>, _: Option<f64>) -> Self {
        Xyz(x, y, z.unwrap())
    }

    fn to_quad(&self) -> (f64, f64, Option<f64>, Option<f64>) {
        (self.0, self.1, Some(self.2), None)
    }
}

pub struct Xym(f64, f64, f64);
impl Dimension for Xym {
    fn from_quad(x: f64, y: f64, _: Option<f64>, m: Option<f64>) -> Self {
        Xym(x, y, m.unwrap())
    }

    fn to_quad(&self) -> (f64, f64, Option<f64>, Option<f64>) {
        (self.0, self.1, None, Some(self.2))
    }
}

pub struct Xyzm(f64, f64, f64, f64);
impl Dimension for Xyzm {
    fn from_quad(x: f64, y: f64, z: Option<f64>, m: Option<f64>) -> Self {
        Xyzm(x, y, z.unwrap(), m.unwrap())
    }

    fn to_quad(&self) -> (f64, f64, Option<f64>, Option<f64>) {
        (self.0, self.1, Some(self.2), Some(self.3))
    }
}

// Concrete types

pub enum Point<D: Dimension> {
    Empty,
    Value(D),
}

pub enum LineString<D: Dimension> {
    Empty,
    Value(Vec<D>),
}

pub enum Geometry<D: Dimension> {
    Empty,
    Point(Point<D>),
    LineString(LineString<D>),
    // ...
    GeometryCollection(Vec<Geometry<D>>),
}

playpen

commented

The above change does not correspond to the wtk specification correctly as the Dimension is not sufficient to specify the allowed types. For example for points we can have:

  • Point
  • PointZ
  • PointM
  • PointZM

Both the Z and M version are dimension 3. Instead I think want there to be 4 fundamental coord types. Basically any use of Option for Z or M should be avoided.

pub struct Coord<T>{
    pub x: T,
    pub y: T,
}
pub struct CoordZ<T>{
    pub x: T,
    pub y: T,
    pub z: T,
}
pub struct CoordM<T>{
    pub x: T,
    pub y: T,
    pub m: T,
}
pub struct CoordZM<T>{
    pub x: T,
    pub y: T,
    pub z: T,
    pub m: T,
}

So a line string would become:

pub enum LineString<CT: CoordType> {
    Empty,
    Value(Vec<CT>),
}

@grivy Your proposed design looks good to me. What does the CoordType trait look like?

commented

hmm..
So actually while I do think this library should provide data types as default, ideally I would not want to use them as I do not want to copy my data into these temporary structures and then to WKT.

Like money, coordinates should not be stored in floating points, as we want to be able to compare and we do not want to loose precision. In one project I have lat/lon in i64's (need to divide by (lat_i64 / 8192 / 60 / 60) to get the floating point value. And another project in a local Cartesian reference frame is in mm (y_i32 / 1000) to get float value.

So while I think it is good to provide default structs it should also be possible to use the users' own Coordinate type for direct translation to WKT. So the user would have to implement the CoordZ trait and then implement the x(), y(), z() functions returning floats.

Actually when writing to WKT (unlike WKB) there should not be any need to loss precision due to float representation. For example the y_i32 can be written to a string first, and then have the decimal point inserted later 654356 -> "654356" => "654.356". This would allow the user to prevent any loss in precision.

Or is there a maximum to the number of decimals in the WKT standard?

For this scenario I suppose the user would implement a trait for example CoordM and provide the functions: x_as_string(), y_as_string(), m_as_string() oh and x_from_string(), y_from_string(), m_from_string()

Not sure yet what the implications are on LineString and user provided LineString structs.

Maybe this library can provide some inspiration:
https://docs.rs/postgis/0.4.0/postgis/ewkb/index.html
Note that this is WKB so they have to do i64

What about not having a wkt-specific internal representation and instead having LooksLikeALinestring, LooksLikeAPoint traits and a Wkt trait that provides the wkt(&self) -> String function implemented for every type that implements such traits. That way the user can represent their types however they want and get a nice WKT without much effort.

And of course impls for every type in geo_types...

Although I'm not sure how parsing would look like in this scenario

You might also be interested in https://github.com/georust/geozero

Which allows things like:

geojson.to_wkt()

I looked at it when I was trying to find options but by reading the readme I didn't get the idea of how it would help in my case. But you're right, it might be helpful

I'm still kind of getting the hang of GeoZero myself, but I think the idea is that you can implement:

impl GeozeroGeometry for YOURTYPE or impl GeozeroDatasource for YOURTYPE

and then you can serialize your type to any of the supported output formats (wkt, geojson, gdal, etc.)