planetlabs / gpq

Utility for working with GeoParquet

Home Page: https://planetlabs.github.io/gpq/

More flexibility for convert?

cholmes opened this issue · comments

The new convert stuff works great. It seems to handle only WKB; it would be great if it could handle WKT as well.

The other great addition would be to enable it to use an alternate geometry column by supplying the column name - often data isn't named 'geometry'.

@cholmes - can you provide a parquet file with a WKT geometry column that doesn’t work with convert? (The tests cover this case, but I believe there could still be issues to fix.)

Ah cool, so it's supposed to work? I'll dig in again. I tried one, but I was changing a lot of variables, so there's some chance the column wasn't named 'geometry' or something else was off, and I didn't see anything in the CLI feedback saying that WKT was supported. Will report back with either a parquet file with WKT that's not working or confirmation that it does. Thanks!

Ok, test file is at https://storage.googleapis.com/open-geodata/ch/ookla_wkt_small.parquet

Looking at it, I'm thinking it might be because the column is a 'varchar'? Versus a string or something? The error I got was "gpq: error: trouble reading geometry as WKB: wkb: invalid data", which is probably why I thought it only handled WKB.

The original file I was working with was from https://registry.opendata.aws/speedtest-global-performance/ - I just took theirs and tried to rename the 'tile' column to 'geometry', as it looked like WKT. You can get a copy of the original file I was working with at https://storage.googleapis.com/open-geodata/ch/2019-01-01_performance_fixed_tiles.parquet (it's about 200 MB; the one above I made by selecting only 200 rows after changing the geometry column name).

My steps in duckdb:

create table ookla_wkt AS select quadkey, tile AS geometry, avg_u_kbps, avg_lat_ms, tests, devices FROM '2019-01-01_performance_fixed_tiles.parquet';
copy (select * from ookla_wkt LIMIT 200) to 'ookla_wkt_small.parquet' (FORMAT PARQUET);
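
For anyone wanting to sanity-check whether values in a column like the renamed 'tile' column look like WKT text or WKB bytes before running convert, here is a rough sketch in Python. It is not how gpq detects the encoding; the function name guess_geometry_encoding is made up for illustration, and the heuristic only looks at the leading geometry-type keyword (for WKT) or the leading byte-order byte (for WKB).

```python
import re

# Geometry type keywords from the WKT format, matched case-insensitively
# at the start of the value.
_WKT_PREFIX = re.compile(
    r"^\s*(POINT|LINESTRING|POLYGON|MULTIPOINT|MULTILINESTRING|"
    r"MULTIPOLYGON|GEOMETRYCOLLECTION)\b",
    re.IGNORECASE,
)


def guess_geometry_encoding(value):
    """Hypothetical helper: return 'wkt', 'wkb', or 'unknown' for one value."""
    if isinstance(value, str):
        return "wkt" if _WKT_PREFIX.match(value) else "unknown"

    if isinstance(value, (bytes, bytearray)) and len(value) >= 5:
        # WKB begins with a byte-order flag: 0 = big endian, 1 = little endian.
        if value[0] in (0, 1):
            return "wkb"
        # Text may also arrive as raw bytes, so try reading it as WKT.
        try:
            text = bytes(value).decode("ascii")
        except UnicodeDecodeError:
            return "unknown"
        return "wkt" if _WKT_PREFIX.match(text) else "unknown"

    return "unknown"


if __name__ == "__main__":
    # A WKT polygon as a string, and a little-endian WKB point (1 1) as bytes.
    print(guess_geometry_encoding("POLYGON((0 0, 1 0, 1 1, 0 0))"))
    print(guess_geometry_encoding(
        bytes.fromhex("0101000000000000000000f03f000000000000f03f")
    ))
```

Running it over a handful of values pulled from the column would at least show whether the data is text that a WKB parser is bound to reject.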