biodiv / anycluster

Server-side clustering of map markers for (Geo)Django

Compatibility with MySQL

jelenak opened this issue

Hello,
We want to run your project on MySQL, but there is a lot of PostgreSQL-specific SQL in the code. Is there any chance of running this project on MySQL too? :)

I took a look at the mysql spatial extensions and can say yes, it is possible, but with limitations (e.g. centroids instead of kmeans). I cannot say how fast it will be.
Nevertheless, this should be done in a fork like anycluster-mysql. We can't use Django's ORM for at least some queries as it does not cover the operations we need in these cases.
I don't expect the port to MySQL to be much work, but I am currently very busy with the next anycluster release and don't have the time to do it myself. If you need it very soon and are willing to try porting it, I can assist.

I just committed a larger update. I recommend merging with this.

I have already merged your new commits.
Maybe you can help with k-means integration in MySQL, or give me some hints? :)

I will take care of all the other places where PostgreSQL-specific SQL is used. I also think this fork will be very helpful for other people.

Thanks,
Jelena

I will send you a query for pin-based clustering on MySQL. You will need MySQL's spatial extension. Furthermore, from what I have read, spatial indexing (which is important for clustering) on InnoDB tables has only been available since MySQL server version 5.7.4. Before that release, only MyISAM tables supported spatial indexing.
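The engine choice described above can be sketched as a small helper. This is only an illustration: the function names are hypothetical, the 5.7.4 threshold is taken from the comment above rather than verified, and the parser assumes an Oracle MySQL version string such as the one returned by `SELECT VERSION()` (MariaDB version numbers follow a different scheme).

```python
import re

# Threshold as reported in the comment above. Treat it as an assumption and
# check it against the MySQL release notes for your server.
INNODB_SPATIAL_SINCE = (5, 7, 4)


def parse_version(version_string):
    """Turn a string like '5.6.21-log' into the tuple (5, 6, 21)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (version_string,))
    return tuple(int(part) for part in match.groups())


def engine_for_spatial_index(version_string, innodb_since=INNODB_SPATIAL_SINCE):
    """Pick a storage engine that can carry a SPATIAL index on this server."""
    if parse_version(version_string) >= innodb_since:
        return "InnoDB"
    return "MyISAM"
```

For example, `engine_for_spatial_index("5.6.21-log")` falls back to MyISAM, while a 5.7.9 server would be allowed to use InnoDB.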

Thank You very much. We will be waiting.

A first try would be replacing the kmeans query with this:

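from django.db import connections
from django.contrib.gis.geos import GEOSGeometry

# geom_column_str, geo_table and geos (the cell polygon) are expected to be
# defined by the surrounding code.
cursor = connections["default"].cursor()

# Count the points inside the cell and ask for the boundary of their collection.
query = '''SELECT count(*), Boundary(GeometryCollection(%s))
           FROM (SELECT %s FROM %s WHERE Within(%s, GeomFromText('%s'))) AS hull''' % (
    geom_column_str, geom_column_str, geo_table, geom_column_str, geos.wkt)
cursor.execute(query)

cluster = cursor.fetchone()

count = cluster[0]
geos_cluster = GEOSGeometry(cluster[1])

# Use the centroid of the returned geometry as the marker position.
cluster_coordinates = geos_cluster.centroid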

You should also set the gridSize default to 128 or smaller. (The kmeans algorithm creates cells. As we do not use the real kmeans algorithm, we let MapClusterer.py create the cells. Reducing gridSize increases the number of cells.)
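To make the effect of gridSize concrete, here is a rough sketch of how a viewport could be cut into square cells. It is not the code in MapClusterer.py; the function name, the pixel-based viewport and the 1024x768 size are assumptions made for illustration only.

```python
import math


def grid_cells(width_px, height_px, grid_size):
    """Yield (left, top, right, bottom) pixel boxes that cover the viewport."""
    cols = math.ceil(width_px / grid_size)
    rows = math.ceil(height_px / grid_size)
    for row in range(rows):
        for col in range(cols):
            left, top = col * grid_size, row * grid_size
            # Cells on the right and bottom edge are clipped to the viewport.
            yield (left, top,
                   min(left + grid_size, width_px),
                   min(top + grid_size, height_px))


# Halving the grid size roughly quadruples the number of cells.
cell_counts = {size: len(list(grid_cells(1024, 768, size)))
               for size in (256, 128, 64)}
```

Each cell then becomes one clustering query, so a smaller gridSize gives finer clusters at the cost of more queries per request.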

If you do not have Boundary, try Envelope().

Boundary will be more precise than Envelope. Let me know if you could make it work.
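One way to see why an envelope gives a coarser marker position: the center of a bounding box depends only on the outermost points, while the mean of the points follows where most of them actually are. The plain-Python sketch below models Envelope() only as a minimum bounding rectangle and says nothing about how Boundary() behaves in MySQL; the function names and sample points are made up for illustration.

```python
def mean_point(points):
    """Average of the point coordinates (the centroid of a point set)."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def envelope_center(points):
    """Center of the minimum bounding rectangle around the points."""
    xs, ys = zip(*points)
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


# Three points close together and one outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
marker_from_points = mean_point(points)
marker_from_envelope = envelope_center(points)
```

With a single outlier, the envelope center is pulled to the middle of the box, while the mean stays closer to the dense group of points.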

I just set up a MySQL spatial environment.
Unfortunately, the above does not work because GeometryCollection works differently from what I expected. You will have to use something like this:

SELECT count(*), AsText( mysql.ST_Centroid(coordinates)) FROM geom_table
WHERE coordinates IS NOT NULL AND Within( coordinates , GeomFromText( 'wkt' ) ) 

Furthermore, Boundary doesn't seem to be implemented. I think I found a solution for mysql using https://github.com/krandalf75/MySQL-Spatial-UDF . I am trying to get it running with mysql now.
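A per-cell query of this shape could be wrapped in Python roughly as follows. This is a sketch under stated assumptions, not the code that ended up in anycluster-mysql: it swaps ST_Centroid for a plain average of the X and Y values (which assumes the column holds Point geometries), it uses the pre-8.0 function names X(), Y(), Within() and GeomFromText(), and the helper names are hypothetical. The cell polygon is passed as a bound parameter instead of being formatted into the SQL string.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_cell_query(table, column):
    """Return SQL that counts the points in a cell and averages their coordinates."""
    for name in (table, column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    # Identifiers cannot be bound as parameters, so they are validated and
    # interpolated; the cell polygon WKT stays a bound parameter (%s).
    return (
        "SELECT COUNT(*), AVG(X({col})), AVG(Y({col})) "
        "FROM {tbl} "
        "WHERE {col} IS NOT NULL AND Within({col}, GeomFromText(%s))"
    ).format(col=column, tbl=table)


def cluster_cell(cursor, table, column, cell_wkt):
    """Run the query for one cell; return (count, (x, y)) or (0, None) if empty."""
    cursor.execute(build_cell_query(table, column), [cell_wkt])
    count, x, y = cursor.fetchone()
    if not count:
        return 0, None
    return count, (float(x), float(y))
```

With Django, the cursor would come from `connections["default"].cursor()`, and `cluster_cell` would be called once per grid cell with that cell's polygon as WKT.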

Please continue with issues here:

https://github.com/biodiv/anycluster-mysql

Thank You very much! We will continue this project.