number of database queries
clime opened this issue
There is a significant performance hit from issuing a SELECT for each cell in gridCluster and kmeansCluster. Have you thought about reducing it to just one (or a few) queries? I am not completely sure that it is possible, but I feel it should be, and it would improve performance greatly (especially if you have lots of cells). I am looking for a way to do it, but I would like to hear what you think first.
For the kmeans method, this should be possible and is an interesting idea. One would have to calculate the number of visible cells and then set the total number of clusters to k*cellcount. After that, only one SELECT would be needed, targeting the current (grid) bounds instead of each grid cell. Furthermore, this would reduce the number of times the distance cluster has to be run. I will give that a try. Thank you for this input.
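A rough sketch of the k*cellcount idea, in Python. It is only an illustration under assumptions: the function names, the square cell size, and the table and column names (points, geom) in the query string are hypothetical and not taken from the project. ST_MakeEnvelope and the && bounding-box operator are standard PostGIS.

```python
import math


def visible_cell_count(min_x, min_y, max_x, max_y, cell_size):
    """Number of square grid cells needed to cover the viewport bounds."""
    cols = max(math.ceil((max_x - min_x) / cell_size), 1)
    rows = max(math.ceil((max_y - min_y) / cell_size), 1)
    return cols * rows


def total_clusters(bounds, cell_size, k_per_cell):
    """Total k for one kmeans run over the whole viewport: k * cellcount."""
    return k_per_cell * visible_cell_count(*bounds, cell_size)


# One query for the whole viewport instead of one per cell.
# Hypothetical table "points" with geometry column "geom" (SRID 4326 assumed).
SINGLE_BOUNDS_QUERY = """
    SELECT id, geom
    FROM points
    WHERE geom && ST_MakeEnvelope(%s, %s, %s, %s, 4326)
"""

bounds = (0, 0, 512, 256)          # min_x, min_y, max_x, max_y
k = total_clusters(bounds, 256, 3)  # 2 cells * 3 clusters per cell
```

The rows returned by the single query would then be passed to one kmeans run with k set to the computed total, rather than running kmeans once per cell.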
For gridCluster, I currently don't know how the number of SELECT queries could be reduced, but that does not mean it is not possible. If you (or anyone else) know a solution, it would be highly appreciated.
I might have found a way to query the database only once, using a grid calculated by a PostGIS function:
http://gis.stackexchange.com/questions/16374/how-to-create-a-regular-polygon-grid-in-postgis
Hopefully I will find the time to test this.
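To illustrate what a single grid pass could look like, here is a small in-memory sketch in Python. It is not the PostGIS function from the linked answer; it only mimics the idea of assigning every point to a grid cell and aggregating per cell in one pass, which in SQL would roughly correspond to one grouped query over the grid. The function name and the result layout are assumptions made for this example.

```python
import math
from collections import defaultdict


def grid_cluster(points, origin_x, origin_y, cell_size):
    """Group points by grid cell in a single pass.

    Returns a dict mapping (col, row) to the point count and the
    mean position of the points in that cell.
    """
    cells = defaultdict(lambda: [0, 0.0, 0.0])  # count, sum_x, sum_y
    for x, y in points:
        key = (
            math.floor((x - origin_x) / cell_size),
            math.floor((y - origin_y) / cell_size),
        )
        acc = cells[key]
        acc[0] += 1
        acc[1] += x
        acc[2] += y
    return {
        key: {"count": n, "center": (sx / n, sy / n)}
        for key, (n, sx, sy) in cells.items()
    }


clusters = grid_cluster([(1, 1), (3, 3), (11, 2)], 0, 0, 10)
```

Here the first two points fall into cell (0, 0) and the third into cell (1, 0), so one pass yields two clusters without a separate lookup per cell.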
query amount reduced using temporary tables
Good job. I can't test it because I am traveling, but good job.