Sometimes geo points need to be grouped (clustered). In addition, their coordinates often arrive as plain text and in various formats.
The application offers:
- text parsing to find pairs of geo coordinates in several typical formats
- automated clustering of geo points
- presentation of clusters of points and outliers on the map and in tabular form, with an estimate of the geometric size of each cluster
- operation as a web server, either on a local computer or in a Docker container
- the Docker container can be run on external hosting, for example Heroku
The application interface consists of five tabs:
- Параметри (Parameters): setting the input parameters and data
- Мапа (Map): presentation of the result on the map
- Текст (Text): presentation of the result in text and tabular forms
- Допомога (Help): the explanatory text you are reading now
- Приклад (Example): text containing a set of geo points for testing the application
Input parameters and data consist of three groups: Розборка тексту на координати (Parsing), Кластеризація точок (Clustering) and Відображення (Rendering).
Text parsing accepts the following representations of geo coordinates for automated processing:
Id | Description | Example |
---|---|---|
0 | Decimal point and comma between values | qwerty (44.6857, 33.56173); (44.68599 ,33.56555); ... |
1 | Decimal comma and slash between values | Lorem ipsum 46,7369/ 32,8103 adsc 46,7373 /32,812 ei ... |
2 | Decimal point and space between values | АБВГД 48.01388 38.796 07:59 АГД 48.0141 38.9056 07:53 ... |
Also, the order of the coordinates may differ between sources, namely: [Широта (Latitude), Довгота (Longitude)] or [Довгота, Широта].
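The three formats and the coordinate order can be illustrated with a minimal parsing sketch. The regular expressions, the `parse_points` function, and its `fmt` and `lat_first` parameters are assumptions made for illustration; the application's actual parser may work differently.

```python
import re

# Hypothetical building blocks: a number with a decimal point or a decimal comma.
# The lookbehind keeps a match from starting in the middle of a longer number.
NUM_DOT = r"(?<![\d.])(-?\d{1,3}\.\d+)"
NUM_COMMA = r"(?<![\d,])(-?\d{1,3},\d+)"

# One pattern per format Id from the table above.
PATTERNS = {
    0: re.compile(NUM_DOT + r"\s*,\s*" + NUM_DOT),      # 44.6857, 33.56173
    1: re.compile(NUM_COMMA + r"\s*/\s*" + NUM_COMMA),  # 46,7369/ 32,8103
    2: re.compile(NUM_DOT + r"\s+" + NUM_DOT),          # 48.01388 38.796
}


def parse_points(text, fmt=0, lat_first=True):
    """Return a list of (lat, lon) tuples found in text for the given format Id."""
    points = []
    for a, b in PATTERNS[fmt].findall(text):
        first = float(a.replace(",", "."))
        second = float(b.replace(",", "."))
        # Swap the values when the source lists longitude before latitude.
        lat, lon = (first, second) if lat_first else (second, first)
        # Drop pairs that cannot be valid geographic coordinates.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            points.append((lat, lon))
    return points


if __name__ == "__main__":
    print(parse_points("qwerty (44.6857, 33.56173); (44.68599 ,33.56555);", fmt=0))
    print(parse_points("Lorem ipsum 46,7369/ 32,8103 adsc 46,7373 /32,812 ei", fmt=1))
```

Times such as 07:59 in the third example are ignored because they contain neither a decimal point nor a decimal comma.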
Points are clustered using the DBSCAN method, which has two main, intuitively clear parameters:
- epsilon: the neighborhood distance, i.e. the radius (m) within which points are considered neighbors
- min_samples: the minimum number of points to form a cluster
Points that do not fall into any cluster form a separate group. These points are called outliers (i.e. outside clusters).
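To show how epsilon and min_samples interact, here is a small pure-Python sketch of DBSCAN over great-circle distances. It is not the application's code, which presumably relies on a library implementation; the function names, the haversine distance, and the convention that a point counts as its own neighbor are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dbscan(points, epsilon, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for outliers."""
    n = len(points)
    # Assumption: the neighborhood of a point includes the point itself.
    neighbors = [
        [j for j in range(n) if haversine_m(points[i], points[j]) <= epsilon]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional outlier; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


if __name__ == "__main__":
    pts = [(44.6857, 33.5617), (44.6860, 33.5620), (48.0139, 38.7960), (50.0, 30.0)]
    print(dbscan(pts, epsilon=500, min_samples=2))
```

This version computes all pairwise distances, so it only suits small point sets; a library implementation with a spatial index is the better choice for larger inputs.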
To estimate the geometric size of a cluster, the application calculates the diameter of the circle around the cluster's central point that contains the specified share of its points (in %).
You can also set the starting scale of the map display here.
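The size estimate described above could be computed roughly as follows. This is a sketch under assumptions: the central point is taken as the arithmetic mean of the coordinates, distances are great-circle distances, and `cluster_diameter_m` and `share_percent` are hypothetical names.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cluster_diameter_m(points, share_percent=90.0):
    """Diameter (m) of the circle around the centroid that holds share_percent of points."""
    # Assumption: the central point is the mean latitude and mean longitude.
    lat_c = sum(p[0] for p in points) / len(points)
    lon_c = sum(p[1] for p in points) / len(points)
    distances = sorted(haversine_m((lat_c, lon_c), p) for p in points)
    # Smallest number of points that makes up at least the requested share.
    k = max(1, math.ceil(len(points) * share_percent / 100.0))
    return 2 * distances[k - 1]


if __name__ == "__main__":
    cluster = [(44.6857, 33.5617), (44.6860, 33.5620), (44.6855, 33.5615)]
    print(round(cluster_diameter_m(cluster, share_percent=90.0), 1))
```

Averaging latitudes and longitudes is adequate for compact clusters but would fail for clusters that straddle the 180° meridian.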
- test the clustering algorithm for critical situations (runtime errors), add stubs and error messages, ensure the stability of the application, handle messages about critical situations centrally and show them in a pop-up window (pywebio toast or popup)
- make parsing more sophisticated: extract the description of the point from the text (left, right, target in a separate line)
- deploy a Docker container on external hosting
- install the Heroku CLI on the computer, log in, and enter the following command from the CLI, where cluster-b is the app name:
  heroku stack:set container -a cluster-b
- Qingkai's Blog: Clustering with DBSCAN
- Folium and Geopandas: FeatureGroup for categorial data | Florian Neukirchen
- Heroku + Docker in 10 Minutes. | by Kay Jan Wong | Towards Data Science
- How to Build and Deploy a Container in Heroku Using a heroku.yml File - YouTube
- How to Deploy a Python Flask App on Heroku Using Docker - DEV Community