wondolee / ahah

GPU accelerated routing with RAPIDS

This repository contains the Python code used to find distances from each postcode within Great Britain to various health-related points of interest, as well as the code used to map COVID-19 vaccine accessibility.

Mapping Inequalities in COVID-19 Vaccine Accessibility

This analysis considers drive time as an alternative to the Euclidean distance used by the NHS to assess accessibility. While Euclidean distance gives some indication of access, it accounts for neither road network distance nor travel time, both of which may be considerably greater depending on location. Rural areas, for example, often have narrow, winding roads, where travel is not comparable to covering the equivalent distance by motorway.

Access here is defined as the average time-weighted road network distance from each postcode within an MSOA to the nearest vaccination site. The road highways network and road speed estimates provided by Ordnance Survey were used, alongside the ONS Postcode Directory for May 2020, which gives centroids for every postcode in the country. Vaccination site locations were obtained from NHS England for the 26th of March 2021.
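As a rough illustration of this definition, the sketch below averages per-postcode drive times within each MSOA. The postcodes, MSOA codes, and times are invented for the example, and the function name `msoa_access` is hypothetical rather than part of this repository.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (postcode, MSOA code, minutes to nearest vaccination site).
postcode_times = [
    ("AB1 2CD", "E02000001", 4.0),
    ("AB1 2CE", "E02000001", 6.0),
    ("XY9 8ZZ", "E02000002", 15.0),
]


def msoa_access(records):
    """Average the per-postcode travel time within each MSOA."""
    by_msoa = defaultdict(list)
    for _postcode, msoa, minutes in records:
        by_msoa[msoa].append(minutes)
    return {msoa: mean(times) for msoa, times in by_msoa.items()}


access = msoa_access(postcode_times)
```

Here `access` maps each MSOA code to the mean travel time of its postcodes, which is the accessibility measure described above.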

This is a computationally intensive calculation: the road network used has 5,062,741 edges and 4,289,045 nodes. The single-source shortest path algorithm was used to determine the time-weighted network distance from all 1,463,696 postcodes in England to their nearest vaccination site.

This calculation was made possible through the GPU accelerated Python library cugraph, part of the NVIDIA RAPIDS ecosystem (https://rapids.ai), allowing the computation to be highly parallel, taking minutes, rather than days.

Routing Workflow

OS Highways

  • Speed estimates given to each road, based on formOfway and routeHierarchy
  • Time-weighted distance calculated using length of edge and speed estimate
  • Node ID converted to sequential integers and saved with edges as parquet files

Process Data ahah/process_data.py

This stage prepares the node, postcode, and POI data for use in RAPIDS cugraph, using utility functions to assist with data preparation from the raw data sources.

  • Clean raw data
  • Find the nearest road node to each postcode and point of interest using a GPU accelerated k-nearest neighbours (KNN) search
  • Determine minimum buffer distance to use for each point of interest
    • Distances returned for the nearest 10 points of interest to each postcode using KNN
    • For each unique POI, the maximum distance to its associated postcodes is taken and saved as the buffer for this POI
    • Each POI is assigned the postcodes that fall within its KNN, used to determine buffer suitability when converted to a graph
  • All processed data written to respective files
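The nearest-node and buffer steps can be sketched on the CPU as below. This is a brute-force stand-in written for illustration, not the GPU accelerated search used in the repository; the coordinates are invented, the helper `nearest` is hypothetical, and `K` is set to 1 only because the toy data has two POIs (the list above uses 10).

```python
import math

# Hypothetical coordinates in arbitrary planar units.
road_nodes = {0: (0.0, 0.0), 1: (10.0, 0.0), 2: (20.0, 0.0)}
postcodes = {"P1": (1.0, 0.0), "P2": (9.0, 0.0), "P3": (19.0, 1.0)}
pois = {"A": (0.0, 1.0), "B": (20.0, 1.0)}


def nearest(point, candidates, k=1):
    """Return the k closest (key, distance) pairs by straight-line distance."""
    ranked = sorted((math.dist(point, xy), key) for key, xy in candidates.items())
    return [(key, dist) for dist, key in ranked[:k]]


# Snap each postcode to its nearest road node.
postcode_node = {pc: nearest(xy, road_nodes)[0][0] for pc, xy in postcodes.items()}

K = 1  # assumed for the toy data
buffers = {}        # POI -> largest distance to any postcode associated with it
poi_postcodes = {}  # POI -> postcodes that list it among their K nearest POIs
for pc, xy in postcodes.items():
    for poi, dist in nearest(xy, pois, k=K):
        buffers[poi] = max(buffers.get(poi, 0.0), dist)
        poi_postcodes.setdefault(poi, []).append(pc)
```

Taking the maximum distance per POI means the buffer is just large enough to cover every postcode that might plausibly be served by that POI, which keeps each routing subgraph small.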

Routing ahah/routing.py

The routing stage of this project primarily makes use of the RAPIDS cugraph library. This stage iterates sequentially over each POI of a certain type and finds routes to every postcode within a certain buffer.

  • Iterate over POI of a certain type
  • Create cugraph.Graph() with subset of road nodes
    • Use cuspatial.points_in_spatial_window with buffer to obtain subset
  • Run single-source shortest path from POI to each node in the sub graph
    • cugraph.sssp takes edge weights into account, which in this case are the time-weighted distances of each connection between nodes, derived from the OS Highways data
  • SSSP distances are subset to return only nodes associated with postcodes. These distances are added iteratively to a complete dataframe of postcodes, from which the smallest value for each postcode is taken
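The loop above can be illustrated with a small CPU-only sketch. It uses a standard-library Dijkstra implementation as a stand-in for cugraph.sssp and a simple square window as a stand-in for the spatial subsetting; the toy network, buffers, and helper names are assumptions, and edges are treated as undirected for simplicity.

```python
import heapq
import math

# Hypothetical toy network: (source node, target node, minutes).
edges = [(0, 1, 2.0), (1, 2, 2.0), (2, 3, 5.0), (0, 3, 10.0)]
node_xy = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
postcode_node = {"P1": 1, "P2": 3}  # postcode -> nearest road node
pois = {"A": {"node": 0, "buffer": 5.0}, "B": {"node": 2, "buffer": 1.5}}


def subgraph(edges, node_xy, centre, buffer):
    """Keep edges whose endpoints both fall in a square window around centre."""
    keep = {
        n for n, (x, y) in node_xy.items()
        if abs(x - centre[0]) <= buffer and abs(y - centre[1]) <= buffer
    }
    adjacency = {}
    for u, v, w in edges:
        if u in keep and v in keep:
            adjacency.setdefault(u, []).append((v, w))
            adjacency.setdefault(v, []).append((u, w))
    return adjacency


def sssp(adjacency, source):
    """Dijkstra single-source shortest path over non-negative weights."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, math.inf):
            continue  # stale queue entry
        for nxt, w in adjacency.get(node, ()):
            nd = d + w
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                heapq.heappush(queue, (nd, nxt))
    return dist


# Keep the smallest travel time seen so far for every postcode.
best = {pc: math.inf for pc in postcode_node}
for poi in pois.values():
    adjacency = subgraph(edges, node_xy, node_xy[poi["node"]], poi["buffer"])
    dist = sssp(adjacency, poi["node"])
    for pc, node in postcode_node.items():
        best[pc] = min(best[pc], dist.get(node, math.inf))
```

Running one search per POI, rather than one per postcode, keeps the number of searches proportional to the much smaller POI count, and the buffer keeps each search confined to a local portion of the network.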

Basic documentation may be found at https://cjber.github.io/ahah/ahah

AHAH Data Sources

See https://figshare.com/articles/online_resource/Access_to_Healthy_Assets_and_Hazards_AHAH_-_Updated_version_2017/8295842/1 for v2 information.
