Project-OSRM / osrm-backend

Open Source Routing Machine - C++ backend

Home Page: http://map.project-osrm.org

OSRM contract running for a long time

wieringen opened this issue · comments

I have been running the osrm router for over 1.5 years. Every night a cron job fetches the Geofabrik
osm file for the Netherlands. I filter out the bicycle networks using osmfilter and feed the result to osrm, which uses the bicycle profile.
Today I updated osrm from 5.6.0 to 5.12.0. What I noticed was a large increase in parsing time. Is this normal?
[screenshot: CPU usage graph]

*** Update 20:30 ***
I just killed the process. What normally takes about 2 hours had been running for more than 8 hours and was still going.

What I noticed was a large increase in parsing time

No, that seems weird. Are you using the default bicycle profile or something custom?

The profile I used is slightly modified:
I added this to the properties returned in function setup()

 --weight_name                = 'duration',
 weight_name                   = 'distance',

I added this to the function process_turn(profile, turn):

 -- for distance based routing we don't want to have penalties based on turn angle
 if profile.properties.weight_name == 'distance' then
    turn.weight = 0
 else
    turn.weight = turn.duration
 end

Oh and I'm using the official docker image (osrm/osrm-backend) where I enabled stxxl by creating a file called .stxxl in /opt. Is this still correct?

.stxxl contents:

disk=/var/tmp/stxxl,7G,syscall
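
For context, each `disk=` line in `.stxxl` is a triple: scratch-file path, maximum capacity, and I/O access method. A sketch of a config spreading scratch space over two disks (the second path is hypothetical; this assumes your STXXL build supports multiple `disk=` lines, which stock STXXL documents):

```
disk=/var/tmp/stxxl,7G,syscall
disk=/mnt/scratch/stxxl,10G,syscall
```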

@wieringen: Did you try this? Did this help?

 --weight_name = 'duration',
 weight_name = 'distance',

 if profile.properties.weight_name == 'distance' then
    turn.weight = 0
 else
    turn.weight = turn.duration
 end

Oops, sorry, maybe I wasn't clear. I have always used those modifications in my profile, so I ported them straight away after the upgrade to the v2 profile. Maybe I can try it again with a default profile...

I tried running 5.12.0 again with a default bicycle profile and had the same result... I killed the process after it was clear that it was nowhere near finishing and twice the normal time had elapsed.

[screenshot: CPU usage graph]

I noticed that stxxl is no longer enabled in 5.12.0. How do I enable stxxl when using the osrm docker image? Maybe that will make a difference.

Pass -DENABLE_STXXL=On to CMake during compilation from source.

option(ENABLE_STXXL "Use STXXL library" OFF)

Ahh ok, hmm, that means I will have to fork the osrm-backend project and roll my own osrm docker image. I was hoping to avoid that... but ok, I will try it to see if it makes any difference.

You can use our Dockerfile and always enable stxxl here

https://github.com/Project-OSRM/osrm-backend/blob/master/docker/Dockerfile#L39

for testing and then see how things go.

Something else occurred to me that you could try: use our MLD based toolchain instead of contraction. It seems like you use a distance weight, which is notoriously bad for CH based approaches.

Running:

./osrm-partition file.osrm
./osrm-customize file.osrm
./osrm-routed -a MLD file.osrm

Should be much faster and more memory efficient. The downside is that queries are going to be a little slower (mostly a problem with the /table plugin).

Maybe it would be good if someone explained these tools for users who don't know exactly what they do.
@TheMarex Does this also work instead of using osrm-contract? What are MLD and CH?

@tds4u it is a different algorithm for computing shortest paths. To use it instead of the old toolchain:

osrm-extract data.osm.pbf -p profile.lua
osrm-contract data.osrm
osrm-routed data.osrm

You simply run:

osrm-extract data.osm.pbf -p profile.lua
osrm-partition data.osrm
osrm-customize data.osrm
osrm-routed data.osrm -a MLD
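
The steps above can be wrapped in a small script. This is a dry-run sketch that only echoes each command, so the ordering can be sanity-checked without an extract on disk; filenames are placeholders, and it follows the CH/MLD split described later in the thread, where osrm-customize (not osrm-contract) is the MLD step:

```shell
#!/bin/sh
# Dry-run sketch of the MLD toolchain: echo each command instead of
# executing it. Remove the `run` wrapper to actually execute the steps.
run() { echo "would run: $*"; }

run osrm-extract data.osm.pbf -p profile.lua   # OSM -> .osrm files
run osrm-partition data.osrm                   # build the multi-level partition
run osrm-customize data.osrm                   # compute cell weights for MLD
run osrm-routed data.osrm -a MLD               # serve using the MLD algorithm
```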

Hi there @TheMarex, I have also been running osrm-contract using the car profile for over 16 hours on the Geofabrik north-america-latest map. This does not seem normal, but it is the first time I am using it. I have not altered any of the profiles and am using the latest docker containers. Do you suggest I also try MLD, or might there be another reason for this? The terminal output thus far is below:

docker run -t -v $(pwd):/data osrm/osrm-backend osrm-contract /data/north-america-latest.osrm
[info] Input file: /data/north-america-latest.osrm
[info] Threads: 12
[info] Reading node weights.
[info] Done reading node weights.
[info] Loading edge-expanded graph representation
[info] merged 204576 edges out of 316286086
[info] initializing node priorities... ok.
[info] preprocessing 70189256 (89.5005%) nodes...
[info] . 10% . 20% . 30% . 40% . 50% . 60% .[renumbered] 70% . 80% . 90% 
[info] Getting edges of minimized graph . 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
[info] initializing node priorities... ok.
[info] preprocessing 8161621 (10.4071%) nodes...
[info] [renumbered]. 10% .

@ShaunHoward I also used north-america-latest from Geofabrik (8GB file). I use an m4.4xlarge EC2 instance. It took me about 3 hours (183 min) to do the osrm-contract step. I didn't use docker for that one; I compiled it. Sorry, I don't have an answer for your question.

@xydinesh, @TheMarex I was able to rerun using the new steps via osrm-partition and osrm-customize within about 3 hrs. I am now successfully running the backend with MLD planning.

I tried to build 5.12.0 myself, however it keeps complaining about an incorrect Lua version... I'm going to try and build HEAD, and if that doesn't work I will forget about stxxl and try the alternative algorithm.

Thursday I tried again to update to 5.12.0. This time trying to parse with the MLD approach.

[screenshot: CPU usage graph]

As you can see, no success... Again it took a really long time and eventually I just killed it. I have a strong feeling it is because of stxxl, however building the docker image is not as straightforward as I hoped; it just gives an error and exits. I tried building the docker image for both 5.12.0 and HEAD and got different errors with each.

I'm running the docker image on a CoreOS DigitalOcean droplet. It has 4 vCPUs, 8 GB of RAM and an 80 GB SSD. The Geofabrik osm file I'm using is netherlands-latest.osm.pbf (1022 MB), from which I first extract the bicycle network using osmfilter, which leaves me with a fraction of the original data.

@wieringen memory consumption for netherlands-latest.osm.pbf with the bicycle profiles and the distance weight does not exceed 3G with or without STXXL for 910ee08

Memory consumption in MB for

  • ./osrm-extract -p ../profiles/bicycle.lua netherlands-latest.osm.pbf

  • ./osrm-contract netherlands-latest.osrm without STXXL

  • ./osrm-contract netherlands-latest.osrm with STXXL

  • ./osrm-partition netherlands-latest.osrm

  • ./osrm-customize netherlands-latest.osrm

[memory profile charts omitted]

@wieringen one thing you could try is to skip the filtering step you do first. From the bounds you mention and the measurements @oxidase just confirmed, I would assume that there might just be something tripping up OSRM after the filtering step. If you could confirm whether or not this happens on the unmodified latest extract from Geofabrik, we can at least rule out whether osmfilter is involved.

Ahhh ok! With the filtering I was trying to make it easier for osrm, not harder 😅
I will try it out and let you know! Btw, this is the filter I use:

osmconvert ${OSM_PBF_FILE} -o=${O5M_FILE}
osmfilter ${O5M_FILE} --verbose --drop="building= or source=3dShapes or amenity!=ferry_terminal or route=train or landuse= or leisure= or natural=" --keep-nodes="*" -o=${OSM_BICYCLE_FILE}

I removed the filter, still no luck... I ran it from 9:00 to 15:30 and, because there was no end in sight, I just killed it. My script is now dead simple. I don't know what to try anymore.

#!/bin/sh
echo "Downloading NL osm data"
curl -o ${OSM_PBF_FILE} ${OSM_PBF_URL}

echo "Parsing NL bicycle network"
osrm-extract -p /opt/bicycle.lua ${OSM_PBF_FILE}
osrm-partition ${OSRM_FILE}
osrm-contract ${OSRM_FILE}

@wieringen one thing I notice here is that there could theoretically be some interaction with the partitioning. The step of osrm-partition is only needed if you intend to use osrm-customize instead of osrm-contract. You could also try to remove that step. It seems that @oxidase wasn't able to reproduce the running time locally, but I will also give it a try.

Success! I changed osrm-contract to osrm-customize and now it runs blazingly fast. Seems to finish in 1/4 of the previous time. Now checking if the queries I run are ok.

#!/bin/sh
echo "Downloading NL osm data"
curl -o ${OSM_PBF_FILE} ${OSM_PBF_URL}

echo "Parsing NL bicycle network"
osrm-extract -p /opt/bicycle.lua ${OSM_PBF_FILE}
osrm-partition ${OSRM_FILE}
osrm-customize ${OSRM_FILE}

@wieringen OSRM contains two "acceleration algorithms":

osrm-extract+osrm-contract = CH (Contraction Hierarchies). Slow to pre-process, but queries are very fast on the generated data.

osrm-extract+osrm-partition+osrm-customize = MLD (Multi-Level Dijkstra). Faster to pre-process, but queries are slower than CH (perhaps 2x or 3x slower).

After data processing, the routing results should be identical, except for query speed.

Hi guys, I have the same issue with osrm-contract being very slow with osrm/osrm-backend:v5.18.0, using default profiles for car and bicycle. Just a month ago I processed the same maps with osrm/osrm-backend:v5.5.2 in 3.5 hrs (extract + contract); now extract took 3 hrs and contract has been running for 19 hrs and is still processing. htop shows that the processes run and all 40 cores are used up. I need to stick to CH to have the queries run faster. Is there anything I can do to speed up the process, or is it just the current state of things with the new version? What has changed so drastically? Thanks!

It would be cool to add timestamps to the log entries stating the percentage of processing:

[info] Threads: 40
[info] Reading node weights.
[info] Done reading node weights.
[info] Loading edge-expanded graph representation
[info] merged 226490 edges out of 156748704
[info] initializing node priorities... ok.
[info] preprocessing 39110065 (100%) nodes...
[info] .
 10% 
.
 20% 
.
 30% 
.
 40% 
.
 50% 
.
 60% 

@TattiQ The percentage output is deliberately done on newlines (as seen in your example output) so that you can do something like unbuffer osrm-contract filename.osrm | ts to prepend timestamps in a format of your choosing using a separate tool.
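
If `ts` (from moreutils) isn't available, a plain awk filter can prepend wall-clock timestamps the same way; the `printf` below is a stand-in for real `unbuffer osrm-contract` output:

```shell
# Prepend HH:MM:SS to each line of a stream, like `ts` from moreutils.
# The printf simulates osrm-contract's newline-separated progress output.
printf '. 10%%\n. 20%%\n. 30%%\n' | awk '{
    cmd = "date +%H:%M:%S"   # shell out for the current time
    cmd | getline t
    close(cmd)
    print t, $0
}'
```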

oh thanks @danpat, got it! btw my osrm-contract has been running for 44 hrs now, I am not giving up...

One new feature introduced after 5.5.2 that you could disable is the excludes in the Lua profile - if you are using the default car.lua file, then osrm-contract will run for a lot longer than if you eliminate the excludes.

@danpat what should I do to reap this benefit of disabling the excludes? I need CH -- I'm matching millions of points in batch, and I'd rather not have that batch run 2 to 3 times slower.

Is this build parallelizable? I have access to a cluster.

@retorquere osrm-contract is highly parallel at the CPU level, but there's currently no way to split the data across multiple memory spaces and process in chunks - the CH algorithm as written assumes the entire graph is accessible. Lots of CPUs on a single box will go faster if you can achieve that.

To remove the unnecessary excludes, modify car.lua and change the excludable sequence to be empty:

excludable = Sequence {
}

This will speed up the running of osrm-contract - but otherwise, it has no effect on actual query time.
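
For what it's worth, the edit can be scripted. The excerpt below is a hypothetical stand-in for the relevant lines of car.lua (the real file is much larger, and the four-space indent is an assumption about its formatting):

```shell
# Create a stand-in excerpt of car.lua's excludable block.
cat > /tmp/car_excerpt.lua <<'EOF'
    excludable = Sequence {
        Set {'toll'},
        Set {'motorway'},
        Set {'ferry'}
    },
EOF

# Print the excerpt with the sequence emptied out: keep the opening line,
# emit an empty closer immediately, and drop the original entries.
awk '
    /excludable = Sequence/ { print; print "    },"; skip = 1; next }
    skip && /^    },/       { skip = 0; next }
    skip                    { next }
                            { print }
' /tmp/car_excerpt.lua
```

Applied to the stand-in above, this prints just the opening `excludable = Sequence {` line followed by an empty `},` closer.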

The cluster I can run this on has 16 cores per node, so even a single node would be better than my local system. Thanks for the tip on the excludes.

Can the build be interrupted and restarted? I can get access to a high-performance node but I can only get it for short bursts.

@retorquere No, it's not designed to be interruptible unfortunately.

An alternative if you're struggling would be to use the MLD algorithm (via osrm-partition/osrm-customize instead of osrm-contract). Pre-processing is a lot faster, but routing is not quite as quick.

@danpat my car.lua does not have any excludable clauses in it, so I guess this is eliminated. Anything else I could do to speed up the process?

how does merging maps into one file affect the CH processing? I mean if we take the map of Sweden and the map of Greece and a bunch of other countries separately and do

osmconvert {{ country }}.osm.pbf  -o={{ country }}.o5m
osmconvert *.o5m -o=allmaps.osm.pbf

and then perform extract and contract on the allmaps.osm.pbf file. Would that be slower than just processing a map of e.g. Europe? I noticed that processing maps per country is much faster.

Hello, I'm running osrm-contract on the midwest.pbf data on a DigitalOcean VM that has 32 cores and 64 GB. This is my first time doing this. I'm using docker. It's been running now for 70 hrs... My CPU has been running at max the whole time. Is this expected? Here's my output.

Post update... So the final time, for the record, was 79 hrs for the contract process to complete.

@TattiQ - Did your attempt ever complete after 44 hours?

[root@temp-orsrm-midwest osrm_midwest]# docker run -t -v $(pwd):/data osrm/osrm-backend:v5.16.4 osrm-contract /data/us-midwest-latest.osrm
[info] Input file: /data/us-midwest-latest.osrm
[info] Threads: 32
[info] Reading node weights.
[info] Done reading node weights.
[info] Loading edge-expanded graph representation
[info] merged 27008 edges out of 88297558
[info] initializing node priorities... ok.
[info] preprocessing 19681229 (89.6314%) nodes...
[info] . 10% . 20% . 30% . 40% . 50% . 60% .[renumbered] 70% . 80% .
[info] Getting edges of minimized graph . 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
[info] initializing node priorities... ok.
[info] preprocessing 2271187 (10.3433%) nodes...
[info] [renumbered]. 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
[info] Getting edges of minimized graph . 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
[info] initializing node priorities... ok.
[info] preprocessing 2271187 (10.3433%) nodes...
[info] [renumbered]. 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
[info] Getting edges of minimized graph . 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
[info] initializing node priorities... ok.
[info] preprocessing 2271187 (10.3433%) nodes...
[info] [renumbered]. 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
[info] Getting edges of minimized graph . 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
[info] initializing node priorities... ok.
[info] preprocessing 2271187 (10.3433%) nodes...
[info] [renumbered]. 10% . 20% . 30% . 40% . 50%

@abeck87 Your run is making progress. Just keep waiting, it'll complete eventually.

We know that the Alpine Linux-based docker images have some performance problems. The musl C library that Alpine is based on has a not-so-great malloc() implementation, which can slow some things down significantly. I've been considering re-basing our Docker images on Ubuntu so that we're using GNU libc, but I haven't got around to it.

There's nothing specifically actionable on this ticket - osrm-contract does take a long time to run.