Refactoring: Introduce a FSM for OSM file parsing
lukasalexanderweber opened this issue · comments
Through #2907 I remembered a thing I noticed during the implementation of via way turn restrictions, namely the complexity and coupling of WaySegmentParser and OSMReader. Having a short glance on the code today I thought it might be worth bringing up my idea on a refactoring. However I only want to start implementing this if you're interested.
From #2673:
All that being said, I'm not strictly against changing anything about WaySegmentParser and OSMReader and there absolutely is room for improvement. Especially OSMReader is still a somewhat inflexible huge monolith of many different things. The builder in WaySegmentParser is also not ideal, which is something I wanted to fix before, but either forgot about it or ran into some problem, I can have a look again. So feel free to open a PR for your refactorings that we can discuss. Shall we do this first? I can try to re-arrange the code in the way that I think would be appropriate for the new turn restriction handling in the meantime.
I created the refactoring branch but never did a PR since we were too busy with the TR.
The idea is build around a FSM represented by a state-transition table implemented in OSMParser
With this Parser we can parse an OSM file as often as we want, and can specify what exactly should be done with a Way, Relation etc. - If no handler for a specific element is specified it just skips those.
This gives us a nice mental model on the steps we do to create the graph:
Parse1 get Header info and preprocess Ways and Relations
OSMParser parse1 = new OSMParser()
.setFileheaderHandler(new FileHeaderHandler())
.setWayHandler(getWayPreprocessor())
.setRelationHandler(getRelationPreprocessor());
parse1.readOSM(osmFile, workerThreads);
Parse2 process Nodes, Ways and Relations
OSMParser parse2 = new OSMParser()
.setNodeHandler(getNodeHandler())
.setWayHandler(getWayHandler())
.setRelationHandler(getRelationHandler());
parse2.readOSM(osmFile, workerThreads);
of course there is room for improvement but I thought it would be sad if my thought are lost without discussing them again :)