Less memory-consuming xml parsing

Question

dbuse opened this issue 9 years ago · comments

Currently the whole XML result is first parsed into an xml.etree.ElementTree and then processed to create overpy structures. While this is perfectly fine for small amounts of data, larger files or requests consume a lot of memory that is not freed after the overpy result is constructed.

A SAX-style parser could reduce the memory footprint and both overpy's architecture and osm_xml's structure would easily support such a parser.
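
To illustrate the idea, here is a minimal sketch of a SAX-style handler built on the standard library's `xml.sax`. It is not overpy's implementation: the `NodeCollector` class, the `parse_nodes` helper, the dict layout, and the sample document are all hypothetical, and only `<node>` elements with their `<tag>` children are handled.

```python
import io
import xml.sax


class NodeCollector(xml.sax.ContentHandler):
    """Hypothetical handler: collects OSM nodes as plain dicts."""

    def __init__(self):
        super().__init__()
        self.nodes = []
        self._current = None  # node currently being read, if any

    def startElement(self, name, attrs):
        if name == "node":
            self._current = {
                "id": int(attrs["id"]),
                "lat": float(attrs["lat"]),
                "lon": float(attrs["lon"]),
                "tags": {},
            }
        elif name == "tag" and self._current is not None:
            # <tag> elements nested in a node carry its key/value pairs
            self._current["tags"][attrs["k"]] = attrs["v"]

    def endElement(self, name):
        if name == "node" and self._current is not None:
            self.nodes.append(self._current)
            self._current = None


def parse_nodes(source):
    """Stream-parse `source` (a file name or binary stream) without building a tree."""
    handler = NodeCollector()
    xml.sax.parse(source, handler)
    return handler.nodes


# Made-up sample data in OSM XML layout
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <node id="1" lat="50.0" lon="7.0">
    <tag k="name" v="A"/>
  </node>
  <node id="2" lat="51.5" lon="8.25"/>
</osm>
"""

nodes = parse_nodes(io.BytesIO(SAMPLE))
```

Only the node currently being read is held in parser state, so no intermediate element tree is kept alongside the final result.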

PhiBo · Answer 1 · Mon May 04 2015 05:08:27 GMT+0800 (China Standard Time)

Looks like some people have to work with very large datasets. So using the SAX parser might be a good idea. I have scheduled this feature for the next version.

domlysz · Answer 2 · Sat Jul 16 2016 22:49:29 GMT+0800 (China Standard Time)

It's also possible to use the iterparse function of the ElementTree module: its memory footprint and run speed are similar to those of a SAX parser, but it's less verbose.

domlysz@a18ae32?diff=unified

It might be a good choice if you want to maintain only one parser. I can make a pull request if you're interested.
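
For comparison, a minimal sketch of the iterparse approach using `xml.etree.ElementTree.iterparse`. This is not the code from the linked commit: the `iter_nodes` generator, the dict layout, and the sample document are assumptions made for illustration, and only `<node>` elements are handled.

```python
import io
import xml.etree.ElementTree as ET


def iter_nodes(source):
    """Hypothetical generator: yield OSM nodes one at a time from `source`."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "node":
            yield {
                "id": int(elem.get("id")),
                "lat": float(elem.get("lat")),
                "lon": float(elem.get("lon")),
                "tags": {t.get("k"): t.get("v") for t in elem.findall("tag")},
            }
            # Drop already processed children so the tree does not keep growing
            root.clear()


# Made-up sample data in OSM XML layout
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <node id="1" lat="50.0" lon="7.0">
    <tag k="name" v="A"/>
  </node>
  <node id="2" lat="51.5" lon="8.25"/>
</osm>
"""

result = list(iter_nodes(io.BytesIO(SAMPLE)))
```

The memory saving depends on the `root.clear()` call; without it, iterparse still builds the complete tree. A fuller version would also clear after ways and relations, which this sketch ignores.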