PAHdb / pyPAHdb

A Python tool to decompose astronomical PAH emission into contributing PAH subclasses.

Home Page: https://www.astrochemistry.org/pahdb/


Add code profiling to identify matrix operation bottlenecks

mattjshannon opened this issue

It would be useful to see whether we can save some time in the matrix operations. Apparently the cProfile module (part of the Python standard library) is the recommended tool, though I will look into this.
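As a starting point, here is a minimal sketch of profiling a single call in-process with cProfile and pstats. The `profile_call` helper and the pure-Python `matmul` function are hypothetical stand-ins for illustration only; they are not part of pyPAHdb, and the real matrix operations would be substituted in.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report).

    Hypothetical helper for illustration; not part of pyPAHdb.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Collect the report in a string instead of printing to stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)  # ten most expensive entries
    return result, stream.getvalue()


def matmul(a, b):
    # Stand-in for the matrix operations of interest (assumption).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


if __name__ == "__main__":
    n = 60
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    _, report = profile_call(matmul, a, a)
    print(report)
```

Sorting by cumulative time tends to surface the high-level call that dominates the run, which is usually the first thing to check before looking at individual operations.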

When running example_fits.py, the performance hit is really due to writing out the PDF with matplotlib. Possibly there are some improvements that can be made there? On my system, example_fits.py takes 1:36 when writing the PDF and 4 seconds when not writing the PDF file ...

There appear to be several options for speeding up the PDF production. There are several alternative plotting packages, but they all come with additional requirements, e.g., Qt. On the other hand, several people suggest reusing axes, etc., to speed up Matplotlib.
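To make the axes-reuse suggestion concrete, here is a rough sketch, assuming a multi-page PDF with one plot per page. The `write_pdf` function and its arguments are hypothetical and do not reflect how example_fits.py is actually structured. The idea is to create the figure, axes and line once, then only update the data for each page instead of building a new figure every time. Whether this helps in our case would need to be confirmed with the profiler.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no GUI toolkit needed
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages


def write_pdf(path, x, spectra):
    """Write one page per spectrum, reusing a single figure, axes and line.

    Hypothetical helper for illustration; not part of pyPAHdb.
    """
    fig, ax = plt.subplots()
    (line,) = ax.plot([], [])  # created once, updated for every page
    ax.set_xlabel("x")
    ax.set_ylabel("y")

    with PdfPages(path) as pdf:
        for i, y in enumerate(spectra):
            line.set_data(x, y)   # swap in the new data
            ax.relim()            # recompute data limits
            ax.autoscale_view()   # rescale the view to the new limits
            ax.set_title(f"spectrum {i}")
            pdf.savefig(fig)      # append the current state as a new page

    plt.close(fig)
    return len(spectra)


if __name__ == "__main__":
    x = [0, 1, 2, 3]
    write_pdf("reuse_axes_demo.pdf", x, [[0, 1, 4, 9], [3, 2, 1, 0]])
```

This keeps Matplotlib as the only plotting dependency, so it avoids pulling in extra requirements such as Qt.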

[update] Might be good to put example output image(s)/plot(s) in the README and usage ...

It looks like snakeviz works quite well for profiling/visualizing the results (https://jiffyclub.github.io/snakeviz/).

It's installable via pip or conda, and you basically just use it thusly...

python -m cProfile -o profiling_results.prof example_fits.py
snakeviz profiling_results.prof   # opens in your browser

Interesting. I've installed the package and run the profiling. Unfortunately I'm not getting the nice graphs; it looks like it doesn't like spaces in path names ... In any case, the bottleneck we have is still Matplotlib. You mentioned before that reusing the axes for each plot could speed up the PDF output. My current thinking is that moving away from Matplotlib might not be such a good idea, given its widespread use in the community ...
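When the snakeviz graphs do not render, the same .prof file can still be inspected as plain text with the standard-library pstats module. The sketch below is only an assumption about how one might do that; `summarize_profile` is a hypothetical helper, and the file name matches the cProfile command used earlier in this thread.

```python
import io
import pstats


def summarize_profile(prof_path, limit=15):
    """Return a text table of the most expensive calls in a .prof file.

    Hypothetical helper for illustration; not part of pyPAHdb or snakeviz.
    """
    stream = io.StringIO()
    stats = pstats.Stats(prof_path, stream=stream)
    # strip_dirs() shortens file paths so the table stays readable.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


if __name__ == "__main__":
    print(summarize_profile("profiling_results.prof"))
```

The file is read directly from disk, with no browser or web server involved, so this view does not depend on how snakeviz handles the path.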