graspologic-org / graspologic

Python package for graph statistics

Home Page:https://graspologic-org.github.io/graspologic/

[BUG] graspologic takes 33 seconds to import

loftusa opened this issue

Problem

Graspologic is taking an extremely long time to import for me. This is after a fresh pip install --upgrade graspologic. (I also had to run pip install --upgrade numba and pip install --upgrade numpy to get it to import.)

I timed it and it looks like it takes around 33 seconds, and importing it also gives some strange umap numba warning.

Screenshot 2023-05-20 at 9 19 02 PM

Example Code

from time import perf_counter

# Time a cold import of the whole package.
start = perf_counter()
import graspologic
end = perf_counter()

print(f"{end - start:.1f} s")

Full Traceback

/usr/local/lib/python3.9/site-packages/umap_learn-0.5.3-py3.9.egg/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/usr/local/lib/python3.9/site-packages/umap_learn-0.5.3-py3.9.egg/umap/distances.py:1071: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/usr/local/lib/python3.9/site-packages/umap_learn-0.5.3-py3.9.egg/umap/distances.py:1086: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/usr/local/lib/python3.9/site-packages/umap_learn-0.5.3-py3.9.egg/umap/umap_.py:660: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/usr/local/lib/python3.9/site-packages/graspologic/models/edge_swaps.py:215: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  _edge_swap_numba = nb.jit(_edge_swap)
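
If these warnings are only noise while the import time is being investigated, they can be filtered by message with the standard-library warnings module. This is a minimal sketch, not something graspologic or umap provides: `silence_nopython_warning` is a hypothetical helper name, and the pattern is simply the start of the message shown above. It has to run before the import that triggers the warnings.

```python
import warnings


def silence_nopython_warning():
    """Ignore warnings whose message starts with the numba 'nopython' notice.

    Hypothetical helper for illustration; call it before `import graspologic`.
    """
    warnings.filterwarnings(
        "ignore",
        # `message` is a regex matched against the start of the warning text.
        message=r"The 'nopython' keyword argument was not supplied",
    )


silence_nopython_warning()
```

The filter matches on the message text rather than the warning class, so it does not need numba to be importable at the point where it is installed.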

Your Environment

  • Python version: 3.9.13
  • graspologic version: 3.0.0

Additional Details

This is in the graphstatsbook Docker container with 7 CPUs allocated and about 2/3 of my RAM, on a 2022 MacBook Air with an M2 chip.

I don't plan on working on this, but if anyone wants to speed things up, go for it.

I'd also just note that importing a specific function or class is usually pretty quick

I did some light profiling on this with python -X importtime -c 'import graspologic'. Here's what came up:
import_times.txt
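
For anyone who wants to reproduce that kind of profile without reading the raw log, here is a rough sketch that runs the interpreter with -X importtime in a subprocess and lists the imports with the largest cumulative time. `slowest_imports` is a hypothetical helper name, and the example call uses the standard-library `json` module as a stand-in so it runs anywhere; pass "graspologic" to profile the package discussed here.

```python
import subprocess
import sys


def slowest_imports(module, top=10):
    """Return up to `top` (cumulative_us, self_us, name) rows, slowest first."""
    # -X importtime writes one "import time: self | cumulative | name" line
    # per imported module to stderr.
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        self_us, cumulative_us, name = (field.strip() for field in fields)
        if not (self_us.isdigit() and cumulative_us.isdigit()):
            continue  # skip the header row
        rows.append((int(cumulative_us), int(self_us), name))
    rows.sort(reverse=True)
    return rows[:top]


# Stand-in target so the sketch runs without graspologic installed.
for cumulative_us, self_us, name in slowest_imports("json", top=5):
    print(f"{cumulative_us / 1e6:8.3f} s  {name}")
```

Cumulative time includes everything a module pulls in, so the top rows are usually the outermost packages; the self time column shows where the work actually happens.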

@bdpedigo I looked at this a bit more just now using tuna.
here's the import profile for graspologic:

Screenshot 2023-12-07 at 9 18 44 AM

the time appears to go mainly to the umap import in graspologic.layouts.auto and the ot import in graspologic.align.seedless_procrustes

that's interesting! and a cool tool/visualization

i'm open to discussing proposed fixes, i just don't really know what could be done here, since those other libraries are out of our control

i can tell you that i don't think we use anything under ot.backend.tensorflow or ot.backend.torch... so if there's some way to turn off those imports perhaps that could be a big save?

i wonder why the load time is so much shorter for tuna than you, though?

no clue, I noticed that too, how long does it take for you?

throw imports inside of functions maybe? makes those functions take "longer" to run, but shorter for anybody who just wants to import the package
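
A minimal sketch of that pattern, assuming nothing about graspologic's actual code: `hypothetical_layout` is an illustrative name, and the standard-library `statistics` module stands in for a heavy dependency such as umap or ot.

```python
def hypothetical_layout(values):
    """Illustrative function; not part of graspologic."""
    # Deferred import: the dependency is loaded when the function is first
    # called, not when the enclosing package is imported.
    import statistics  # stand-in for a heavy dependency such as umap

    return statistics.mean(values)


# Importing the module that defines hypothetical_layout stays cheap;
# the dependency is only loaded by this call.
print(hypothetical_layout([1, 2, 3]))
```

The trade-off is that a missing or broken dependency surfaces at call time rather than at import time.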

PythonOT/POT#516 i wonder to what extent your issue is related to this? what version of POT are you on? it sounds like the root cause is tensorflow, do you have tensorflow installed in this environment?

i guess another question - is there a reason you need to import all of graspologic, if you're saying you don't want some of these functions? might be much faster to just import the function(s) you need

if the imports are moved inside functions, does the imported module stick around? are you paying the cost only the first time? if so, this seems totally reasonable to me, but if you add 33 seconds every time you try to save your graph layout, it's going to be a bit wonky. doesn't mean there won't be other ways to fix it, just that this specific one may not work.
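
On whether the cost recurs: Python caches every imported module in sys.modules, so an import statement inside a function does the full load only the first time it runs in a given process, and later executions are a dictionary lookup. A small sketch to check this, with the standard-library `xml.dom.minidom` as a stand-in for the heavy dependency and `load_stand_in` as a hypothetical name:

```python
import sys
import time


def timed_call(func):
    """Call func() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


def load_stand_in():
    # Deferred import; xml.dom.minidom stands in for a heavy dependency.
    import xml.dom.minidom as heavy

    return heavy


first, t_first = timed_call(load_stand_in)
second, t_second = timed_call(load_stand_in)

# Both calls return the same cached module object from sys.modules.
print(first is second, "xml.dom.minidom" in sys.modules)
print(f"first call: {t_first:.6f} s, second call: {t_second:.6f} s")
```

The cache lives only as long as the process, so a fresh interpreter or notebook kernel pays the load once more on its first call.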