lisphilar / covid19-sir

CovsirPhy: Python library for COVID-19 analysis with phase-dependent SIR-derived ODE models.

Home Page:https://lisphilar.github.io/covid19-sir/

[Bug] importing covsirphy is slow because sklearn is imported on "import covsirphy"

lisphilar opened this issue

Summary

covsirphy.Evaluator uses sklearn.metrics, and importing it takes about 5 seconds. sklearn is used only in Evaluator, so rewriting the metric code with numpy may reduce the import time.

Reproducible example script

time python -c "import covsirphy"
time python -c "import sklearn"
time python -c "import numpy"

The current outputs

$ time python -c "import covsirphy"

real    0m18.736s
user    0m3.424s
sys     0m3.629s

$ time python -c "import sklearn"

real    0m5.610s
user    0m1.097s
sys     0m1.856s

$ time python -c "import numpy"

real    0m0.854s
user    0m0.214s
sys     0m0.643s
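
The same wall-clock measurement can be scripted. The following is a minimal sketch, not part of CovsirPhy: `time_import` is a hypothetical helper name, and the standard-library `json` module is used only so that the example runs anywhere. Replace it with `covsirphy`, `sklearn` or `numpy` to repeat the comparison above.

```python
import subprocess
import sys
import time


def time_import(module_name):
    """Return the wall-clock seconds needed to import a module in a fresh interpreter."""
    start = time.perf_counter()
    # A new interpreter is started each time so that modules cached
    # by the current process do not hide the real import cost.
    subprocess.run([sys.executable, "-c", f"import {module_name}"], check=True)
    return time.perf_counter() - start


# "json" is only a stand-in; the result also includes interpreter start-up time.
elapsed = time_import("json")
print(f"import json: {elapsed:.3f} s")
```

For a per-module breakdown rather than one total, `python -X importtime -c "import covsirphy"` (Python 3.7 or later) prints the time spent on each imported module to stderr.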

Expected outputs

`time python -c "import covsirphy"` takes about 13 seconds (= 18 - 5) in my environment.

Environment

- CovsirPhy version: 3.0.0-dev
- Python version: 3.10.7

Package manager (required if installation issue)

poetry

Platform (required if installation issue)

Ubuntu

Additional Context

No response

import numpy as np


class Evaluator:
    # Excerpt: only the metrics table of the class is shown.
    # Metrics: {name: (function(x1, x2), whether smaller is better or not)}
    # x1 and x2 are NumPy arrays of the same shape.
    _METRICS_DICT = {
        "ME": (lambda x1, x2: np.max(np.abs(x2 - x1)), True),  # maximum absolute error
        "MAE": (lambda x1, x2: np.mean(np.abs(x2 - x1)), True),
        "MSE": (lambda x1, x2: np.mean(np.square(x2 - x1)), True),
        "MSLE": (lambda x1, x2: np.mean(np.square(np.log1p(x2) - np.log1p(x1))), True),
        "MAPE": (lambda x1, x2: np.mean(np.abs((x2 - x1) / x1)) * 100, True),  # in percent
        "RMSE": (lambda x1, x2: np.sqrt(np.mean(np.square(x2 - x1))), True),
        "RMSLE": (lambda x1, x2: np.sqrt(np.mean(np.square(np.log1p(x2) - np.log1p(x1)))), True),
        "R2": (lambda x1, x2: np.corrcoef(x1, x2)[0, 1]**2, False),  # squared Pearson correlation
    }
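
As a quick check of how such NumPy-only metrics behave, here is a small standalone sketch. It copies two of the entries above so that it runs on its own; the input arrays and the names `metrics`, `observed`, `predicted` and `scores` are made up for illustration and are not from CovsirPhy.

```python
import numpy as np

# Two of the NumPy-only metrics, copied here so the example is self-contained.
metrics = {
    "MAE": lambda x1, x2: np.mean(np.abs(x2 - x1)),
    "RMSE": lambda x1, x2: np.sqrt(np.mean(np.square(x2 - x1))),
}

# Made-up values: only the last element differs, by 2.
observed = np.array([1.0, 2.0, 3.0, 4.0])
predicted = np.array([1.0, 2.0, 3.0, 6.0])

scores = {name: float(func(observed, predicted)) for name, func in metrics.items()}
print(scores)  # MAE is 2 / 4 = 0.5, RMSE is sqrt(4 / 4) = 1.0
```

No sklearn import is needed for any of this, which is the point of the rewrite.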

This had no effect on import time, because sklearn is also used in another class, for PCA.

Currently, the `make importtime` command (#1290) shows
image

With #1291,

  • use numpy instead of sklearn.metrics in the Evaluator class
  • import autots and pca inside the methods that use them, because both depend on sklearn and importing them at module level slows down `import covsirphy`

image

In my environment at this time,

  • before: 27 seconds in total
  • after: 11 seconds in total
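
The idea of importing a heavy dependency inside the method that needs it can be sketched as follows. This is an illustration under stated assumptions, not the code from #1291: the `Analyzer` class and its `summarize` method are hypothetical, and the standard-library `statistics` module stands in for a slow-to-import package such as sklearn so that the example runs anywhere.

```python
class Analyzer:
    """Hypothetical class, not part of CovsirPhy."""

    def summarize(self, rows):
        """Return the mean of each row."""
        # Deferred import: the module is loaded on the first call of this
        # method, not when the module that defines Analyzer is imported.
        # "statistics" is only a stand-in for a heavy dependency.
        import statistics

        return [statistics.mean(row) for row in rows]


# Defining or importing Analyzer does not run the import above;
# the cost is paid only by callers that use this method.
result = Analyzer().summarize([[1, 2, 3], [4, 6]])
print(result)
```

The trade-off is that the first call of the method absorbs the import time, and a missing dependency is reported at call time rather than at import time.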