Regression in runtime
brynpickering opened this issue
Problem description
Since around #487, everything has slowed down considerably. This is likely due to some combination of updated dependencies and an inefficient implementation we've introduced.
Pandas v2.1.1 has a known performance regression that is likely hitting us when we do MultiIndex operations (pandas-dev/pandas#55256), so we should probably pin to <=2.0.3. However, that only explains part of the problem...
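If we do pin, the constraint could look something like this (shown for a hypothetical requirements file; the exact file and lower bound in this repo may differ):

```
pandas <= 2.0.3
```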
5 slowest tests pre #487:
11.82s call calliope/test/test_core_time.py::TestClustering::test_hartigans_rule
9.58s call calliope/test/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc
7.21s call calliope/test/test_example_models.py::TestModelPreproccessing::test_preprocess_time_clustering
6.74s call calliope/test/test_example_models.py::TestUrbanScaleExampleModelSenseChecks::test_urban_example_results_area[solve]
6.43s setup calliope/test/test_backend_pyomo.py::TestMILPConstraints::test_loc_techs_storage_capacity_max_purchase_milp_constraint
5 slowest tests in #487 with pandas=2.0.3:
20.23s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
15.85s call tests/test_core_time.py::TestClustering::test_hartigans_rule
15.41s setup tests/test_io.py::TestIO::test_save_netcdf
15.34s call tests/test_example_models.py::TestModelPreproccessing::test_preprocess_time_clustering
14.74s call tests/test_example_models.py::TestNationalScaleExampleModelSenseChecks::test_nationalscale_example_results_glpk
5 slowest tests post #487 with pandas=2.0.3:
==================================================================== slowest 5 durations =====================================================================
33.98s call tests/test_cli.py::TestCLI::test_run_from_netcdf
31.62s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
29.08s call tests/test_core_time.py::TestClustering::test_hartigans_rule
22.66s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[solve]
20.40s call tests/test_backend_latex_backend.py::TestMathDocumentation::test_string_return[build_valid-tex-\n\\documentclass{article}]
5 slowest tests post #487 with pandas=2.1:
41.35s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
28.90s call tests/test_example_models.py::TestModelPreproccessing::test_preprocess_time_clustering
23.05s call tests/test_core_time.py::TestClustering::test_hartigans_rule
20.40s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[solve]
19.58s setup tests/test_io.py::TestIO::test_save_netcdf
Calliope version
v0.7.0-dev
OK, a lot of this is explained by coverage being collected by default. Without coverage:
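Coverage is presumably switched on via pytest's `addopts`; a hedged sketch of what that looks like in `pyproject.toml` (the exact file and options in this repo may differ), with the per-run escape hatch noted:

```toml
[tool.pytest.ini_options]
# Removing "--cov" here, or passing --no-cov on the command line
# (provided by pytest-cov), skips coverage collection for timing runs.
addopts = "--cov"
```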
pandas=2.1.0 (all tests: 1min26)
==================================================================== slowest 5 durations =====================================================================
12.61s setup tests/test_io.py::TestIO::test_save_netcdf
10.87s call tests/test_cli.py::TestCLI::test_run_from_netcdf
10.78s call tests/test_example_models.py::TestUrbanScaleExampleModelSenseChecks::test_urban_example_results_area
10.67s call tests/test_cli.py::TestCLI::test_run_save_lp
10.60s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
pandas=2.0.3 (all tests: 1min19)
==================================================================== slowest 5 durations =====================================================================
15.85s call tests/test_core_time.py::TestClustering::test_hartigans_rule
11.32s call tests/test_cli.py::TestCLI::test_run_from_netcdf
11.05s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
9.99s setup tests/test_io.py::TestIO::test_save_netcdf
9.59s call tests/test_example_models.py::TestUrbanScaleExampleModelSenseChecks::test_urban_example_results_cap
pandas=2.1.1 (all tests: 1min27)
==================================================================== slowest 5 durations =====================================================================
11.78s call tests/test_core_time.py::TestClustering::test_hartigans_rule
10.95s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
10.54s call tests/test_example_models.py::TestModelPreproccessing::test_preprocess_time_clustering
10.02s setup tests/test_io.py::TestIO::test_save_netcdf
9.35s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_storage_inter_cluster
Looks like there is a marginal benefit to sticking with pandas=2.0.3 for now.
Still not perfect. Going back from current main in stages (skipping some commits), it looks like something was introduced in 95ce22d:
In current main (a01d4bb):
==================================================================== slowest 5 durations =====================================================================
26.80s call tests/test_core_time.py::TestClustering::test_hartigans_rule
21.20s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
16.23s call tests/test_example_models.py::TestNationalScaleExampleModelSenseChecks::test_considers_supply_generation_only_in_total_levelised_cost
16.04s call tests/test_example_models.py::TestModelPreproccessing::test_preprocess_time_clustering
13.36s call tests/test_cli.py::TestCLI::test_run_from_netcdf
In 95ce22d:
==================================================================== slowest 5 durations =====================================================================
21.50s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
18.41s call tests/test_core_time.py::TestClustering::test_hartigans_rule
15.89s call tests/test_cli.py::TestCLI::test_run_from_netcdf
12.72s call tests/test_example_models.py::TestModelPreproccessing::test_preprocess_time_clustering
11.75s call tests/test_example_models.py::TestNationalScaleExampleModelSenseChecks::test_nationalscale_example_results_cbc
In bf87c30:
==================================================================== slowest 5 durations =====================================================================
11.19s call tests/test_example_models.py::TestNationalScaleResampledExampleModelSenseChecks::test_nationalscale_example_results_cbc
11.10s call tests/test_cli.py::TestCLI::test_run_from_netcdf
11.09s setup tests/test_io.py::TestIO::test_save_netcdf
10.79s call tests/test_core_time.py::TestClustering::test_hartigans_rule
9.63s call tests/test_example_models.py::TestUrbanScaleExampleModelSenseChecks::test_urban_example_results_area
In 42500c7:
13.43s call tests/test_core_time.py::TestClustering::test_hartigans_rule
10.90s call tests/test_cli.py::TestCLI::test_run_from_yaml
10.68s call tests/test_example_models.py::TestNationalScaleResampledExampleModelSenseChecks::test_nationalscale_example_results_cbc
9.85s call tests/test_cli.py::TestCLI::test_run_from_netcdf
9.74s call tests/test_backend_latex_backend.py::TestMathDocumentation::test_string_return[build_all-tex-\n\\documentclass{article}]
In 5591b09:
==================================================================== slowest 5 durations =====================================================================
15.30s call tests/test_core_time.py::TestClustering::test_hartigans_rule
11.87s setup tests/test_io.py::TestIO::test_save_netcdf
10.25s call tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
10.21s call tests/test_example_models.py::TestModelPreproccessing::test_preprocess_time_clustering
9.67s call tests/test_cli.py::TestCLI::test_run_from_netcdf
Profiling tests/test_example_models.py::TestNationalScaleClusteredExampleModelSenseChecks::test_nationalscale_clustered_example_closest_results_cbc[run]
suggests that pytest's duration reporting is just quite volatile (or that timeseries clustering is; maybe that is the real cause of our pain):
In 95ce22d (should be 21secs according to pytest durations):
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 12.075 12.075 runner.py:111(pytest_runtest_protocol)
17/11 0.000 0.000 12.074 1.098 _hooks.py:479(__call__)
17/11 0.000 0.000 12.074 1.098 _manager.py:106(_hookexec)
17/11 0.000 0.000 12.074 1.098 _callers.py:27(_multicall)
1 0.000 0.000 12.074 12.074 runner.py:119(runtestprotocol)
3 0.000 0.000 12.074 4.025 runner.py:219(call_and_report)
3 0.000 0.000 12.074 4.025 runner.py:247(call_runtest_hook)
3 0.000 0.000 12.074 4.025 runner.py:318(from_call)
3 0.000 0.000 12.074 4.025 runner.py:262(<lambda>)
1 0.000 0.000 12.072 12.072 runner.py:160(pytest_runtest_call)
1 0.000 0.000 12.072 12.072 python.py:1790(runtest)
1 0.000 0.000 12.072 12.072 python.py:187(pytest_pyfunc_call)
1 0.000 0.000 12.072 12.072 test_example_models.py:499(test_nationalscale_clustered_example_closest_results_cbc)
1 0.000 0.000 12.072 12.072 test_example_models.py:477(example_tester_closest)
1 0.000 0.000 12.072 12.072 test_example_models.py:418(model_runner)
1 0.000 0.000 6.806 6.806 model.py:355(build)
1 0.000 0.000 6.464 6.464 model.py:376(_build)
69 0.007 0.000 6.434 0.093 backend_model.py:163(_add_component)
1 0.000 0.000 4.172 4.172 examples.py:31(time_clustering)
1 0.000 0.000 4.172 4.172 model.py:59(__init__)
In bf87c30 (should be <9secs according to pytest durations):
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 15.392 15.392 runner.py:111(pytest_runtest_protocol)
17/11 0.000 0.000 15.392 1.399 _hooks.py:479(__call__)
1 0.000 0.000 15.392 15.392 runner.py:119(runtestprotocol)
17/11 0.000 0.000 15.392 1.399 _manager.py:106(_hookexec)
3 0.000 0.000 15.392 5.131 runner.py:219(call_and_report)
17/11 0.000 0.000 15.392 1.399 _callers.py:27(_multicall)
3 0.000 0.000 15.391 5.130 runner.py:247(call_runtest_hook)
3 0.000 0.000 15.391 5.130 runner.py:318(from_call)
3 0.000 0.000 15.391 5.130 runner.py:262(<lambda>)
1 0.000 0.000 15.390 15.390 runner.py:160(pytest_runtest_call)
1 0.000 0.000 15.390 15.390 python.py:1790(runtest)
1 0.000 0.000 15.390 15.390 python.py:187(pytest_pyfunc_call)
1 0.000 0.000 15.390 15.390 test_example_models.py:499(test_nationalscale_clustered_example_closest_results_cbc)
1 0.000 0.000 15.390 15.390 test_example_models.py:477(example_tester_closest)
1 0.000 0.000 15.390 15.390 test_example_models.py:418(model_runner)
1 0.000 0.000 7.920 7.920 model.py:355(build)
1 0.000 0.000 7.518 7.518 model.py:376(_build)
69 0.006 0.000 7.486 0.108 backend_model.py:163(_add_component)
1 0.000 0.000 6.373 6.373 examples.py:31(time_clustering)
1 0.000 0.000 6.373 6.373 model.py:59(__init__)
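The per-call tables above are cProfile output; a minimal, self-contained sketch of producing one programmatically (the `slow_build` function is a stand-in for an expensive call like `model.build()`, not Calliope code):

```python
import cProfile
import io
import pstats

def slow_build(n):
    """Stand-in for an expensive call such as a model build."""
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
result = slow_build(100_000)
profiler.disable()

# Sort by cumulative time, as in the tables above, and keep the top rows.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumtime").print_stats(5)
print(stream.getvalue())
```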
Fixed the pandas and --cov issues in #499.