pheymann / typedapi

Build your web API on the type level.

Improve exponential implicit search space

pheymann opened this issue

Currently, we see exponential growth in compile times as the API grows. A look at the compiler statistics shows that, as expected, we waste a lot of time during implicit resolution:

API:

:= :> Query[String]("test1") :> Query[String]("test4") :> Query[String]("test2") :> Query[String]("test3") :> Put[Json, User]

Statistics (-Ystatistics:typer):

time spent in implicits       : 276 spans, 1635ms (90.1%)
   successful in scope         : 129 spans, 1159ms (63.9%)
   failed in scope             : 147 spans, 1372ms (75.6%)
   successful of type          : 49 spans, 1554ms (85.6%)
   failed of type              : 98 spans, 1190ms (65.6%)
   assembling parts            : 144 spans, 59ms (3.3%)
   matchesPT                   : 11475 spans, 172ms (9.5%)
time spent in macroExpand     : 216 spans, 1351ms (74.4%)
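
For reference, the numbers above come from scalac's built-in typer statistics. Assuming an sbt build, they can be reproduced with (a minimal sketch):

// build.sbt: enable the typer statistics shown above
scalacOptions += "-Ystatistics:typer"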

The time spent on macro expansion increases significantly with the number of API elements. My best guesses as to why are:

  • Lazy expansion during TypelevelFoldLeft derivation (see the sketch below)
  • repeated Witness derivation (during API type creation and again during RequestDataBuilder/RouteExtractor derivation)
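
For illustration, a stripped-down version of such a Lazy-driven type-level fold could look as follows. This is only a sketch with made-up names (Fold, FoldStep), not typedapi's actual TypelevelFoldLeft, and the per-element FoldStep instances are omitted; the point is that every element of the HList adds another nested, Lazy-wrapped implicit search:

import shapeless._

// illustrative sketch, not typedapi's code: a fold over the API element
// HList whose recursive step is resolved through Lazy
trait Fold[H <: HList, Agg] { type Out }

object Fold {
  type Aux[H <: HList, Agg, Out0] = Fold[H, Agg] { type Out = Out0 }

  implicit def nil[Agg]: Aux[HNil, Agg, Agg] =
    new Fold[HNil, Agg] { type Out = Agg }

  // every element triggers a FoldStep search plus a Lazy-deferred search
  // for the remaining tail, which is where the search space grows
  implicit def cons[E, T <: HList, Agg, StepOut, TailOut](
      implicit step: FoldStep.Aux[E, Agg, StepOut],
               tail: Lazy[Fold.Aux[T, StepOut, TailOut]]
  ): Aux[E :: T, Agg, TailOut] =
    new Fold[E :: T, Agg] { type Out = TailOut }
}

// describes how a single API element updates the accumulated type
trait FoldStep[E, Agg] { type Out }
object FoldStep {
  type Aux[E, Agg, Out0] = FoldStep[E, Agg] { type Out = Out0 }
}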

We also face the problem of multiple passes over the API type, but that should "only" add a constant factor to the compile time.

An indicator that the recursive Lazy resolution during the TypelevelFoldLeft derivation could be the main problem is that Witness derivation during the API-to-type transformation needs considerably less time:

time spent in implicits       : 28 spans, 139ms (27.5%)
   successful in scope         : 0 spans, 0ms (0.0%)
   failed in scope             : 28 spans, 88ms (17.5%)
   successful of type          : 6 spans, 38ms (7.7%)
   failed of type              : 22 spans, 11ms (2.2%)
   assembling parts            : 28 spans, 3ms (0.6%)
   matchesPT                   : 430 spans, 10ms (2.0%)
time spent in macroExpand     : 5 spans, 24ms (4.9%)

To get a complete picture I need to profile the compiler. But before doing that I will start with a low-hanging fruit: fusing the different derivation passes. Instead of separate fold, request-data, etc. passes I will fuse them into a single pass. That way I should get rid of the recursive Lazy and spare myself the repeated iterations.

If the problem still persists, I will have to invest more time and take a deeper look.

I think this problem can be solved by merging the different transformation stages into a single one. That means, instead of building an HList of ApiElements which we then transform into a tuple storing ApiOps and Witnesses, we create the tuple directly. Thus we would get rid of TypeLevelFoldLeft and the corresponding computations.
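
A rough sketch of that direct, single-pass derivation is shown below. The names (DirectDerivation, Collected) and the reduction to plain query keys are made up for illustration; in typedapi the accumulated result would be the ApiOps/Witness tuple mentioned above:

import shapeless._

// illustrative sketch: derive the collected data directly from the element
// HList in one pass, without an intermediate TypeLevelFoldLeft result
final case class Collected[Keys <: HList](keys: List[String])

trait DirectDerivation[H <: HList] {
  type Keys <: HList
  def collected: Collected[Keys]
}

object DirectDerivation {
  type Aux[H <: HList, K <: HList] = DirectDerivation[H] { type Keys = K }

  def apply[H <: HList](implicit d: DirectDerivation[H]): Aux[H, d.Keys] = d

  implicit val nil: Aux[HNil, HNil] =
    new DirectDerivation[HNil] {
      type Keys = HNil
      def collected = Collected[HNil](Nil)
    }

  // a query element (modelled here as a (name, value-type) pair) contributes
  // its key in the same pass; no Lazy, no second traversal of a fold result
  implicit def query[K <: Symbol, V, T <: HList, TK <: HList](
      implicit key: Witness.Aux[K],
               tail: DirectDerivation.Aux[T, TK]
  ): Aux[(K, V) :: T, K :: TK] =
    new DirectDerivation[(K, V) :: T] {
      type Keys = K :: TK
      def collected = Collected[K :: TK](key.value.name :: tail.collected.keys)
    }
}

// e.g. DirectDerivation[(Witness.`'test1`.T, String) :: HNil].collected.keys == List("test1")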

Furthermore, we could look into implicit caching to reduce the number of Witness derivations.
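
One existing tool in this direction is shapeless's cachedImplicit, which materialises a resolved implicit into a stable val so that later uses reference the val instead of re-running the derivation. A minimal sketch (the query name is just an example, and whether this fits typedapi's internals is open):

import shapeless._

object CachedWitnesses {
  // resolve the Witness for the query-name singleton type once and reuse it,
  // instead of re-deriving it at every summon site
  implicit val test1: Witness.Aux[Witness.`'test1`.T] = cachedImplicit
}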

Until this problem is solved, I will mark this project/repository as experimental.