gdalle / DifferentiationInterface.jl

An interface to various automatic differentiation backends in Julia.

Home Page: https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterface

Automatically select a backend

prbzrg opened this issue

It would be great to have an API to automatically select a backend. It could be a function that tries each backend and returns the fastest one that works.
One of its usages could be in https://github.com/SciML/SciMLSensitivity.jl/blob/master/src/concrete_solve.jl

DifferentiationInterfaceTest.jl already provides the necessary utilities for users to compare and benchmark backends.
I think doing it in their place would be a step too far, and extremely costly in terms of runtime. What we can do, however, is define utilities to list the available backends, in order to make benchmarking even simpler.

Selection only happens once, before optimization starts, so for big optimizations the time cost is negligible.
And if that's a bad idea, what about having a function that doesn't try the backends but selects one based on properties like mutation support or the presence of branches?

My reasoning was: it's very costly and it's a two-liner, so we'd better let users do it themselves. However, I guess we could expose an interface of the form:

using DifferentiationInterfaceTest  # provides benchmark_differentiation

function fastest_backend(backends, scenario)
    results = benchmark_differentiation(backends, scenario)
    # keep the backend whose benchmark trial had the lowest runtime
    best_trial = argmin(trial -> trial.time, results)
    return best_trial.backend  # currently doesn't work but only needs minor modifications
end
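
For example, with a handful of candidate backends (the candidate list below is just an illustration, and the scenario construction is elided):

using ADTypes: AutoForwardDiff, AutoReverseDiff, AutoZygote

candidates = [AutoForwardDiff(), AutoReverseDiff(), AutoZygote()]
backend = fastest_backend(candidates, scenario)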

Is that similar to what you had in mind?

As for a heuristic to select backends, I think benchmarking is indeed more reasonable. We do have internal traits to check whether mutation is supported, though; we could expose them.
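
To illustrate what exposing them could enable (the trait name below is made up, it is not DI's actual internal API), selection by property would then just be a filter instead of a benchmark:

# `supports_mutation` is a hypothetical exposed trait, not a real DI function
function property_based_choice(backends)
    candidates = filter(supports_mutation, backends)
    return first(candidates)  # or any other tie-breaking rule
end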

As for listing the available backends, I thought about it some more and it's not obvious what the right method is. I can check whether ForwardDiff.jl is loaded, but then what is my "prototypical" ForwardDiff backend object: how many chunks does it have? Same for ReverseDiff and compiled tape.
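
For the sake of argument, a sketch of such a listing could look like the following, where the default constructors are exactly the arbitrary choice I'm worried about:

using ADTypes

function loaded_backends()
    loaded = Set(nameof(m) for m in values(Base.loaded_modules))
    backends = ADTypes.AbstractADType[]
    :ForwardDiff in loaded && push!(backends, AutoForwardDiff())   # automatic chunk size
    :ReverseDiff in loaded && push!(backends, AutoReverseDiff())   # no compiled tape
    :Zygote in loaded && push!(backends, AutoZygote())
    :Enzyme in loaded && push!(backends, AutoEnzyme())
    return backends
end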

I see how this would be useful for

  • users who want a very high-level interface that abstracts away as much as possible
  • downstream packages that don’t know which functions will be passed to their interface and want to use runtime heuristics for maximum performance (e.g. the SciML ecosystem, as you mentioned)

Both Guillaume and I favor explicitness in DI and try to avoid macros and generated code.
Something I could envision, very close to Guillaume's suggestion, is a thin wrapper around DifferentiationInterfaceTest.jl's test_differentiation that returns a backend according to an optimality criterion (runtime, allocations, ...) selected by the user.

The following would allow you to take the human reading a table of benchmarks out of the loop:

using DifferentiationInterface

backend = autobackend(GradientScenario(f; x=x))  # pick the backend once

for _ in 1:100000
    grad = gradient(f, backend, x)  # then reuse it for every gradient call
    # ...
end

This autobackend function could be called by downstream packages to define default configurations for e.g. subsequent solver calls.

I see I wrote too slowly. 😄

I can check whether ForwardDiff.jl is loaded, but then what is my "prototypical" ForwardDiff backend object: how many chunks does it have? Same for ReverseDiff and compiled tape.

I personally would be ok with such a high-level function being "suboptimal", as long as advanced users can manually request benchmarks of several ForwardDiff backends with different chunk sizes.
In this specific case, we could default to the pickchunksize heuristic used by ForwardDiff.
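
Concretely, that default would be something like this (a sketch, assuming the input length is known up front):

using ForwardDiff, ADTypes

x = rand(100)
chunksize = ForwardDiff.pickchunksize(length(x))  # same heuristic ForwardDiff uses internally
backend = AutoForwardDiff(; chunksize=chunksize)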

Is that similar to what you had in mind?

I'm sure it would be helpful for new AD users, but what I wish for is an AutoAuto() or AutoAuto(list_of_backends) backend that selects a backend at runtime and uses memoization for future calls.
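
Very roughly, and purely as an illustration (none of these names exist anywhere), I imagine something like:

using ADTypes

mutable struct AutoAuto
    candidates::Vector{Any}   # e.g. [AutoForwardDiff(), AutoZygote()]
    chosen::Any               # memoized winner, `nothing` until the first call
end

AutoAuto(candidates::AbstractVector) = AutoAuto(collect(Any, candidates), nothing)

function select!(auto::AutoAuto, f, x)
    if auto.chosen === nothing
        # a real implementation would benchmark every candidate on (f, x);
        # the placeholder below just keeps the first one
        auto.chosen = first(auto.candidates)
    end
    return auto.chosen
end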

I think implementing this would make more sense in downstream packages that take DI as a dependency, since the automatic selection is heavily influenced by the type of problem you have.

Adrian and I are both against magic tricks like memoization, so if we do offer this functionality, it will be a separate choice function, not a backend object with a hidden mechanism. But at the moment it doesn't fit well within our benchmark framework, so I would leave it to downstream users.