gdalle / DifferentiationInterface.jl

An interface to various automatic differentiation backends in Julia.

Home Page: https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterface

Automatically select a backend

prbzrg opened this issue

It would be great to have an API to automatically select a backend. It could be a function that tries each backend and returns the fastest one that works.
One of its usages could be in https://github.com/SciML/SciMLSensitivity.jl/blob/master/src/concrete_solve.jl

DifferentiationInterfaceTest.jl already provides the necessary utilities for users to compare and benchmark backends.
I think doing it in their place would be a step too far, and extremely costly in terms of runtime. What we can do, however, is define utilities to list the available backends, in order to make benchmarking even simpler.

Selection only happens once, before optimization starts, so for big optimizations the time cost is negligible.
And if that's a bad idea, what about having a function that doesn't try the backends but selects one based on properties like mutation support or the presence of branches?

My reasoning was: it's very costly and it's a two-liner, so we'd better let users do it themselves. However, I guess we could expose an interface of the form:

using DifferentiationInterfaceTest  # provides benchmark_differentiation

function fastest_backend(backends, scenario)
    results = benchmark_differentiation(backends, scenario)
    # keep the backend whose benchmark trial had the lowest runtime
    best_trial = argmin(trial -> trial.time, results)
    return best_trial.backend  # currently doesn't work but only needs minor modifications
end
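
For example, with a handful of candidate backends (the candidate list below is just an illustration, and the scenario construction is elided):

using ADTypes: AutoForwardDiff, AutoReverseDiff, AutoZygote

candidates = [AutoForwardDiff(), AutoReverseDiff(), AutoZygote()]
backend = fastest_backend(candidates, scenario)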

Is that similar to what you had in mind?

As for a heuristic to select backends, I think benchmarking is indeed more reasonable. We do have internal traits to check whether mutation is supported, though; we could expose them.
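
To illustrate what exposing them could enable (the trait name below is made up, it is not DI's actual internal API), selection by property would then just be a filter instead of a benchmark:

# `supports_mutation` is a hypothetical exposed trait, not a real DI function
function property_based_choice(backends)
    candidates = filter(supports_mutation, backends)
    return first(candidates)  # or any other tie-breaking rule
end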

As for listing the available backends, I thought about it some more and it's not obvious what the right method is. I can check whether ForwardDiff.jl is loaded, but then what is my "prototypical" ForwardDiff backend object: how many chunks does it have? Same for ReverseDiff and compiled tape.
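
For the sake of argument, a sketch of such a listing could look like the following, where the default constructors are exactly the arbitrary choice I'm worried about:

using ADTypes

function loaded_backends()
    loaded = Set(nameof(m) for m in values(Base.loaded_modules))
    backends = ADTypes.AbstractADType[]
    :ForwardDiff in loaded && push!(backends, AutoForwardDiff())   # automatic chunk size
    :ReverseDiff in loaded && push!(backends, AutoReverseDiff())   # no compiled tape
    :Zygote in loaded && push!(backends, AutoZygote())
    :Enzyme in loaded && push!(backends, AutoEnzyme())
    return backends
end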

I see how this would be useful for

  • users who want a very high-level interface that abstracts away as much as possible
  • downstream packages that don’t know which functions will be passed to their interface and want to use runtime heuristics for maximum performance (e.g. the SciML ecosystem, as you mentioned)

Both Guillaume and I favor explicitness in DI and try to avoid macros and generated code.
Something I could envision, very close to Guillaume's suggestion, is a thin wrapper around DifferentiationInterfaceTest.jl's test_differentiation that returns a backend according to an optimality criterion (runtime, allocations, ...) selected by the user.

The following would allow you to take the human reading a table of benchmarks out of the loop:

using DifferentiationInterface

backend = autobackend(GradientScenario(f; x=x))  # pick the backend once

for _ in 1:100000
    grad = gradient(f, backend, x)  # then reuse it for every gradient call
    # ...
end

This autobackend function could be called by downstream packages to define default configurations for e.g. subsequent solver calls.

I see I wrote too slowly. 😄

I can check whether ForwardDiff.jl is loaded, but then what is my "prototypical" ForwardDiff backend object: how many chunks does it have? Same for ReverseDiff and compiled tape.

I personally would be ok with such a high-level function being "suboptimal", as long as advanced users can manually request benchmarks of several ForwardDiff backends with different chunk sizes.
In this specific case, we could default to the pickchunksize heuristic used by ForwardDiff.
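
Concretely, that default would be something like this (a sketch, assuming the input length is known up front):

using ForwardDiff, ADTypes

x = rand(100)
chunksize = ForwardDiff.pickchunksize(length(x))  # same heuristic ForwardDiff uses internally
backend = AutoForwardDiff(; chunksize=chunksize)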

Is that similar to what you had in mind?

I'm sure it would be helpful for new AD users, but what I wish for is an AutoAuto() or AutoAuto(list_of_backends) backend that selects a backend at runtime and uses memoization for future calls.
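
Very roughly, and purely as an illustration (none of these names exist anywhere), I imagine something like:

using ADTypes

mutable struct AutoAuto
    candidates::Vector{Any}   # e.g. [AutoForwardDiff(), AutoZygote()]
    chosen::Any               # memoized winner, `nothing` until the first call
end

AutoAuto(candidates::AbstractVector) = AutoAuto(collect(Any, candidates), nothing)

function select!(auto::AutoAuto, f, x)
    if auto.chosen === nothing
        # a real implementation would benchmark every candidate on (f, x);
        # the placeholder below just keeps the first one
        auto.chosen = first(auto.candidates)
    end
    return auto.chosen
end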

I think implementing this would make more sense in downstream packages that take DI as a dependency, since the automatic selection is heavily influenced by the type of problem you have.

Adrian and I are both against magic tricks like memoization, so if we do offer this functionality, it will be a separate choice function, not a backend object with a hidden mechanism. But at the moment it doesn't fit well within our benchmark framework, so I would leave it to downstream users.