coreylowman / dfdx

Deep learning in Rust, with shape checked tensors and neural networks

Question / clarification regarding heap allocations

emchristiansen opened this issue

Asking here because I didn't see this explicitly covered in the docs.

Under what conditions can a user know that a given piece of dfdx code won't perform any heap allocations?
E.g., will I be good if I simply ensure all my tensors have static shapes?

I don't have a perfect apples-to-apples comparison, but I have the impression my dfdx code is maybe 2x to 4x slower than equivalent code written in JAX and compiled to XLA.
In the XLA version, the full data flow graph including the tensor shapes is statically known, so there's no need for dynamic allocation, and I'm wondering if that might be the source of the apparent difference in speed.
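For concreteness, the kind of fully static-shape code I have in mind looks roughly like this (a sketch only; it assumes dfdx's const-generic `Rank*` shapes and the `Cpu` device from the prelude, and the exact API may differ across versions):

```rust
use dfdx::prelude::*;

fn main() {
    let dev = Cpu::default();

    // Shapes are const-generic (Rank2<32, 64>), so they are fully known at compile time.
    let a: Tensor<Rank2<32, 64>, f32, _> = dev.sample_normal();
    let b: Tensor<Rank2<32, 64>, f32, _> = dev.sample_normal();

    // Elementwise add; mismatched shapes would be a compile error, not a runtime one.
    let c = a + b;
    let _ = c;
}
```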

Yeah, it's likely due to allocations. The only allocation-reducing optimization dfdx does is for unary/binary operations where one of the inputs has only one owner and the output has the same shape/strides as that input; in that case dfdx reuses the input's allocation for the output. If the input has multiple owners or a different shape, it can't do that. In every other case there will be an allocation.
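Here's a sketch of that ownership distinction (assuming the current prelude API; the buffer reuse is an internal optimization, so this only illustrates when it can apply, not a guarantee):

```rust
use dfdx::prelude::*;

fn main() {
    let dev = Cpu::default();
    let x: Tensor<Rank1<1024>, f32, _> = dev.sample_normal();

    // `x` is moved in and has a single owner, and relu's output has the same
    // shape/strides, so dfdx can reuse x's buffer for the result.
    let y = x.relu();

    // Cloning keeps a second owner of the underlying buffer alive, so the op
    // on the clone cannot reuse that buffer and must allocate a fresh output.
    let y2 = y.clone();
    let z = y2.relu();
    let _ = (y, z);
}
```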

If you want to add details about this to the docs, feel free to open a PR! The module-level tensor documentation is probably a good place for it.