GPU-only environments and algorithms
theOGognf opened this issue · comments
One of the end goals of this project is to enable the development of RL agents on a single-node GPU in minutes. This is achievable using a combination of methods from IsaacGym, RLlib, and TorchRL: tensordicts can be passed between modules, with all operations kept exclusively on the GPU, and customizations that enable user-defined models and action distributions.
Thinking about this, a model's view requirements must all produce the same batch size when the batch and time/sequence dimensions are expanded together. Otherwise, it isn't clear which view requirement takes precedence, and it's very difficult to determine a size that works for all of them. We can validate this and print an informative error prior to any forward passes or training steps.
This would probably mean having a function that computes the resulting size of each view requirement; if the sizes aren't all equal, we raise that error.
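A rough sketch of that validation (all names here are hypothetical, not existing project or RLlib API; it assumes each view requirement is summarized by a sliding-window length over the time dimension):

```python
def view_size(batch: int, seq_len: int, window: int) -> int:
    """Hypothetical: number of rows produced when a view requirement
    with the given window length is expanded over batch and time dims."""
    # Each sequence of length seq_len yields (seq_len - window + 1) windows.
    return batch * (seq_len - window + 1)


def validate_view_requirements(
    batch: int, seq_len: int, windows: dict[str, int]
) -> int:
    """Compute each view requirement's expanded size and raise an
    informative error before any forward pass if they disagree."""
    sizes = {key: view_size(batch, seq_len, w) for key, w in windows.items()}
    if len(set(sizes.values())) != 1:
        raise ValueError(
            f"View requirements produce mismatched batch sizes: {sizes}"
        )
    return next(iter(sizes.values()))
```

For example, `validate_view_requirements(8, 16, {"obs": 4, "prev_action": 4})` succeeds, while mixing window lengths of 4 and 2 raises before any training step.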
We also need to optionally return views from the policy sample and only apply the view requirements if they haven't been applied already.
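One way to guard against double application (again a hypothetical sketch, with a made-up flag key rather than anything in the project today) is to mark the sample once views have been applied and make a second call a no-op:

```python
def maybe_apply_views(sample: dict, apply_fn, applied_key: str = "_views_applied") -> dict:
    """Apply view requirements to a policy sample at most once.

    `apply_fn` is whatever callable expands the view requirements; the
    `applied_key` flag records that it has already run on this sample.
    """
    if sample.get(applied_key):
        return sample  # Views already applied; return the sample as-is.
    out = apply_fn(sample)
    out[applied_key] = True
    return out
```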