ucbrise / actnn

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

Does this work for any model with ReLU activations?

kailashg26 opened this issue · comments

Hello, I'm trying to use ActNN with MADDPG (a multi-agent RL algorithm). The model has just 3 layers with ReLU activations. Can you let us know whether this mechanism will still give memory savings with smaller models like this?

Thank you.

Link to MADDPG: https://github.com/marlbenchmark/off-policy/tree/release/offpolicy/algorithms/maddpg

It should work with the ReLU activation function, but we haven't tested any RL tasks. Did you run into memory issues even with this small model?

When I train MADDPG, almost 10 GB of memory gets used, so I wanted to try some compression. It would be a great help if you could share any insights on how to test ActNN with MADDPG.

Thanks

You can try to follow the usage instructions and replace the layers in your model with ActNN layers. Start with higher bit widths (less aggressive compression) and check whether the lossy compression hurts the reward. A sketch of what that could look like is below.
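Here is a minimal sketch, assuming the `actnn.QModule` wrapper and `actnn.set_optimization_level` API from the repo README. The 3-layer MLP, its sizes, and the batch shape are hypothetical stand-ins for the actual MADDPG networks, not code from that repo:

```python
import torch
import torch.nn as nn
import actnn

# Hypothetical 3-layer ReLU network, similar to the one described above.
# Layer sizes are placeholders; substitute the real MADDPG actor/critic dims.
model = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 5),
).cuda()

# Pick a conservative optimization level first; the README documents
# levels L0-L5, where higher levels compress more aggressively.
actnn.set_optimization_level("L1")

# QModule swaps supported layers (Linear, ReLU, ...) for their ActNN
# counterparts so activations are stored compressed during training.
model = actnn.QModule(model)

# Train as usual, then compare peak GPU memory and reward to the baseline.
x = torch.randn(32, 64, device="cuda")
loss = model(x).sum()
loss.backward()
print(torch.cuda.max_memory_allocated() / 1e6, "MB")
```

Comparing `torch.cuda.max_memory_allocated()` and the training reward between the wrapped and unwrapped model at each level is a simple way to see whether the compression is worth it for a model this small.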

Thank you. I'll look into it and post here if I have any doubts