nnstreamer / nntrainer

NNtrainer is a software framework for training neural network models on devices.

DRL algorithm with API

eightreal opened this issue · comments

Hello, dear contributors,
I noticed that the DQN application doesn't use the API header (.h) files.
Also, only predefined loss functions exist, so since I want to develop a DQN method, I would like you to confirm the following:

  1. Is there an interface or method to customize the loss function?
  2. Can I copy the header files you used in Applications/DRL, and if so, which release package should I use? nntrainer-devel?

Or do you have better advice?

:octocat: cibot: Thank you for posting issue #2528. The person in charge will reply soon.

  1. Example: https://github.com/nnstreamer/nntrainer/blob/main/Applications/Custom/mae_loss.cpp (a framework-agnostic sketch of a DQN-style loss follows below).
  2. Yes, you can. A devel package is also recommended if you want to set up a CI/CD system.
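
For illustration only (not nntrainer API): below is a framework-agnostic sketch of the Huber loss often used for DQN TD errors, together with its gradient. Porting it along the lines of mae_loss.cpp would mean computing the first function in the custom layer's forward pass and returning the second as the derivative passed back to the previous layer; the function names here are my own.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative Huber loss (common for DQN TD errors). Plain C++, not nntrainer
// API: in a custom loss layer, huber_loss() corresponds to the forward
// computation and huber_grad() to the derivative returned to the layer below.
float huber_loss(const std::vector<float> &pred,
                 const std::vector<float> &target, float delta = 1.0f) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < pred.size(); ++i) {
    float err = pred[i] - target[i];
    sum += std::fabs(err) <= delta ? 0.5f * err * err
                                   : delta * (std::fabs(err) - 0.5f * delta);
  }
  return sum / pred.size();
}

std::vector<float> huber_grad(const std::vector<float> &pred,
                              const std::vector<float> &target,
                              float delta = 1.0f) {
  std::vector<float> grad(pred.size());
  for (std::size_t i = 0; i < pred.size(); ++i) {
    float err = pred[i] - target[i];
    grad[i] = (std::fabs(err) <= delta ? err
                                       : delta * (err > 0 ? 1.0f : -1.0f)) /
              pred.size();
  }
  return grad;
}
```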

Another question.
When I call the run interface and save the model, is the current training state (such as gradient information) also saved? Is it possible to continue training after the model is loaded later?

Hello! Thank you for your question. Here are my answers:
First, you can save the model after training; however, saving the gradient information is not supported.
Second, yes, it is possible to continue training after the model is loaded.

You can checkpoint and continue the training process, but it is not based on saving gradients.
You can do epoch-based checkpointing (that is what most nntrainer mobile applications do), but I'm not sure about finer-grained checkpointing.
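
For illustration, here is a minimal resume-training sketch against the ml::train C++ API from nntrainer-devel. The layer names, shapes, properties, and file paths are my own illustrative choices, and exact header locations and enum names may differ between releases; only the network weights travel through the checkpoint file, so optimizer/gradient state starts fresh.

```cpp
// Minimal resume-training sketch (assumes nntrainer-devel headers on the
// include path, e.g. -I/usr/include/nntrainer). Layer names, shapes and
// file paths are illustrative only.
#include <layer.h>
#include <model.h>
#include <optimizer.h>

int main() {
  auto model = ml::train::createModel(ml::train::ModelType::NEURAL_NET);

  // Rebuild exactly the same graph the checkpoint was trained with.
  model->addLayer(ml::train::createLayer(
    "input", {"name=input0", "input_shape=1:1:4"}));
  model->addLayer(ml::train::createLayer(
    "fully_connected", {"name=fc0", "unit=2", "activation=softmax"}));
  model->addLayer(ml::train::createLayer("mse")); // illustrative loss layer
  model->setOptimizer(
    ml::train::createOptimizer("adam", {"learning_rate=0.001"}));
  model->setProperty({"batch_size=16", "epochs=5"});

  model->compile();
  model->initialize();

  // Load the epoch-level checkpoint (weights only), then keep training.
  model->load("checkpoint.bin", ml::train::ModelFormat::MODEL_FORMAT_BIN);
  // ... attach a training dataset here, as in the application examples ...
  model->train();

  // Save the new weights; gradient/optimizer state is not stored in this file.
  model->save("checkpoint.bin", ml::train::ModelFormat::MODEL_FORMAT_BIN);
  return 0;
}
```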

OK, thanks for your reply.
Another question: is there any method for a model copy and a Polyak update?

For model copy, if there is no copy constructor for the model class and the default behavior does not do what you want, you may try "original.save()" followed by "cloned.load()".
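
A minimal sketch of that save()/load() round trip, assuming the ml::train C++ API and that both models were built with the identical graph; the file path and function name are illustrative:

```cpp
// Hypothetical "clone via serialization" sketch. Both models must already be
// compiled/initialized with the same graph; only weights pass through the file.
#include <string>
#include <model.h>

void clone_weights(ml::train::Model &src, ml::train::Model &dst) {
  const std::string path = "cloned_weights.bin"; // illustrative temp path
  src.save(path, ml::train::ModelFormat::MODEL_FORMAT_BIN);
  dst.load(path, ml::train::ModelFormat::MODEL_FORMAT_BIN);
}
```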

For the Polyak update, it appears that the DQN application (or the simple "reinforcement learning" app) has its own "custom" op, but I'm not too sure about this. I guess @jijoongmoon may answer this when he returns from his trip.
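
For reference, a Polyak (soft) target update just blends the two weight sets, target ← τ·online + (1−τ)·target. The sketch below is framework-agnostic plain C++ over raw float buffers; it is not an existing nntrainer helper, and how to obtain the weight tensors of two nntrainer models is not shown here.

```cpp
#include <cstddef>
#include <vector>

// Framework-agnostic Polyak (soft) update sketch:
//   target <- tau * online + (1 - tau) * target
// Only the arithmetic a target-network update needs is illustrated.
void polyak_update(std::vector<float> &target,
                   const std::vector<float> &online, float tau) {
  for (std::size_t i = 0; i < target.size(); ++i)
    target[i] = tau * online[i] + (1.0f - tau) * target[i];
}
```

With τ = 1 this reduces to the hard copy that the save()/load() approach above gives you.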

Hello, I checked the reinforcement learning app.
You update the network by saving and loading a file, not with a Polyak update.
Could you help check this?
And if there is an implementation of the Polyak update, could you point me to its path and code lines?