To run DQN training:
python -m dqn.train
To start TensorBoard during training:
tensorboard --logdir model_outputs
- calibrate the actions so that they are well-distributed targets
- add a single no-move target location
- LSTM policy
- ConvLSTM
- Generalized Advantage Estimation
- Proximal Policy Optimization
- Population-based optimization
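
For the Generalized Advantage Estimation item above, a minimal NumPy sketch of the standard backward recursion might look like the following. It is not taken from this repository: the function name `compute_gae`, its arguments, and the default `gamma` and `lam` values are assumptions chosen for illustration, and `values` is assumed to carry one extra bootstrap entry for the state after the last step.

```python
import numpy as np


def compute_gae(rewards, values, dones, gamma=0.99, lam=0.95):
    """Hypothetical helper: GAE advantages and value targets for one rollout.

    rewards, dones: length T. values: length T + 1 (last entry is the
    bootstrap value of the state following the final step).
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)

    advantages = np.zeros_like(rewards)
    gae = 0.0
    # Walk the rollout backwards, accumulating discounted TD residuals.
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - dones[t]  # stop bootstrapping at episode ends
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae

    # Value-function regression targets.
    returns = advantages + values[:-1]
    return advantages, returns
```

With `lam=0` this reduces to one-step TD residuals, and with `lam=1` to discounted Monte Carlo returns minus the value baseline. The resulting advantages could then feed a Proximal Policy Optimization update.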