Pontryagin Maximum principle to solve a contol problem with the python module torchdiffeq.
The script lqr.py
parametrises the policy \alpha in the linear-quadratic control problem
that minimises the utility function
where C,D,R
are positive semi-definite matrices of size (d,d)
where d
is the dimension of the process X_t
, and \alpha_s is a neural network that we optimise to minimise the utility function.
usage: lqr.py [-h] [--base_dir BASE_DIR] [--device DEVICE] [--use_cuda]
[--seed SEED] [--batch_size BATCH_SIZE] [--d D]
[--hidden_dims HIDDEN_DIMS [HIDDEN_DIMS ...]] [--n_iter N_ITER]
[--T T] [--steps STEPS] [--step_size_solver STEP_SIZE_SOLVER]
[--visualize]
- Example Training of 2-dimensional LQR problem:
python lqr.py --d 2 --use_cuda
- Visualizing the results:
python lqr.py --d 2 --visualize