Aircraft Trajectory Optimisation and Control using Reinforcement Learning

This repository explores deep reinforcement learning for the multi-agent aircraft trajectory optimisation problem in cargo drone networks. Specifically, a Deep Q-Network (DQN) is trained on a three-dimensional state space that represents a congested urban environment with dynamic obstacles. The problem is formalised as a Markov Decision Process (MDP), and various flight and control parameters are varied between training simulations to study their effects on agent performance. Both fully observable MDPs (FOMDPs) and partially observable MDPs (POMDPs) are formulated to understand the role of shaping reward signals on training performance.
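As a rough illustration of the learning signal a DQN is trained on, the sketch below computes standard one-step temporal-difference targets for a batch of transitions. It is not taken from this repository's code: the function name `td_targets`, its arguments, and the discount factor value are all assumptions made for the example.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute one-step TD targets: r + gamma * max_a' Q(s', a').

    rewards       -- list of immediate rewards, one per transition
    next_q_values -- list of lists; Q-value estimates for every action in s'
    dones         -- list of bools; True if the transition ended the episode
    gamma         -- discount factor (hypothetical default)
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        # Terminal transitions do not bootstrap from the next state.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


# Two transitions: the second one is terminal, so its target is just the reward.
batch_targets = td_targets(
    rewards=[1.0, 0.0],
    next_q_values=[[0.5, 2.0], [1.0, 3.0]],
    dones=[False, True],
    gamma=0.5,
)
```

In a full DQN, `next_q_values` would typically come from a separate target network, and the online network would be regressed towards these targets.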

While conventional trajectory optimisation techniques are evaluated on path length or travel time, I aim to incorporate economic analysis by considering tangible and intangible sources of cost, such as the cost of energy, the Value of Time (VOT) and the Value of Reliability (VOR). Comparing outcomes that integrate multiple cost sources gives a better gauge of the impact of various parameters on efficiency.
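One way to combine these cost sources is a linear generalised cost, sketched below. This is an assumed formulation rather than the one used in this repository: the function `trip_cost`, the unit prices, and the choice to apply VOR to the standard deviation of arrival time are all hypothetical.

```python
def trip_cost(energy_kwh, travel_time_h, arrival_std_h,
              energy_price=0.2, vot=10.0, vor=15.0):
    """Linear generalised cost of one delivery (all defaults are placeholders).

    energy_kwh    -- energy consumed on the trip
    travel_time_h -- travel time in hours
    arrival_std_h -- standard deviation of arrival time in hours,
                     used here as an assumed proxy for unreliability
    energy_price  -- cost per kWh
    vot           -- Value of Time, cost per hour of travel
    vor           -- Value of Reliability, cost per hour of arrival-time spread
    """
    energy_cost = energy_price * energy_kwh
    time_cost = vot * travel_time_h
    reliability_cost = vor * arrival_std_h
    return energy_cost + time_cost + reliability_cost


# Example with made-up numbers: 2 kWh, 30 minutes, 6 minutes of spread.
example_cost = trip_cost(2.0, 0.5, 0.1, energy_price=0.5, vot=10.0, vor=20.0)
```

Keeping the three terms separate makes it straightforward to report how each cost source responds when a flight or control parameter changes.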

To further explore the feasibility of cargo drone networks, the trained agents are also evaluated in multi-agent Point-to-Point and Hub-and-Spoke network environments. In these simulations, delivery orders are generated by a discrete event simulator whose arrival rate is varied to investigate the effect of travel demand on economic costs.
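The order generation step could be sketched as follows, assuming orders arrive as a Poisson process so that inter-arrival times are exponentially distributed. The source does not state which arrival distribution is used, so this is an assumption, as are the function name `generate_order_times` and its parameters.

```python
import random


def generate_order_times(arrival_rate, horizon, seed=None):
    """Return sorted order arrival times in [0, horizon).

    arrival_rate -- mean number of orders per unit time (must be > 0)
    horizon      -- length of the simulated period
    seed         -- optional seed for reproducible runs
    """
    rng = random.Random(seed)
    times = []
    # Exponential inter-arrival gaps with mean 1 / arrival_rate.
    t = rng.expovariate(arrival_rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(arrival_rate)
    return times


# Sweep the arrival rate to emulate different levels of travel demand.
orders_by_rate = {
    rate: generate_order_times(rate, horizon=10.0, seed=0)
    for rate in (0.5, 2.0, 8.0)
}
```

Each arrival time would then trigger a delivery request in the network environment, with higher rates producing more simultaneous agents and therefore more congestion.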

Simulation results point to the importance of reward signal engineering, as reward signals play a crucial role in shaping agent behaviour. The results also show an increase in costs in environments where congestion and arrival time uncertainty arise from the presence of other agents in the network. These results can inform the development of future aerial cargo drone networks, as overall economic costs can be reduced by selecting parameter values based on the generated results.

RongjianLiang / aircraft_trajectory_optimisation
