imm-rl-lab / differential_game_gym

Examples of finite-horizon zero-sum differential games implemented as environments for reinforcement learning algorithms


Differential Game Gym

The repository contains examples of finite-horizon zero-sum differential games implemented as environments (Markov games) for multi-agent reinforcement learning algorithms. Since the problems are originally described by differential equations, a uniform time discretization with step dt is used to formalize them as Markov games. In addition, it is important to emphasize that, in games with a finite horizon, the agents' optimal policies depend not only on the phase vector $x$, but also on the time $t$. Thus, we obtain Markov games, parameterized by dt, with a continuous state space $S$ containing states $s=(t,x)$ and a continuous action space $A$.
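As a sketch of this formalization (with invented 1-D dynamics dx/dt = u - v and an invented terminal payoff, not an environment taken from the repository), a uniform Euler time discretization might look like:

```python
import numpy as np

class SimpleZeroSumGame:
    """Hypothetical 1-D finite-horizon zero-sum game illustrating the
    discretization scheme; a sketch, not code from this repository."""

    def __init__(self, dt=0.1, terminal_time=1.0):
        self.dt = dt                    # time-discretization step
        self.terminal_time = terminal_time
        self.state_dim = 2              # state s = (t, x)
        self.u_action_dim = 1
        self.v_action_dim = 1

    def reset(self):
        # deterministic initial state s = (t0, x0)
        self.t, self.x = 0.0, 1.0
        return np.array([self.t, self.x])

    def step(self, u_action, v_action):
        # Euler step for dx/dt = u - v with discretization step dt
        self.x += self.dt * (float(u_action) - float(v_action))
        self.t += self.dt
        done = self.t > self.terminal_time
        # zero-sum payoff: terminal reward -|x| for the first agent
        # (and, implicitly, +|x| for the second)
        reward = -abs(self.x) if done else 0.0
        return np.array([self.t, self.x]), reward, done, {}
```

Because time $t$ is part of the state $s=(t,x)$, a policy conditioned on the full state can be time-dependent, as finite-horizon optimality requires.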

Interface

The finite-horizon zero-sum differential games are implemented as environments (Markov games) with an interface close to OpenAI Gym, with the following attributes and methods:

  • state_dim - the state space dimension;
  • u_action_dim - the action space dimension of the first agent;
  • v_action_dim - the action space dimension of the second agent;
  • terminal_time - the terminal time (horizon) of the game;
  • dt - the time-discretization step;
  • reset() - to get an initial state (deterministic);
  • step(u_action, v_action) - to get next_state, the current reward, done (True if t > terminal_time, otherwise False), and info;
  • virtual_step(state, u_action, v_action) - to get the same as from step(u_action, v_action), but computed from the explicitly passed state instead of the environment's internal one.
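To illustrate how this interface is used, here is a minimal stand-in environment with invented dynamics (the class name, dynamics, and payoff are hypothetical, chosen only so the example is self-contained) together with a typical rollout loop:

```python
import numpy as np

class DummyGame:
    """Minimal stand-in following the interface above; the dynamics
    and payoff are invented for illustration."""
    state_dim, u_action_dim, v_action_dim = 2, 1, 1
    terminal_time, dt = 1.0, 0.1

    def reset(self):
        # deterministic initial state s = (t, x)
        self.t, self.x = 0.0, 0.0
        return np.array([self.t, self.x])

    def step(self, u_action, v_action):
        self.x += self.dt * (float(u_action) - float(v_action))
        self.t += self.dt
        done = self.t > self.terminal_time
        return np.array([self.t, self.x]), -abs(self.x) * self.dt, done, {}

    def virtual_step(self, state, u_action, v_action):
        # same transition as step(), but computed from an explicitly
        # given state (t, x) rather than the internal one
        self.t, self.x = float(state[0]), float(state[1])
        return self.step(u_action, v_action)

# Typical rollout with two (here random) policies
env = DummyGame()
state, done, total_reward = env.reset(), False, 0.0
while not done:
    u = np.random.uniform(-1.0, 1.0)   # first agent's action (scalar, since u_action_dim = 1)
    v = np.random.uniform(-1.0, 1.0)   # second agent's action
    state, reward, done, info = env.step(u, v)
    total_reward += reward
```

The virtual_step method is convenient for algorithms that need transitions from arbitrary states (e.g. for bootstrapping or model-based lookahead) without disturbing the rollout in progress.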


License: Apache License 2.0


Languages: Python 100.0%