pytorch deep-reinforcement-learning multi-agent-reinforcement-learning maddpg

horna-cuevas-rally

Two deep reinforcement learning agents that play tennis, trained using the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm.

Project Details

Horna and Cuevas are deep reinforcement learning agents designed for a tailored version of the Tennis environment from the Unity ML-Agents Toolkit.

Each agent perceives the state as a vector of 24 elements, which contains information about the position and velocity of the ball and of the agent's own racket. Each action is a vector of 2 real-valued elements between -1 and 1, representing horizontal and vertical movement.
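As a rough illustration of these dimensions, the sketch below builds one action per agent and clips it to the valid range. The constant and function names (STATE_SIZE, clip_action, random_actions) are hypothetical and are not taken from tennis_agent.py; random sampling stands in for a trained policy.

```python
import random

# Sizes stated in the project description.
STATE_SIZE = 24   # elements in each agent's observation vector
ACTION_SIZE = 2   # horizontal and vertical movement
NUM_AGENTS = 2    # Horna and Cuevas


def clip_action(action, low=-1.0, high=1.0):
    """Clamp every component of an action to the range [low, high]."""
    return [max(low, min(high, a)) for a in action]


def random_actions(rng):
    """Hypothetical stand-in for a policy: one random action per agent."""
    return [
        clip_action([rng.uniform(-1.5, 1.5) for _ in range(ACTION_SIZE)])
        for _ in range(NUM_AGENTS)
    ]


if __name__ == "__main__":
    actions = random_actions(random.Random(0))
    print(len(actions), "agents,", len(actions[0]), "values per action")
```

Clipping matters in practice because exploration noise added to a deterministic policy's output can push an action outside the range the environment accepts.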

Each agent receives a reward of +0.1 every time it passes the ball over the net. If the ball does not cross the net or falls outside the court, the offending agent is penalised with -0.01. We consider the environment solved when the higher of the two agents' episode scores, averaged over 100 episodes, reaches 0.5.
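The solving criterion can be sketched as follows. This is a minimal illustration, not the code in tennis_manager.py: the function names are hypothetical, and it assumes that an agent's episode score is the undiscounted sum of its rewards and that the average is taken over the 100 most recent episodes.

```python
SOLVED_THRESHOLD = 0.5  # target average score
WINDOW = 100            # number of episodes averaged


def episode_score(rewards_per_step):
    """Return the higher of the two agents' total rewards for one episode.

    rewards_per_step is a list of (reward_agent_0, reward_agent_1) pairs,
    one pair per time step.
    """
    totals = [0.0, 0.0]
    for r0, r1 in rewards_per_step:
        totals[0] += r0
        totals[1] += r1
    return max(totals)


def is_solved(scores, window=WINDOW, threshold=SOLVED_THRESHOLD):
    """Check whether the mean of the last `window` scores reaches `threshold`."""
    if len(scores) < window:
        return False  # not enough episodes played yet
    recent = list(scores)[-window:]
    return sum(recent) / window >= threshold


if __name__ == "__main__":
    # Agent 0 passes the ball twice; agent 1 misses once.
    steps = [(0.1, 0.0), (0.0, -0.01), (0.1, 0.0)]
    print(episode_score(steps))
    print(is_solved([0.6] * 100))
```

Taking the maximum per episode before averaging means a single strong agent is enough to raise the score, so the metric measures the length of the rally rather than the balance between players.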

Getting Started

Before running the agents, complete the following steps:

Clone this repository.
Download the Tennis environment appropriate for your operating system (available here).
Place the environment file in the cloned repository folder.
Set up an appropriate Python environment (instructions available here).

Instructions

To run and train the agents, open Tennis.ipynb. The repository also contains:

tennis_agent.py contains the agents' code.
tennis_manager.py has the code for training the agents.

