q-learning reinforcement-learning sailing sailing-simulator machine-learning

AI Learns to Sail

Code for the blog post "AI learns to sail upwind" on the ML&Neuro Blog.

A Q-learning implementation using average reward of the sailing upwind task. Tackles two tasks: open sea and sailing in the channel. The open sea just requires to select the appropriate angle to sail along. The channel task requires the agent to learn how to tack.

Demo

A live demo is available at DEMO.

Install

Clone the project.

git clone https://github.com/PPierzc/ai-learns-to-sail.git

open the cloned repository in a terminal

Tasks

There are currently 2 tasks: sailing in the open sea and in a channel. In both tasks the goal is to reach as far upwind as possible.

Open Sea

The first task is sailing on the open sea. Here you can go wherever you want since there are no obstacles. Below is a visualization of the task. Blue lines show random trajectories, while the black shows a converged trajectory.

Run open sea task

To run the task execute:

python3 ./tasks/open_sea.py

You should get something like the result below.

Running: Random agent on open sea task: 100%|██████████| 100/100 [00:00<00:00, 275.73it/s]
Running: Train agent on open sea task: 100%|██████████| 500/500 [00:07<00:00, 67.93it/s]
Results from last 100 episodes
| ===== agent ===== | ===== mean ===== | ===== std ===== |
| Random            | 52.52            | 35.16           |
| Trained           | 107.89           | 10.36           |

Channel

The second task involves sailing in a tight channel. Following a straight line is no longer the optimal solution as it would lead to a crash with a side pier. Instead the agent has to learn how to tack to avoid getting destroyed.

Below you will find a visualisation of the task along with a random solution (blue) and an optimal solution (black).

Run channel task

To run the task execute:

python3 ./tasks/channel.py

You should get something like the result below.

Running: Random agent on channel sea task: 100%|██████████| 100/100 [00:00<00:00, 602.28it/s]
Running: Train agent on channel sea task: 100%|██████████| 1000/1000 [00:11<00:00, 86.73it/s]
Results from last 100 episodes
| ===== agent ===== | ===== mean ===== | ===== std ===== |
| Random            | 19.86            | 11.88           |
| Trained           | 63.94            | 19.47           |

License

Distributed under the MIT License

Contribution and Ideas

If you have any cool ideas that you think this agent should try then share them as an issue or create a pull request.

About

⛵AI that learns to sail upwind using Q-learning and average reward.

q-learning reinforcement-learning sailing sailing-simulator machine-learning

MIT License

Languages

Language:Python 100.0%