Charlie110 / mt-join-query-optimization-with-drl

Join Query Optimization with Deep Reinforcement Learning

This repository contains the DRL-based FOOP environment described in the following paper:

"Join Query Optimization with Deep Reinforcement Learning Algorithms" by Jonas Heitz and Kurt Stockinger, Zurich University of Applied Sciences, Winterthur, Switzerland

https://arxiv.org/abs/1911.11689

Basics

The source code is based on OpenAI Gym. The code is divided into two parts, Agent and Environment, which interact in an agent-environment feedback loop.
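
As a rough illustration of that feedback loop, the sketch below mimics the classic Gym interface (reset() and step() returning observation, reward, done, info) in plain Python. The class ToyJoinOrderEnv, the function run_episode and the constant reward are hypothetical and only serve the example; they are not the environments shipped in this repository.

```python
import random


class ToyJoinOrderEnv:
    """Gym-style toy environment: pick tables one by one to form a join order.

    Hypothetical stand-in for illustration, not the repository's environment.
    """

    def __init__(self, tables):
        self.tables = list(tables)
        self.remaining = []
        self.order = []

    def reset(self):
        # Start a new episode with no table joined yet.
        self.remaining = list(self.tables)
        self.order = []
        return self._observation()

    def _observation(self):
        # One flag per table: 1 if it is already part of the plan, else 0.
        return [1 if t in self.order else 0 for t in self.tables]

    def step(self, action):
        table = self.tables[action]
        if table not in self.remaining:
            raise ValueError(f"table {table!r} was already joined")
        self.remaining.remove(table)
        self.order.append(table)
        done = not self.remaining
        reward = 0.0  # placeholder; a real environment would derive this from plan cost
        return self._observation(), reward, done, {}


def run_episode(env, rng):
    """Run one episode with a random agent that only picks unused tables."""
    obs = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        valid = [i for i, used in enumerate(obs) if not used]
        action = rng.choice(valid)
        obs, reward, done, _info = env.step(action)
        total_reward += reward
    return env.order, total_reward


order, total = run_episode(ToyJoinOrderEnv(["title", "cast_info", "name"]), random.Random(0))
print(order, total)
```

A trained agent would replace the random choice with the action suggested by its policy; the loop itself stays the same.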

All the code was executed on Ubuntu version 18.04.1.

Environment

  • The folder /gym/envs/database/ defines the reinforcement learning environments for query planning, following the Gym environment template.
  • The folder /queryoptimization/ contains the files QueryGraph.py and cm1_postgres_card.py. QueryGraph.py parses simple SQL queries and implements the query planning logic. cm1_postgres_card.py returns the expected cost of a query object according to the cost model introduced in the paper “How good are query optimizers, really?” by Leis et al.
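
To give a feel for how a join order can be turned into a cost, here is a minimal sketch that sums the estimated sizes of all intermediate results of a left-deep plan. This is a deliberately simplified stand-in and not the cost model implemented in cm1_postgres_card.py. The function plan_cost, the cardinalities and the selectivities are made-up assumptions for the example.

```python
from itertools import permutations


def plan_cost(order, cardinality, selectivity):
    """Sum of estimated intermediate result sizes for a left-deep join order.

    cardinality: dict mapping table name to row count.
    selectivity: dict mapping frozenset({a, b}) to the join selectivity of a and b.
    Pairs without an entry are treated as cross products (selectivity 1.0).
    """
    joined = [order[0]]
    size = float(cardinality[order[0]])
    cost = 0.0
    for table in order[1:]:
        size *= cardinality[table]
        # Apply every join predicate between the new table and the tables joined so far.
        for other in joined:
            size *= selectivity.get(frozenset((other, table)), 1.0)
        joined.append(table)
        cost += size
    return cost


# Made-up statistics, for illustration only.
cardinality = {"title": 1000, "cast_info": 10000, "name": 500}
selectivity = {
    frozenset(("title", "cast_info")): 0.001,
    frozenset(("cast_info", "name")): 0.002,
}

good = plan_cost(("title", "cast_info", "name"), cardinality, selectivity)
bad = plan_cost(("title", "name", "cast_info"), cardinality, selectivity)
best = min(permutations(cardinality), key=lambda o: plan_cost(o, cardinality, selectivity))
print(good, bad, best)
```

In this toy setup, joining title with name first forms a cross product, so that order is far more expensive than one that follows the join predicates. A negated or otherwise transformed cost of this kind is one way to derive a reward for the agent.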

Agent

We used Ray RLlib to train our deep reinforcement learning models. The folder /agents/run/ contains the following files:

  • config.py: Contains the configurations of the models vanilla DQN (SIMPLE_CONFIG), DDQN (DOUBLE_PRIO) and PPO (PPO_CONFIG).
  • execute.py: Includes the code to execute a set of experiments.
  • models.py: Includes the neural nets with the action-masking layer.
  • masking_env_cros.py: Prepares the environments to deliver the information needed for the action-masking layer in models.py.
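
The idea behind action masking can be sketched without any deep learning framework: scores of invalid actions are set to negative infinity, so they are never chosen greedily and receive probability zero under a softmax. The helper names below (mask_logits, greedy_action, masked_softmax) are hypothetical. The actual masking layer lives inside the neural nets in models.py and may be implemented differently.

```python
import math


def mask_logits(logits, valid_mask):
    """Replace the scores of invalid actions with -inf."""
    return [x if ok else -math.inf for x, ok in zip(logits, valid_mask)]


def greedy_action(q_values, valid_mask):
    """Index of the best valid action (DQN-style greedy choice)."""
    if not any(valid_mask):
        raise ValueError("no valid action available")
    masked = mask_logits(q_values, valid_mask)
    return max(range(len(masked)), key=masked.__getitem__)


def masked_softmax(logits, valid_mask):
    """Probability distribution over valid actions only (policy-gradient style)."""
    if not any(valid_mask):
        raise ValueError("no valid action available")
    masked = mask_logits(logits, valid_mask)
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


q_values = [5.0, 1.0, 3.0]
valid = [False, True, True]  # action 0 is not allowed in this state
print(greedy_action(q_values, valid))
print(masked_softmax(q_values, valid))
```

Here action 0 has the highest raw score but is masked out, so the greedy choice falls on action 2. In a join ordering setting the mask would typically mark joins that are not possible in the current state of the plan.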

In the folder /agents/rollout/ you find the scripts to test trained models. The folder /agents/queries/ contains the queries used for the experiments.

Installation

  1. Install PostgreSQL
  2. Load the IMDb dataset according to the guide from the Join Order Benchmark (JOB)
  3. Install Python 3.*
  4. Clone repository
  5. Create a virtual environment and install the dependencies from requirements.txt in the project folder
  6. As a last step, update the DB connection details and the path of the query files in the __init__() and reset() functions of the environment files in /gym/envs/database/.
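
For step 6, one possible way to avoid hard-coding credentials is to build the connection string from the standard PostgreSQL environment variables (PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD). The sketch below is an assumption about how this could be done and not how the environment files are written. The function build_dsn and the default database name imdb are hypothetical.

```python
import os


def build_dsn(env=os.environ):
    """Build a libpq keyword/value connection string from environment variables.

    Hypothetical helper with assumed defaults; values containing spaces
    would need quoting, which is not handled here.
    """
    params = {
        "host": env.get("PGHOST", "localhost"),
        "port": env.get("PGPORT", "5432"),
        "dbname": env.get("PGDATABASE", "imdb"),  # assumed database name
        "user": env.get("PGUSER", "postgres"),
    }
    password = env.get("PGPASSWORD")
    if password:
        params["password"] = password
    return " ".join(f"{key}={value}" for key, value in params.items())


print(build_dsn({"PGHOST": "db.example.org", "PGUSER": "job"}))
```

The resulting string can be passed to a PostgreSQL driver that accepts libpq-style connection strings.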

Run

To check that Gym and Ray are installed correctly, run the script simple_corridor.py in /agents/run/. To execute the experiments, run execute.py in /agents/run/.
