MORL: Multi-Objective Reinforcement Learning

Team Members

Frazier Baker - bakerfn@mail.uc.edu
Jeremiah Greer - greerji@mail.uc.edu

Advisor

Dr. Fred Annexstein - annexsfs@mail.uc.edu

Background

Reinforcement learning often requires an agent to learn several tasks at once, each with its own reward signal. Balancing these rewards can be formulated as a multi-objective optimization problem.

Problem Statement

We seek to find a novel multi-objective optimization method for use in reinforcement learning.

Current Solutions

Current methods for multi-objective optimization involve linear combinations of the reward terms; however, balancing each of the rewards has proven difficult.
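As a rough illustration of why balancing is hard, the sketch below collapses a vector of per-objective rewards into one scalar with a weighted sum. The function name `scalarize`, the two example objectives, and all weight values are hypothetical and chosen only for illustration; they are not taken from any particular library or from this project's code.

```python
def scalarize(rewards, weights):
    """Collapse per-objective rewards into one scalar via a weighted sum."""
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    return sum(w * r for w, r in zip(weights, rewards))


# Hypothetical two-objective step reward: forward progress and an energy penalty.
step_reward = [1.0, -0.25]

# The same step is scored very differently depending on the hand-tuned weights.
speed_focused = scalarize(step_reward, [1.0, 0.1])
energy_focused = scalarize(step_reward, [0.1, 5.0])
```

With these made-up numbers the same transition is rewarded under one weighting and penalized under the other, which is the tuning problem described above.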

OpenAI.com has suggested that filter methods may be useful for multi-objective optimization, and that they have been studied very little in relation to reinforcement learning.

In addition, many search and optimization methods from other subfields of artificial intelligence have yet to be explored in this setting.

Relevant Experience

Jeremiah Greer

See Jeremiah's Bio

Developed Data Analysis tool for World Bank in first co-op
Currently taking Intelligent Data Analysis
Currently taking Machine Learning (Ng) and Neural Networks (Hinton) courses on Coursera
Attended Deep Learning Seminar with Dr. Annexstein

Frazier Baker

See Frazier's Bio

Experience in Bioinformatics applications of Data Analysis and Machine Learning
- Multiple journal publications and conference proceedings.
Experience in Computational Physics Research with Dr. Gabriela Popa at Ohio University Zanesville.
- Multiple conference proceedings.
Took CS 6052: Intelligent Data Analysis from Dr. Raj Bhatnagar in Fall of 2015.
Currently taking CS 4033: Artificial Intelligence.
Participated in Independent Study on Deep Learning with Dr. Fred Annexstein in Spring 2017.
Currently taking Parallel Computing with Dr. Fred Annexstein, exploring CUDA and distributed computing.

Our Approach

Our plan is to explore the use of filter methods for multi-objective optimization and reinforcement learning.
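One common reading of a filter method is to keep a set of reward vectors that no other stored vector dominates, and to accept a new candidate only if nothing in that set dominates it. The sketch below follows that reading. It is an assumption about what "filter method" means here, not this project's implementation, and the names `dominates`, `filter_accepts`, and `update_filter` are hypothetical. Higher values are treated as better for every objective.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def filter_accepts(filter_set, candidate):
    """Accept a candidate unless a stored entry dominates it or is identical to it."""
    return not any(kept == candidate or dominates(kept, candidate) for kept in filter_set)


def update_filter(filter_set, candidate):
    """Return (new_filter, accepted); accepted candidates evict the entries they dominate."""
    if not filter_accepts(filter_set, candidate):
        return filter_set, False
    pruned = [kept for kept in filter_set if not dominates(candidate, kept)]
    pruned.append(candidate)
    return pruned, True


# Hypothetical per-policy returns on two objectives, fed in one at a time.
policy_filter = []
for returns in [(1.0, 1.0), (2.0, 0.5), (0.5, 0.5), (2.0, 2.0)]:
    policy_filter, accepted = update_filter(policy_filter, returns)
```

No weights appear anywhere in this sketch: candidates are compared objective by objective, so the trade-off between rewards does not have to be fixed in advance.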

In addition, we will explore other areas of AI, particularly Genetic Programming, to find new methods of feature selection.

Next, we hope to test both approaches using the publicly available OpenAI Gym MuJoCo environments.

Finally, we hope to explore applications to other datasets that would demonstrate real-world benefits, as time permits.

Task List

Link
