seven8827's repositories
understanding-rl-vision
Code for the paper "Understanding RL Vision"
AEGD
Adaptive gradient descent with energy
gym-recording
Add-on package to gym, to record sequences of actions, observations, and rewards
episodic-curiosity
TensorFlow/Keras code and trained models for Episodic Curiosity Through Reachability
distribution-is-all-you-need
A basic probability distribution tutorial for deep learning researchers
safety-starter-agents
Basic constrained RL agents used in experiments for the "Benchmarking Safe Exploration in Deep Reinforcement Learning" paper.
reinforcement-learning
GridWorld solved with VI, PI, SARSA, Expected SARSA, SARSA Lambda, Q-learning, and Double Q-learning.
copg
This repository contains all code and experiments for the competitive policy gradient (CoPG) algorithm.
House3D
A realistic and rich 3D environment
tabular-methods
Tabular methods for reinforcement learning
reinforcement-learning-an-introduction-2
My solutions to Sutton & Barto - Reinforcement Learning
gradient_descent_viz
Interactive visualization of 5 popular gradient descent methods, with step-by-step illustration and a hyperparameter tuning UI
Visual-Explanation-in-Deep-Reinforcement-Learning
This project visualizes the knowledge of an agent trained by Deep Reinforcement Learning (paper will be published) using Backpropagation, Guided Backpropagation, Grad-CAM and Guided Grad-CAM. It shows why the agent performs a given action, that is, which pixels had the biggest influence on the agent's decision.
Deep-CFR
Scalable Implementation of Deep CFR and Single Deep CFR
RL-Double-Q-learning
A project comparing regular and double Q-learning reinforcement learning algorithms on different grid-world environments
why-clipping-accelerates
A PyTorch implementation of the LSTM experiments in the paper "Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity"
SV-RL
[ICLR 2020, Oral] Harnessing Structures for Value-Based Planning and Reinforcement Learning
Meta-MDP-Reproduction
Code for reproduction of "A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning", submitted for the replication track of the NeurIPS 2019 Reproducibility Challenge.
DR-PG
Code for the paper "From Importance Sampling to Doubly Robust Policy Gradient"
rlpy
A PyTorch implementation of RL algorithms. It currently includes TRPO, ClipPPO, A2C, GAIL, and ADCV.
multiagent-competition
Code for the paper "Emergent Complexity via Multi-agent Competition"
darknet_ros
YOLO ROS: Real-Time Object Detection for ROS
optimaltransport.github.io
Web site of the Computational Optimal Transport book
hand_eye_calibration
Python tools to perform time-synchronization and hand-eye calibration.
pytorch-a3c
PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".