Repositories under the off-policy topic:
CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning
DrQ: Data regularized Q
RAD: Reinforcement Learning with Augmented Data
PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.
⚡ Flashbax: Accelerated Replay Buffers in JAX
ExORL: Exploratory Data for Offline Reinforcement Learning
Official PyTorch code for "Recurrent Off-policy Baselines for Memory-based Continuous Control" (DeepRL Workshop, NeurIPS 21)
Causal RL: Reverse-Environment Network Integrated Actor-Critic Algorithm
Solving a simple 4×4 Gridworld, similar to OpenAI Gym's FrozenLake, using Q-learning, a temporal-difference reinforcement learning method.
PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
Actor Prioritized Experience Replay
Off-Policy Correction for Actor-Critic Algorithms in Deep Reinforcement Learning
TensorFlow implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
DDPG and D4PG Continuous Control
This repository contains all of the Reinforcement Learning-related projects I've worked on. The projects are part of the graduate course at the University of Tehran.
This repository contains implementations of a wide variety of reinforcement learning projects covering bandit algorithms, MDPs, distributed RL, and deep RL. It includes both university projects and projects pursued out of personal interest in reinforcement learning.
PyTorch implementation of our work: "Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning"
Stochastic Weighted Twin Delayed Deep Deterministic Policy Gradient (SWTD3)
Collection of codes pertaining to my research in model-free RL algorithms.
A novel method to incorporate existing policy (Rule-based control) with Reinforcement Learning.
Contains PyTorch implementations of the following off-policy actor-critic algorithms
Repository containing basic algorithms implemented in Python.
Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms
An Optimistic Approach to the Q-Network Error in Actor-Critic Methods
Temporal Difference Method - Q-Learning Implementation for FrozenLake Grid Problem
My coursework for the CS294 Deep Reinforcement Learning course, taught by Sergey Levine at UC Berkeley.
PyTorch implementation of DICE algorithms
PyTorch implementation of our work: "Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning"
PROJECT MIGRATED TO CODEBERG - Reinforcement Learning in Multiplicative Domains
An RL agent that learns to play Doom's Deadly Corridor scenario, based on DDQN and PER.