Repositories under the off-policy topic.
CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning
DrQ: Data regularized Q
RAD: Reinforcement Learning with Augmented Data
PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.
⚡ Flashbax: Accelerated Replay Buffers in JAX
ExORL: Exploratory Data for Offline Reinforcement Learning
Official PyTorch code for "Recurrent Off-policy Baselines for Memory-based Continuous Control" (DeepRL Workshop, NeurIPS 21)
Solving a simple 4×4 Gridworld, similar to OpenAI Gym's FrozenLake, using Q-learning, a temporal-difference reinforcement learning method.
PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
Actor Prioritized Experience Replay
TensorFlow implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
Off-Policy Correction for Actor-Critic Algorithms in Deep Reinforcement Learning
DDPG and D4PG Continuous Control
This repository contains implementations of a wide variety of reinforcement learning projects covering bandit algorithms, MDPs, distributed RL, and deep RL. They include university projects and projects pursued out of personal interest in reinforcement learning.
This repository contains all of the Reinforcement Learning-related projects I've worked on. The projects are part of the graduate course at the University of Tehran.
Collection of code from my research on model-free RL algorithms.
A novel method for combining an existing policy (rule-based control) with reinforcement learning.
PyTorch implementation of our work: "Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning"
Stochastic Weighted Twin Delayed Deep Deterministic Policy Gradient (SWTD3)
Contains PyTorch implementations of the following off-policy actor-critic algorithms
PyTorch implementation of DICE algorithms
Repository containing basic algorithms implemented in Python.
Off-policy algorithm that uses both offline and online data
Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms
An Optimistic Approach to the Q-Network Error in Actor-Critic Methods
Contains a custom-built reinforcement learning environment and implementations of key RL algorithms such as Q-learning and SARSA, tested in scenarios including a drone navigation challenge and the Frozen Lake environment.
Temporal Difference Method - Q-Learning Implementation for FrozenLake Grid Problem
My coursework for the CS294 Deep Reinforcement Learning course, taught by Sergey Levine at UC Berkeley.
PyTorch implementation of our work: "Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning"
PROJECT MIGRATED TO CODEBERG - Reinforcement Learning in Multiplicative Domains
PyTorch implementation of our work: "Where is the Grass Greener? Revisiting Generalized Policy Iteration for Offline Reinforcement Learning"
PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"