off-policy

There are 36 repositories under the off-policy topic.

MishaLaskin / curl
CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning
deep-learning contrastive-loss contrastive-learning contrastive-predictive-coding deep-reinforcement-learning deep-rl reinforcement-learning reinforcement-learning-algorithms curl sac gpu off-policy model-free-rl deep-neural-networks deeplearning deep-q-network deep-q-learning deep-learning-algorithms deeplearning-ai reinforcement-agents
Language:Python 572
denisyarats / drq
DrQ: Data regularized Q
rl reinforcement-learning deep-learning mujoco dm-control gym pixel sac soft-actor-crit pytorch python actor-critic control drq deep-reinforcement-learning data-augmentation off-policy model-free
Language:Jupyter Notebook 407
MishaLaskin / rad
RAD: Reinforcement Learning with Augmented Data
reinforcement-learning rl deep-learning data-mujoc dm-control rad data-augmentations codebase model-free off-policy deep-reinforcement-learning deep-neural-networks deep-learning-algorithms deep-q-learning deep-q-network deeplearning-ai soft-actor-critic sac ppo
Language:Jupyter Notebook 401
TianhongDai / hindsight-experience-replay
PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.
hindsight-experience-replay ddpg reinforcement-learning off-policy exploration pytorch-implmention her
Language:Python 400
instadeepai / flashbax
⚡ Flashbax: Accelerated Replay Buffers in JAX
buffers hpc jax machine-learning off-policy reinforcement-learning rl
Language:Python 213
pokaxpoka / sunrise
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
reinforcement-learning rl deep-learning mujoco dm-control codebase model-free off-policy deep-reinforcement-learning deep-neural-networks deep-q-learning deep-q-network soft-actor-critic sac rainbow
Language:Python 119
denisyarats / exorl
ExORL: Exploratory Data for Offline Reinforcement Learning
control datasets deep-learning exporation model-free mujoco off-policy offline-rl python pytorch reinforcement-learning unsupevised
Language:Python 105
zhihanyang2022 / off-policy-continuous-control
Official PyTorch code for "Recurrent Off-policy Baselines for Memory-based Continuous Control" (DeepRL Workshop, NeurIPS 21)
pytorch recurrent-neural-network actor-critic off-policy continuous-control reinforcement-learning rdpg rtd3 rsac
Language:Python 79
MohammadAsadolahi / Reinforcement-Learning-solving-a-simple-4by4-Gridworld-using-Qlearning-in-python
Solving a simple 4x4 Gridworld, similar to OpenAI Gym's FrozenLake, using Q-learning, a temporal-difference reinforcement learning method.
off-policy qlearning qlearning-on-gridworld reinforcement-learning
Language:Jupyter Notebook 14
lionelblonde / sam-pytorch
PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
gail gan imitation-learning off-policy pytorch reinforcement-learning
Language:Python 12
baturaysaglam / LA3P
Actor Prioritized Experience Replay
actor-critic deep-reinforcement-learning prioritized-experience-replay off-policy
Language:Python 11
Rosefintech / Rosefintech-RosefinAIEngine
RosefinAIEngine of Rosefintech
ai engine off-policy tensorflow
Language:Python 11
lionelblonde / sam-tf
TensorFlow implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
gail imitation-learning reinforcement-learning tensorflow gan off-policy
Language:Python 10
baturaysaglam / AC-Off-POC
Off-Policy Correction for Actor-Critic Algorithms in Deep Reinforcement Learning
actor-critic deep-reinforcement-learning experience-replay importance-sampling off-policy
Language:Python 9
bmaxdk / DeepRL-ND-Continuous-Control
DDPG and D4PG Continuous Control
ddpg-algorithm deep-reinforcement-learning openai-gym unity d4pg-algorithm model-free off-policy pytorch
Language:ASP.NET 7
amirhosein-mesbah / Reinforcement_learning
This repository contains the implementation of a wide variety of Reinforcement Learning Projects in different applications of Bandit Algorithms, MDPs, Distributed RL and Deep RL. These projects include university projects and projects implemented due to interest in Reinforcement Learning.
bandit-algorithms deep-reinforcement-learning deeprl distributed-reinforcement-learning mdp multi-agent-reinforcement-learning network-routing off-policy on-policy reinforcement-learning gym stablebaselines3 q-learning
Language:Jupyter Notebook 5
narjesno / Reinforcement-Learning
This repository contains all of the Reinforcement Learning-related projects I've worked on. The projects are part of the graduate course at the University of Tehran.
dynamic-programming off-policy on-policy model-free-rl model-based-rl monte-carlo sarsa n-step-bootstrapping n-step-expected-sarsa n-step-tree-backup policy-iteration ucb-algorithm double-q-learning n-armed-bandit-problem policy-gradient epsilon-greedy
Language:HTML 5
cbanerji / Sample_efficient_RL.
Collection of codes pertaining to my research in model-free RL algorithms.
ddpg model-free-rl off-policy sample-efficient-rl td3 soft-actor-critic
Language:Python 3
HYDesmondLiu / RUBICON
A novel method to combine an existing policy (rule-based control) with reinforcement learning.
climate-change deep-learning deep-reinforcement-learning energy-efficiency hvac-control machine-learning optimal-control optimization reinforcement-learning-algorithms thermal-comfort actor-critic-algorithm deterministic-policy-gradients off-policy rule-based-controller reinforcement-learning
Language:Python 3
lionelblonde / liayn-pytorch
PyTorch implementation of our work: "Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning"
gail gan imitation-learning off-policy pytorch reinforcement-learning
Language:Python 3
baturaysaglam / SWTD3
Stochastic Weighted Twin Delayed Deep Deterministic Policy Gradient (SWTD3)
actor-critic deep-reinforcement-learning off-policy reinforcement-learning-algorithms
Language:Python 2
DjAzDeck / SPG
Sample Policy Gradient
action actor-critic algorithm continuous control deep deterministic learning model-free off-policy optimization policy reinforcement
Language:Python 2
SaminYeasar / off_policy_ac
PyTorch implementations of off-policy actor-critic algorithms
actor-critic ddpg mujoco off-policy pytorch reinforcement-learning sac td3
Language:Python 2
SaminYeasar / PyTorch-implementation-DICE-algorithms
PyTorch-implementation-DICE-algorithms
algeadice imitation-learning off-policy pytorch rl valuedice
Language:Python 2
TheUnsolvedDev / ReinforcementLearning
Repository containing basic algorithms implemented in Python.
reinforcement-learning algorithm policy-iteration policy-evaluation bandit-algorithms monte-carlo off-policy on-policy
Language:Jupyter Notebook 2
zZhiG / RLPD-PyTorch
off-policy algorithm utilizing offline and online data
off-policy offline-data online-data reinforcement-learning
Language:Python 2
baturaysaglam / DASE
Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms
actor-critic deep-reinforcement-learning experience-replay multi-agent-reinforcement-learning off-policy
Language:Python 1
baturaysaglam / Q-Error-Exploration
An Optimistic Approach to the Q-Network Error in Actor-Critic Methods
actor-critic deep-reinforcement-learning experience-replay off-policy exploration-exploitation
Language:Python 1
fardinabbasi / Tabulated_RL
Contains a custom-built reinforcement learning environment and implementations of key RL algorithms such as Q-learning and SARSA, tested in scenarios such as a drone navigation challenge and the Frozen Lake environment.
grid-world markov-decision-processes mdp off-policy on-policy q-learning sarsa tree-backup value-iteration
Language:Jupyter Notebook 1
Kalyani011 / RL-Q_Learning_Implementation
Temporal Difference Method - Q-Learning Implementation for FrozenLake Grid Problem
off-policy q-learning reinforcement-learning temporal-differencing-learning value-based
Language:Jupyter Notebook 1
mabirck / CS294-DeepRL
My coursework for the CS294 Deep Reinforcement Learning course, conducted by Sergey Levine at UC Berkeley.
reinforcement-learning cs294 deep-learning neural-networks reinforcement policy-gradient on-policy off-policy deep-reinforcement-learning deep-neural-networks pytorch pytorch-tutorials
Language:Python 1
NUS-LID / RENAULT
Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning
auxiliary-tasks data-efficient-learning deep-learning deep-q-learning deep-reinforcement-learning deep-rl ensemble-learning model-free-rl multi-task-learning off-policy
Language:Python 1
lionelblonde / giwr-pytorch
PyTorch implementation of our work: "Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning"
imitation-learning off-policy offline pytorch reinforcement-learning
Language:Python 0
raja-grewal / rlmd
PROJECT MIGRATED TO CODEBERG - Reinforcement Learning in Multiplicative Domains
artificial-intelligence deep-reinforcement-learning energy-efficiency extreme-value-statistics gym model-free-rl python risk-management trading-algorithms ergodicity geometric-brownian-motion law-of-large-numbers pytorch q-learning reinforcement-learning tail-estimation target-tracking off-policy sac td3
0
lionelblonde / giwr-pytorch-complete-history
PyTorch implementation of our work: "Where is the Grass Greener? Revisiting Generalized Policy Iteration for Offline Reinforcement Learning"
imitation-learning off-policy offline pytorch reinforcement-learning
Language:Python
lionelblonde / sam-pytorch-complete-history
PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
gail gan imitation-learning off-policy pytorch reinforcement-learning
Language:Python
