Reda ALAMI's repositories
A3C2-in-TensorFlow-2
Implementation of the Asynchronous Advantage Actor Critic with Communication in TensorFlow 2
arena
DIAMBRA Arena
awesome-generative-ai-guide
A one-stop repository for generative AI research updates, interview resources, notebooks and much more!
buffer-of-thought-llm
[NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
DRL-code-pytorch
Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, and SAC.
GRID-playground
Platform for General Robot Intelligence Development
gym-pybullet-drones
PyBullet Gym environments for single and multi-agent reinforcement learning of quadcopter control
jumanji
🌴 A Suite of Industry-Driven Hardware-Accelerated RL Environments written in JAX
llm-classifier
Classify data instantly using an LLM
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
mab2rec
[AAAI 2024] Mab2Rec: Multi-Armed Bandits Recommender
ol-ems
Online learning algorithm for microgrid energy management based on MPC
OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING
This project is a part of my Data Science Internship at Technocolabs Softwares.
RL-Bitcoin-trading-bot
An attempt to create a Reinforcement Learning powered Bitcoin trading bot
RL4RS
A Real-World Benchmark for Reinforcement Learning based Recommender System
roerich
Roerich is a Python library of change point detection algorithms for time series.
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Safety-J
Safety-J: Evaluating Safety with Critique
SPIN_official
The official implementation of Self-Play Fine-Tuning (SPIN)
TradeMaster
TradeMaster is an open-source platform for quantitative trading empowered by reinforcement learning 🔥 ⚡ 🌈
trl
Train transformer language models with reinforcement learning.
unsloth
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory