Reda ALAMI's repositories

A3C2-in-TensorFlow-2

Implementation of the Asynchronous Advantage Actor-Critic with Communication (A3C2) algorithm in TensorFlow 2

Language: Python · Stargazers: 1 · Issues: 2 · Issues: 0

AirSim

Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

Language: C++ · License: NOASSERTION · Stargazers: 0 · Issues: 1 · Issues: 0

arena

DIAMBRA Arena

Language: Lua · License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

awesome-generative-ai-guide

A one-stop repository for generative AI research updates, interview resources, notebooks, and much more!

License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

buffer-of-thought-llm

[NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

DRL-code-pytorch

Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, and SAC.

Language: Python · License: MIT · Stargazers: 0 · Issues: 1 · Issues: 0

GRID-playground

Platform for General Robot Intelligence Development

Language: Python · License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

gym-pybullet-drones

PyBullet Gym environments for single and multi-agent reinforcement learning of quadcopter control

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

jumanji

🌴 A Suite of Industry-Driven Hardware-Accelerated RL Environments written in JAX

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

llm-classifier

Classify data instantly using an LLM

License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

LLMs-Finetuning-Safety

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

mab2rec

[AAAI 2024] Mab2Rec: Multi-Armed Bandits Recommender

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0 · Issues: 0

ol-ems

Online learning algorithm for microgrid energy management based on MPC

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING

This project is a part of my Data Science Internship at Technocolabs Softwares.

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0 · Issues: 0

RL-Bitcoin-trading-bot

An attempt to create a Bitcoin trading bot powered by reinforcement learning

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

RL4RS

A Real-World Benchmark for Reinforcement Learning based Recommender System

Language: Python · License: CC-BY-SA-4.0 · Stargazers: 0 · Issues: 0 · Issues: 0

roerich

Roerich is a Python library of change point detection algorithms for time series.

Language: Python · License: BSD-2-Clause · Stargazers: 0 · Issues: 0 · Issues: 0

safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

Safety-J

Safety-J: Evaluating Safety with Critique

Stargazers: 0 · Issues: 0 · Issues: 0

SPIN_official

The official implementation of Self-Play Fine-Tuning (SPIN)

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

TradeMaster

TradeMaster is an open-source platform for quantitative trading empowered by reinforcement learning 🔥 ⚡ 🌈

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

trl

Train transformer language models with reinforcement learning.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

unsloth

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0