LAKan233

LAKan233's starred repositories

dash-sample-apps

Open-source demos hosted on Dash Gallery

Language: Jupyter Notebook · MIT · 310800

prm800k

800,000 step-level correctness labels on LLM solutions to MATH problems

Language: Python · MIT · 136900

cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Language: Python · NOASSERTION · 507300

eipo

Official codebase for Redeeming Intrinsic Rewards via Constrained Policy Optimization

Language: Python · MIT · 7500

awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

Apache-2.0 · 315300

MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.

Language: Python · Apache-2.0 · 309900

CV_Interview

I hope this repo can help you a lot!

llama-pipeline-parallel

A prototype repo for hybrid training with pipeline parallelism and distributed data parallelism, with comments on core code snippets. Feel free to copy code and launch discussions about the problems you have encountered.

Language: Python · 4400

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python · Apache-2.0 · 2908400

RL4LMs

A modular RL library to fine-tune language models to human preferences

Language: Python · Apache-2.0 · 215000

PPOxFamily

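PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: eight lessons to help you work through the algorithm theory, untangle the code logic, and put decision AI into practice)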

Language: Python · Apache-2.0 · 186800

MAAC

Code for "Actor-Attention-Critic for Multi-Agent Reinforcement Learning" ICML 2019

Language: Python · MIT · 65600

moyu

🐟 Slack off online to relieve stress. Have you slacked off today?

Language: JavaScript · MIT · 34000

DeepTrader

Language: Python · 7800

Human-Fall-Detection

Human Falling Detection

Language: Python · 4300

pytorch-maddpg

A PyTorch implementation of MADDPG (multi-agent deep deterministic policy gradient)

Language: Python · 60200

light_mappo

Lightweight version of MAPPO to help you quickly migrate to your local environment.

Language: Python · 44500

t5-pegasus-chinese

Summarization and coreference resolution based on Google's T5 Chinese generative model, with support for batch generation and multiprocessing

Language: Python · MIT · 21100

longformer-chinese

Chinese version of Longformer

Language: Python · 10700

dodrio

Exploring attention weights in transformer-based models with linguistic knowledge.

Language: Svelte · MIT · 34000

pytorch-A3C

Simple A3C implementation with PyTorch + multiprocessing

Language: Python · MIT · 60100

STOCKS_TRADING_RL

Language: Python · 200

rl-bsmodel-with-costs

Option hedging strategies are investigated using two reinforcement learning algorithms: deep Q network and deep deterministic policy gradient.

Language: Jupyter Notebook · 1800

QuantResearch

Quantitative analysis, strategies and backtests

Language: Jupyter Notebook · MIT · 187200

abu

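Abu quantitative trading system (stocks, options, futures, Bitcoin, machine learning): an open-source, Python-based framework for quantitative trading and quantitative investment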

Language: Python · GPL-3.0 · 1170000

Options-Trading-Strategies-in-Python

Developing Options Trading Strategies using Technical Indicators and Quantitative Methods

Language: Python · 75200

ElegantRL

Massively Parallel Deep Reinforcement Learning. 🔥

Language: Python · NOASSERTION · 359500

FinRL

FinRL: Financial Reinforcement Learning. 🔥

Language: Jupyter Notebook · MIT · 953700

Keras-GAN

Keras implementations of Generative Adversarial Networks.

Language: Python · MIT · 916900

gan

Tooling for GANs in TensorFlow

Language: Jupyter Notebook · Apache-2.0 · 92600