LAKan233

0 followers, 0 following

LAKan233's starred repositories

dash-sample-apps

Open-source demos hosted on Dash Gallery

Language: Jupyter Notebook | License: MIT | Stargazers: 3108 | Issues: 0

prm800k

800,000 step-level correctness labels on LLM solutions to MATH problems

Language: Python | License: MIT | Stargazers: 1369 | Issues: 0

cleanrl

High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Language: Python | License: NOASSERTION | Stargazers: 5073 | Issues: 0

eipo

Official codebase for Redeeming Intrinsic Rewards via Constrained Policy Optimization

Language: Python | License: MIT | Stargazers: 75 | Issues: 0

awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

License: Apache-2.0 | Stargazers: 3153 | Issues: 0

MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing continued pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.

Language: Python | License: Apache-2.0 | Stargazers: 3099 | Issues: 0

CV_Interview

I hope this repo can help you a lot!

Stargazers: 1165 | Issues: 0

llama-pipeline-parallel

A prototype repo for hybrid training with pipeline parallelism and distributed data parallelism, with comments on the core code snippets. Feel free to copy the code and start discussions about any problems you have encountered.

Language: Python | Stargazers: 44 | Issues: 0

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 29084 | Issues: 0

RL4LMs

A modular RL library to fine-tune language models to human preferences

Language: Python | License: Apache-2.0 | Stargazers: 2150 | Issues: 0

PPOxFamily

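PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons to help you sort out the algorithm theory, straighten out the code logic, and master practical decision AI applications)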

Language: Python | License: Apache-2.0 | Stargazers: 1868 | Issues: 0

MAAC

Code for "Actor-Attention-Critic for Multi-Agent Reinforcement Learning" ICML 2019

Language: Python | License: MIT | Stargazers: 656 | Issues: 0

moyu

🐟 Slack off online to relieve stress. Have you slacked off today?

Language: JavaScript | License: MIT | Stargazers: 340 | Issues: 0
Language: Python | Stargazers: 78 | Issues: 0

Human-Fall-Detection

Human Falling Detection

Language: Python | Stargazers: 43 | Issues: 0

pytorch-maddpg

A PyTorch implementation of MADDPG (multi-agent deep deterministic policy gradient)

Language: Python | Stargazers: 602 | Issues: 0

light_mappo

Lightweight version of MAPPO to help you quickly migrate to your local environment.

Language: Python | Stargazers: 445 | Issues: 0

t5-pegasus-chinese

Summarization and coreference resolution based on Google's T5 Chinese generative model, with support for batch generation and multiprocessing

Language: Python | License: MIT | Stargazers: 211 | Issues: 0

longformer-chinese

Chinese version of Longformer

Language: Python | Stargazers: 107 | Issues: 0

dodrio

Exploring attention weights in transformer-based models with linguistic knowledge.

Language: Svelte | License: MIT | Stargazers: 340 | Issues: 0

pytorch-A3C

Simple A3C implementation with PyTorch + multiprocessing

Language: Python | License: MIT | Stargazers: 601 | Issues: 0
Language: Python | Stargazers: 2 | Issues: 0

rl-bsmodel-with-costs

Option hedging strategies are investigated using two reinforcement learning algorithms: deep Q network and deep deterministic policy gradient.

Language: Jupyter Notebook | Stargazers: 18 | Issues: 0

QuantResearch

Quantitative analysis, strategies and backtests

Language: Jupyter Notebook | License: MIT | Stargazers: 1872 | Issues: 0

abu

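Abu quantitative trading system (stocks, options, futures, Bitcoin, machine learning): an open-source, Python-based framework for quantitative trading and investment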

Language: Python | License: GPL-3.0 | Stargazers: 11700 | Issues: 0

Options-Trading-Strategies-in-Python

Developing Options Trading Strategies using Technical Indicators and Quantitative Methods

Language: Python | Stargazers: 752 | Issues: 0

ElegantRL

Massively Parallel Deep Reinforcement Learning. 🔥

Language: Python | License: NOASSERTION | Stargazers: 3595 | Issues: 0

FinRL

FinRL: Financial Reinforcement Learning. 🔥

Language: Jupyter Notebook | License: MIT | Stargazers: 9537 | Issues: 0

Keras-GAN

Keras implementations of Generative Adversarial Networks.

Language: Python | License: MIT | Stargazers: 9169 | Issues: 0

gan

Tooling for GANs in TensorFlow

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 926 | Issues: 0