Gu Pengjie's starred repositories
outer-value-function-meta-rl
Code for the paper: Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function
efficient-kan
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
f-divergence-dpo
Direct preference optimization with f-divergences.
alignment-handbook
Robust recipes to align language models with human and AI preferences
Academic-project-page-template
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
video-subtitle-extractor
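A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.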
LLM-Agents-Papers
A repo listing papers related to LLM-based agents
lm-human-preferences
Code for the paper Fine-Tuning Language Models from Human Preferences
Awesome-LLM-for-RecSys
Survey: A collection of AWESOME papers and resources on large language model (LLM) related recommender system topics.
AlignLLMHumanSurvey
Aligning Large Language Models with Human: A Survey
Variational-Recurrent-Models
Code for the study "Variational Recurrent Models for Solving Partially Observable Control Tasks", published as a conference paper at ICLR 2020 (https://openreview.net/forum?id=r1lL4a4tDB)
dreamerv3-torch
Implementation of Dreamer v3 in PyTorch.
awesome-offline-rl
An index of algorithms for offline reinforcement learning (offline-rl)
TradeMaster
TradeMaster is an open-source platform for quantitative trading empowered by reinforcement learning
unity-ml-agents-turret-defense
A reinforcement learning agent that plays as the turret. Its goal is to let ten friendly units enter the base; it loses if an enemy unit enters the base or if two friendly units are shot.
stable-diffusion
A latent text-to-image diffusion model
PyTorch-Pretrained-ViT
Vision Transformer (ViT) in PyTorch