Beast code in Giters

Gu Pengjie's starred repositories

outer-value-function-meta-rl

Code of the paper: Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function

Language: Jupyter Notebook | 1300

efficient-kan

An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).

Language: Python | License: MIT | 361900

SimPO

SimPO: Simple Preference Optimization with a Reference-Free Reward

Language: Python | License: MIT | 54200

f-divergence-dpo

Direct preference optimization with f-divergences.

Language: Python | License: Apache-2.0 | 900

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | 424700

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents in acing any computer task by enabling strong reasoning abilities, self-improvement, and skill curation in a standardized general environment with minimal requirements.

Language: Python | License: MIT | 120400

Academic-project-page-template

A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/

Language: JavaScript | 155200

cpl

Code for Contrastive Preference Learning (CPL)

Language: Python | License: MIT | 14200

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python | License: Apache-2.0 | 188900

Synapse

[ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control

Language: HTML | License: MIT | 4100

video-subtitle-extractor

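Extracts hard-coded subtitles from videos and generates srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files.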

Language: Python | License: Apache-2.0 | 535000

emnlp22-re3-story-generation

Language: Python | License: MIT | 24100

LLM-Agents-Papers

A repo listing papers related to LLM-based agents

Language: Python | 85500

lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences

Language: Python | License: MIT | 117500

d3rlpy

An offline deep reinforcement learning library

Language: Python | License: MIT | 126100

Awesome-LLM-for-RecSys

Survey: A collection of AWESOME papers and resources on the large language model (LLM) related recommender system topics.

License: MIT | 85500

oprl

Official Codebase for TMLR 2023, Benchmarks and Algorithms for Offline Preference-Based Reward Learning

Language: Python | License: MIT | 1600

AlignLLMHumanSurvey

Aligning Large Language Models with Human: A Survey

64300

tsne-cuda

GPU Accelerated t-SNE for CUDA with Python bindings

Language: Cuda | License: BSD-3-Clause | 175400

Variational-Recurrent-Models

Code for the study "Variational Recurrent Models for Solving Partially Observable Control Tasks", published as a conference paper at ICLR 2020 (https://openreview.net/forum?id=r1lL4a4tDB)

Language: Python | License: MIT | 4900

VQ-VAE

Minimalist implementation of VQ-VAE in PyTorch

Language: Python | License: BSD-3-Clause | 48100

vqvae

VQ-VAE implementation in PyTorch, supporting EMA and Gumbel training. Applicable to images and time series.

Language: Jupyter Notebook | 900

dreamerv2

PyTorch implementation of Dreamer-v2: a visual model-based RL algorithm.

Language: Python | License: MIT | 23400

dreamerv3-torch

Implementation of Dreamer v3 in PyTorch.

Language: Python | License: MIT | 35800

awesome-offline-rl

An index of algorithms for offline reinforcement learning (offline-rl)

88600

TradeMaster

TradeMaster is an open-source platform for quantitative trading empowered by reinforcement learning

Language: Jupyter Notebook | License: Apache-2.0 | 125700

unity-ml-agents-turret-defense

A reinforcement learning agent playing as the turret. Its goal is to let ten friendly units enter the base; it loses if an enemy unit enters the base or if two friendly units are shot.

Language: TeX | License: Apache-2.0 | 1600

stable-diffusion

A latent text-to-image diffusion model

Language: Jupyter Notebook | License: NOASSERTION | 6678200

OpenPSG

Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22

Language: Python | License: MIT | 40300

PyTorch-Pretrained-ViT

Vision Transformer (ViT) in PyTorch

Language: Python | 76100