MiaoLu3

Miao Lu's starred repositories

Diffusion-Policies-for-Offline-RL

Language: Python · Apache-2.0 · 235 · 0 · 0

Diff4RLSurvey

This repository contains a collection of resources and papers on Diffusion Models for RL, accompanying the paper "Diffusion Models for Reinforcement Learning: A Survey"

Apache-2.0 · 351 · 0 · 0

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python · Apache-2.0 · 1936 · 0 · 0

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python · Apache-2.0 · 4315 · 0 · 0

decision-pretrained-transformer

Implementation of the Decision-Pretrained Transformer (DPT) from the paper "Supervised Pretraining Can Learn In-Context Reinforcement Learning".

Language: Python · 35 · 0 · 0

trl

Train transformer language models with reinforcement learning.

Language: Python · Apache-2.0 · 8947 · 0 · 0

in-context-learning

Language: Jupyter Notebook · MIT · 182 · 0 · 0

cs324_p2

Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)

Language: Python · MIT · 101 · 0 · 0

awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

Apache-2.0 · 3133 · 0 · 0

awesome-causality-data

A data index for learning causality.

MIT · 421 · 0 · 0

NeuralCausalModels

Neural Causal Model (NCM) implementation by the authors of "The Causal-Neural Connection".

Language: Python · MIT · 17 · 0 · 0

MEX

Language: Python · 10 · 0 · 0

ccxt

A JavaScript / TypeScript / Python / C# / PHP cryptocurrency trading API with support for more than 100 bitcoin/altcoin exchanges

Language: Python · MIT · 32177 · 0 · 0

fullbatchtraining

Training vision models with full-batch gradient descent and regularization

Language: Python · LGPL-2.1 · 37 · 0 · 0

edge-of-stability

Language: Python · 54 · 0 · 0

LMCTS

Language: Python · 10 · 0 · 0

RL-for-Markov-Exchange-Economy

Code for the ICML 2022 paper *Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy*.

Language: Jupyter Notebook · MIT · 6 · 0 · 0

FinRL

FinRL: Financial Reinforcement Learning. 🔥

Language: Jupyter Notebook · MIT · 9510 · 0 · 0

ecole

Extensible Combinatorial Optimization Learning Environments

Language: C++ · BSD-3-Clause · 312 · 0 · 0

RL-SCPO

Code for the paper *Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization*.

Language: Python · 14 · 0 · 0

learn2branch-ecole

Reimplementation of "Exact Combinatorial Optimization with Graph Convolutional Neural Networks" (NeurIPS 2019)

Language: Python · MIT · 29 · 0 · 0

cs-self-learning

A self-study guide to computer science

Language: HTML · MIT · 54266 · 0 · 0

neural-tangents

Fast and Easy Infinite Neural Networks in Python

Language: Jupyter Notebook · Apache-2.0 · 2256 · 0 · 0