Miao Lu (MiaoLu3)

Company: Stanford University

Location: Stanford, CA

Home Page: miaolu3.github.io

Miao Lu's starred repositories

Language: Python | License: Apache-2.0 | Stargazers: 235 | Issues: 0

Diff4RLSurvey

A collection of resources and papers on diffusion models for RL, accompanying the paper "Diffusion Models for Reinforcement Learning: A Survey".

License: Apache-2.0 | Stargazers: 351 | Issues: 0

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python | License: Apache-2.0 | Stargazers: 1936 | Issues: 0

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 4315 | Issues: 0

decision-pretrained-transformer

Implementation of the Decision-Pretrained Transformer (DPT) from the paper "Supervised Pretraining Can Learn In-Context Reinforcement Learning".

Language: Python | Stargazers: 35 | Issues: 0

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 8947 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 182 | Issues: 0

cs324_p2

Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)

Language: Python | License: MIT | Stargazers: 101 | Issues: 0

awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

License: Apache-2.0 | Stargazers: 3133 | Issues: 0

awesome-causality-data

A data index for learning causality.

License: MIT | Stargazers: 421 | Issues: 0

NeuralCausalModels

Neural Causal Model (NCM) implementation by the authors of "The Causal-Neural Connection".

Language: Python | License: MIT | Stargazers: 17 | Issues: 0
Language: Python | Stargazers: 10 | Issues: 0

ccxt

A JavaScript / TypeScript / Python / C# / PHP cryptocurrency trading API with support for more than 100 bitcoin/altcoin exchanges

Language: Python | License: MIT | Stargazers: 32177 | Issues: 0

fullbatchtraining

Training vision models with full-batch gradient descent and regularization

Language: Python | License: LGPL-2.1 | Stargazers: 37 | Issues: 0
Language: Python | Stargazers: 54 | Issues: 0
Language: Python | Stargazers: 10 | Issues: 0

RL-for-Markov-Exchange-Economy

Code for the ICML 2022 paper "Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy".

Language: Jupyter Notebook | License: MIT | Stargazers: 6 | Issues: 0

FinRL

FinRL: Financial Reinforcement Learning. 🔥

Language: Jupyter Notebook | License: MIT | Stargazers: 9510 | Issues: 0

ecole

Extensible Combinatorial Optimization Learning Environments

Language: C++ | License: BSD-3-Clause | Stargazers: 312 | Issues: 0

RL-SCPO

Code for the paper "Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization".

Language: Python | Stargazers: 14 | Issues: 0

learn2branch-ecole

Reimplementation of "Exact Combinatorial Optimization with Graph Convolutional Neural Networks" (NeurIPS 2019)

Language: Python | License: MIT | Stargazers: 29 | Issues: 0

cs-self-learning

A self-study guide to computer science

Language: HTML | License: MIT | Stargazers: 54266 | Issues: 0

neural-tangents

Fast and Easy Infinite Neural Networks in Python

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2256 | Issues: 0