Vance0124's repositories

Token-level-Direct-Preference-Optimization

Reference implementation for Token-level Direct Preference Optimization (TDPO)

Language: Python | License: Apache-2.0 | Stargazers: 74 | Issues: 1 | Issues: 3

RL-Solutions

Exercise solutions and code examples for Reinforcement Learning, second edition

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 2 | Issues: 0 | Issues: 0

DexterousHands

A library that provides dual dexterous hand manipulation tasks through Isaac Gym

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

EvalAI

Evaluating state of the art in AI

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 0 | Issues: 0

ucas-beamer

UCAS Beamer (LaTeX)

Language: TeX | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

ChatPaper

Use ChatGPT to summarize arXiv papers.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

google-research

Google Research

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

reinforcement-learning-an-introduction

Python Implementation of Reinforcement Learning: An Introduction

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0