NagisaZj

NagisaZj's repositories

bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

ContextWM

Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://arxiv.org/abs/2305.18499

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

diffusion_policy

[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

diffusion_reward

[arXiv'23] Official implementation of the paper "Diffusion Reward: Learning Rewards via Conditional Video Diffusion"

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

dreamerv3

Mastering Diverse Domains through World Models

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

DrM

DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diverse domains.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

Graphormer

Graphormer is a general-purpose deep learning backbone for molecular modeling.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

HIQL

HIQL: Offline Goal-Conditioned RL with Latent States as Actions (NeurIPS 2023)

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

hypnettorch

Package for working with hypernetworks in PyTorch.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

icl-alignment

Is In-Context Learning Sufficient for Instruction Following in LLMs?

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

MiniGPT-4

Open-source code for MiniGPT-4 and MiniGPT-v2

License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

opro

Official code for "Large Language Models as Optimizers"

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

universal_manipulation_interface

Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

viper_rl

Using advances in generative modeling to learn reward functions from unlabeled videos.

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0