kykim0

Geek Repo

Company: @google

Location: Bay Area / Seoul


Organizations
JuliaPOMDP
sisl
StanfordVL

kykim0's starred repositories

dreamer

Dream to Control: Learning Behaviors by Latent Imagination

Language: Python | License: Apache-2.0 | Stargazers: 596 | Issues: 0

nlp-bible-code

Practice materials for the book "Natural Language Processing Bible".

Language: Jupyter Notebook | Stargazers: 49 | Issues: 0

KULLM

☁️ Gureum (KULLM): an LLM specialized for Korean, developed at Korea University

License: Apache-2.0 | Stargazers: 546 | Issues: 0

SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

Language: Python | License: Apache-2.0 | Stargazers: 901 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 1098 | Issues: 0

data-selection-survey

A Survey on Data Selection for Language Models

License: CC0-1.0 | Stargazers: 119 | Issues: 0

LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Language: Jupyter Notebook | License: MIT | Stargazers: 286 | Issues: 0

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 11778 | Issues: 0

calibration-framework

The net:cal calibration framework is a Python 3 library for measuring and mitigating miscalibration of uncertainty estimates, e.g., by a neural network.

Language: Python | License: Apache-2.0 | Stargazers: 327 | Issues: 0

captum

Model interpretability and understanding for PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 4720 | Issues: 0
Language: Python | Stargazers: 436 | Issues: 0

OpenFE

OpenFE: automated feature generation with expert-level performance

Language: Python | License: MIT | Stargazers: 707 | Issues: 0

evolutionary-model-merge

Official repository of Evolutionary Optimization of Model Merging Recipes

Language: Python | License: Apache-2.0 | Stargazers: 1106 | Issues: 0

reward-bench

RewardBench: the first evaluation tool for reward models.

Language: Python | License: Apache-2.0 | Stargazers: 294 | Issues: 0
Language: Python | License: MIT | Stargazers: 3981 | Issues: 0

awesome-llm-human-preference-datasets

A curated list of human preference datasets for LLM fine-tuning, RLHF, and evaluation.

License: MIT | Stargazers: 280 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 284 | Issues: 0

stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Language: Python | License: MIT | Stargazers: 8410 | Issues: 0

tmux-resurrect

Persists tmux environment across system restarts.

Language: Shell | License: MIT | Stargazers: 10968 | Issues: 0

alpaca-lora

Instruct-tune LLaMA on consumer hardware

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 18419 | Issues: 0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 35786 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 23172 | Issues: 0

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 4233 | Issues: 0

ppo-implementation-details

The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"

Language: Python | License: NOASSERTION | Stargazers: 585 | Issues: 0

tikzplotlib

📊 Save matplotlib figures as TikZ/PGFplots for smooth integration into LaTeX.

Language: Python | License: MIT | Stargazers: 2356 | Issues: 0
Language: Python | License: MIT | Stargazers: 2459 | Issues: 0

alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Language: Python | License: Apache-2.0 | Stargazers: 741 | Issues: 0

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 34899 | Issues: 0

DirectBehaviorSpecification

Code to reproduce the Arena environment experiments from "Direct Behavior Specification via Constrained Reinforcement Learning".

Language: ASP.NET | License: NOASSERTION | Stargazers: 19 | Issues: 0

flax

Flax is a neural network library for JAX that is designed for flexibility.

Language: Python | License: Apache-2.0 | Stargazers: 5820 | Issues: 0