Dongsub Shim's starred repositories
alignment-handbook
Robust recipes to align language models with human and AI preferences
WeightWatcher
The WeightWatcher tool for predicting the accuracy of Deep Neural Networks
flash-attention
Fast and memory-efficient exact attention
causal-text-papers
Curated research at the intersection of causal inference and natural language processing.
open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
LLM-Adapters
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
awesome-totally-open-chatgpt
A list of totally open alternatives to ChatGPT
Task-Oriented-Dialogue-Research-Progress-Survey
A survey of datasets and methods for task-oriented dialogue, including recent datasets and SOTA leaderboards.
toolformer-pytorch
Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI
EconML
ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
MT-Evaluation
Machine Translation (MT) Evaluation Scripts
Awesome-Simultaneous-Translation
A paper list on simultaneous (streaming) translation, covering text-to-text machine translation and speech-to-text translation.
MT-Reading-List
A machine translation reading list maintained by Tsinghua Natural Language Processing Group
promptsource
Toolkit for creating, sharing and using natural language prompts.