Yejin Cho's starred repositories
openai-cookbook
Examples and guides for using the OpenAI API
text-generation-inference
Large Language Model Text Generation Inference
contextualized-topic-models
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
KICE_slayer_AI_Korean
An AI taking on the top grade (Grade 1) in the Korean language section of the CSAT (Suneung)
awesome-korean-llm
Awesome list of Korean Large Language Models.
gutenberg-poetry-corpus
A corpus of poetry from Project Gutenberg
propbank-release
The official released annotations, both in .prop pointer format and as CoNLL files. Does not contain the source texts.
propbank-frames
Lexicon of frame files used by Propbank annotation. A searchable, readable version of the latest release is here: http://propbank.github.io/v3.4.0/frames/
riveter-nlp
Package to extract connotation frames
SWOWEN-2018
English Small World of Words SWOWEN-2018
backpacks-flash-attn
The original Backpack Language Model implementation, a fork of FlashAttention
FairytaleQAData
A dataset of over 10,000 question and answer pairs written for storybooks.
knowledge_distillation
Repository for "Propagating Knowledge Updates to LMs Through Distillation" (NeurIPS 2023).
nanoBackpackLM
The simplest repository for training medium-sized BackpackLM for cs224n
features_in_context
Predict psycholinguistic feature norms for words in context.
semantic-norms
Code for: Visualizing the Obvious: A Concreteness-based Ensemble Model for Noun Property Prediction
essentialism_in_llms
Materials for the paper "You are what you're for: Essentialist categorization in large language models" by Siying Zhang, Jingyuan She, Tobias Gerstenberg and David Rose.
StorySettings
This repository contains the dataset described in the forthcoming "Story Settings: A Dataset" in the 5th Workshop on Narrative Understanding at ACL 2023.