Yiheng Shu's starred repositories
factoid-wiki
Dense X Retrieval: What Retrieval Granularity Should We Use?
ret-robust
Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context"
Semantic-Retrieval-Models
A curated list of awesome papers for Semantic Retrieval (TOIS Accepted: Semantic Models for the First-stage Retrieval: A Comprehensive Review).
parallelformers
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
stanford-openie-python
Stanford Open Information Extraction made simple!
FriendsDontLetFriends
Friends don't let friends make certain types of data visualization: what they are and why they are bad.
TableLlama
[NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables".
sub-sentence-encoder
The official code repo for "Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations".
OpenBookQA
Code for experiments on OpenBookQA from the EMNLP 2018 paper "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering"
generative_agents
Generative Agents: Interactive Simulacra of Human Behavior
awesome-openai-vision-api-experiments
Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥