ChTauchmann's starred repositories

llama-recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps that showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 11092 | Issues: 91 | Issues: 301

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python | License: MIT | Stargazers: 6292 | Issues: 38 | Issues: 900

Verba

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

Language: Python | License: BSD-3-Clause | Stargazers: 5112 | Issues: 60 | Issues: 190

mergekit

Tools for merging pretrained large language models.

Language: Python | License: LGPL-3.0 | Stargazers: 4216 | Issues: 46 | Issues: 268

LongLoRA

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Language: Python | License: Apache-2.0 | Stargazers: 2567 | Issues: 12 | Issues: 170

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2062 | Issues: 33 | Issues: 81

langserve

LangServe 🦜️🏓

Language: JavaScript | License: NOASSERTION | Stargazers: 1816 | Issues: 20 | Issues: 217

self-rag

This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.

Language: Python | License: MIT | Stargazers: 1661 | Issues: 17 | Issues: 78

pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Language: Python | License: Apache-2.0 | Stargazers: 1576 | Issues: 19 | Issues: 536

long_llama

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.

Language: Python | License: Apache-2.0 | Stargazers: 1443 | Issues: 26 | Issues: 24

LLMTest_NeedleInAHaystack

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 1357 | Issues: 12 | Issues: 25

Awesome-LLM-Reasoning

Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.

TransformerLens

A library for mechanistic interpretability of GPT-style language models

Language: Python | License: MIT | Stargazers: 920 | Issues: 13 | Issues: 192

gritlm

Generative Representational Instruction Tuning

Language: Jupyter Notebook | License: MIT | Stargazers: 493 | Issues: 8 | Issues: 42

Causality4NLP_Papers

A reading list for papers on causality for natural language processing (NLP)

tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.

Language: Python | License: Apache-2.0 | Stargazers: 445 | Issues: 10 | Issues: 89

landmark-attention

Landmark Attention: Random-Access Infinite Context Length for Transformers

Language: Python | License: Apache-2.0 | Stargazers: 401 | Issues: 40 | Issues: 15

task_vectors

Editing Models with Task Arithmetic

dpr-scale

Scalable training for dense retrieval models.

soft-moe-pytorch

Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch

Language: Python | License: MIT | Stargazers: 229 | Issues: 11 | Issues: 8

ACL2023-Retrieval-LM.github.io

https://acl2023-retrieval-lm.github.io/

OpenMatch

An Open-Source Package for Information Retrieval

Language: Python | License: MIT | Stargazers: 141 | Issues: 4 | Issues: 58

DallEval

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)

Language: Jupyter Notebook | License: MIT | Stargazers: 136 | Issues: 7 | Issues: 2

fneval

Functional Benchmarks and the Reasoning Gap

Language: TeX | License: GPL-3.0 | Stargazers: 73 | Issues: 1 | Issues: 7

belief-localization

This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Can Be Injected in Language Models."