ChTauchmann's starred repositories
llama-recipes
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A, and a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama 3 for WhatsApp & Messenger.
FlagEmbedding
Retrieval and Retrieval-augmented LLMs
long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
LLMTest_NeedleInAHaystack
Doing simple retrieval from LLMs at various context lengths to measure long-context accuracy
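The core of a needle-in-a-haystack test is simple: bury a known fact at a chosen depth inside filler context, then ask the model to retrieve it. A minimal sketch of the prompt-construction step (function and parameter names are illustrative, not the repository's actual API):

```python
def build_haystack_prompt(needle, filler_sentences, depth_pct, question):
    """Insert `needle` at roughly depth_pct% into the filler context
    and append a retrieval question. Illustrative sketch only."""
    idx = round(len(filler_sentences) * depth_pct / 100)
    context = filler_sentences[:idx] + [needle] + filler_sentences[idx:]
    return " ".join(context) + "\n\nQuestion: " + question

# Example: place the needle halfway into a 20-sentence haystack.
filler = ["The quick brown fox jumps over the lazy dog."] * 20
prompt = build_haystack_prompt(
    "The secret code is 42.", filler, 50, "What is the secret code?"
)
```

Sweeping `depth_pct` and the filler length, then scoring the model's answers, yields the accuracy-vs-context-length grid the repo is known for.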
Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.
TransformerLens
A library for mechanistic interpretability of GPT-style language models
Causality4NLP_Papers
A reading list for papers on causality for natural language processing (NLP)
landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
task_vectors
Editing Models with Task Arithmetic
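Task arithmetic edits a model purely in weight space: a task vector is the element-wise difference between fine-tuned and pretrained weights, and adding (or subtracting) scaled task vectors steers the model toward (or away from) tasks. A minimal sketch over flat weight lists, under the assumption that real use operates on full model state dicts:

```python
def task_vector(pretrained, finetuned):
    # tau = theta_finetuned - theta_pretrained, element-wise
    return [f - p for p, f in zip(pretrained, finetuned)]

def apply_task_vectors(pretrained, task_vectors, scale=1.0):
    # theta_new = theta_pretrained + scale * sum(task vectors);
    # a negative scale "forgets" a task (task negation)
    merged = list(pretrained)
    for tv in task_vectors:
        merged = [w + scale * t for w, t in zip(merged, tv)]
    return merged

# Example: blend one task vector at half strength.
pre = [0.0, 1.0]
ft = [1.0, 3.0]
tv = task_vector(pre, ft)          # [1.0, 2.0]
half = apply_task_vectors(pre, [tv], scale=0.5)  # [0.5, 2.0]
```

The scaling coefficient is the single tunable knob; the paper selects it on a held-out validation set.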
soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
ACL2023-Retrieval-LM.github.io
https://acl2023-retrieval-lm.github.io/
belief-localization
This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Can Be Injected in Language Models."