mmarius's starred repositories

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python · License: Apache-2.0 · Stargazers: 15,321 · Issues: 991
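The core idea behind LoRA, one of the parameter-efficient methods PEFT implements, is to freeze a weight matrix W and learn only a low-rank update B @ A. A minimal pure-Python sketch of that idea (names and toy matrices are illustrative, not the PEFT library's API):

```python
# Sketch of the LoRA low-rank update: y = x @ (W + (alpha/r) * B @ A).
# W stays frozen; only the small factors A and B would be trained.

def matmul(a, b):
    """Multiply two matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_forward(x, W, A, B, alpha=16, r=2):
    """Forward pass through a LoRA-adapted linear layer (no bias)."""
    delta = matmul(B, A)                      # rank-r update, shape of W
    scale = alpha / r
    W_eff = [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
             for i in range(len(W))]
    return matmul(x, W_eff)

# As in LoRA, B starts at zero, so the adapted layer is initially
# identical to the frozen base layer.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5], [0.5, 0.5]]   # r x d_out
B = [[0.0, 0.0], [0.0, 0.0]]   # d_in x r, zero-initialised
x = [[2.0, 3.0]]
print(lora_forward(x, W, A, B))  # matches x @ W exactly at init
```

Only A and B carry gradients in practice, which is why the trainable parameter count drops by orders of magnitude for large W.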

mesh-transformer-jax

Model parallel transformers in JAX and Haiku

Language: Python · License: Apache-2.0 · Stargazers: 6,250 · Issues: 205

FriendsDontLetFriends

Friends don't let friends make certain types of data visualization: what they are and why they are bad.

text-to-text-transfer-transformer

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Language: Python · License: Apache-2.0 · Stargazers: 6,053 · Issues: 405

x-transformers

A simple but complete full-attention transformer with a set of promising experimental features from various papers

Language: Python · License: MIT · Stargazers: 4,448 · Issues: 203

OpenPrompt

An Open-Source Framework for Prompt-Learning.

Language: Python · License: Apache-2.0 · Stargazers: 4,259 · Issues: 256

SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Language: Python · License: MIT · Stargazers: 3,331 · Issues: 266
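SimCSE encodes the same sentence twice with different dropout masks and pulls the two embeddings together against in-batch negatives using an InfoNCE objective. A self-contained pure-Python sketch of that objective (toy vectors, not the repository's code):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchors, positives, temperature=0.05):
    """Average cross-entropy of matching each anchor to its own positive
    among all positives in the batch (in-batch negatives)."""
    loss = 0.0
    for i, a in enumerate(anchors):
        logits = [cosine(a, p) / temperature for p in positives]
        log_denom = math.log(sum(math.exp(l) for l in logits))
        loss += -(logits[i] - log_denom)
    return loss / len(anchors)

# Loss is near zero when each anchor matches its own positive,
# and large when the pairing is scrambled.
matched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(matched < shuffled)  # True
```

In SimCSE itself the anchor and positive come from two dropout-perturbed forward passes of the same encoder, so no labeled pairs are needed.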

adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 2,487 · Issues: 377

makemore

An autoregressive character-level language model for making more things

Language: Python · License: MIT · Stargazers: 2,365 · Issues: 8
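The simplest member of makemore's model family is a bigram character model: count which character follows which, then sample new strings from those counts. A toy sketch of that idea (makemore's actual code trains neural variants in PyTorch; the `.` start/end token below follows its convention):

```python
import random
from collections import defaultdict

def train_bigram(words):
    """Count character bigrams, with '.' marking start and end of a word."""
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        chars = ['.'] + list(w) + ['.']
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def sample(counts, rng):
    """Generate one word by walking the bigram counts from '.' back to '.'."""
    out, ch = [], '.'
    while True:
        options = counts[ch]
        chars, weights = zip(*options.items())
        ch = rng.choices(chars, weights=weights)[0]
        if ch == '.':
            return ''.join(out)
        out.append(ch)

counts = train_bigram(["ab", "ab", "ab"])
print(sample(counts, random.Random(0)))  # 'ab' (the only path in this corpus)
```

Swapping the count table for a trained network over longer contexts is exactly the progression makemore walks through.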

pythia

The hub for EleutherAI's work on interpretability and learning dynamics

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 2,168 · Issues: 102

pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Language: Python · License: Apache-2.0 · Stargazers: 1,576 · Issues: 536

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python · License: NOASSERTION · Stargazers: 1,293 · Issues: 143

cramming

Cramming the training of a (BERT-type) language model into limited compute.

Language: Python · License: MIT · Stargazers: 1,272 · Issues: 34

bigscience

Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

Language: Shell · License: NOASSERTION · Stargazers: 968 · Issues: 19

tueplots

Figure sizes, font sizes, fonts, and more configurations at minimal overhead. Fix your journal papers, conference proceedings, and other scientific publications.

Language: Python · License: MIT · Stargazers: 652 · Issues: 54

mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.

Language: Python · License: Apache-2.0 · Stargazers: 549 · Issues: 96

seqio

Task-based datasets, preprocessing, and evaluation for sequence models.

Language: Python · License: Apache-2.0 · Stargazers: 546 · Issues: 31

GradCache

Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint

Language: Python · License: Apache-2.0 · Stargazers: 335 · Issues: 29

few-shot-learning

Few-shot learning with GPT-3

Language: Python · License: Apache-2.0 · Stargazers: 333 · Issues: 8

glasbey

Algorithmically create or extend categorical colour palettes

Language: Python · License: MIT · Stargazers: 171 · Issues: 5

Channel-LM-Prompting

An original implementation of "Noisy Channel Language Model Prompting for Few-Shot Text Classification"
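Noisy-channel prompting flips the usual scoring direction: instead of asking how likely a label is given the input, it scores how likely the input is given each label, weighted by the label prior. A toy sketch with hypothetical log-probabilities (not the repository's code, which scores with a real language model):

```python
import math

def channel_classify(log_p_x_given_label, log_p_label):
    """Return the label maximising log P(x | label) + log P(label)."""
    return max(log_p_label,
               key=lambda y: log_p_x_given_label[y] + log_p_label[y])

# Hypothetical scores: the input is far more likely under the 'positive'
# channel, which outweighs a prior that slightly favours 'negative'.
log_p_x = {"positive": -2.0, "negative": -5.0}
prior = {"positive": math.log(0.4), "negative": math.log(0.6)}
print(channel_classify(log_p_x, prior))  # 'positive'
```

In the few-shot setting, P(x | label) is the LM's probability of the input text conditioned on a label-specific prompt, which the paper finds more stable than direct P(label | x) scoring.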

GC-DPR

Train Dense Passage Retriever (DPR) with a single GPU

Language: Python · License: NOASSERTION · Stargazers: 127 · Issues: 12

prompt_semantics

This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”

Language: Python · License: MIT · Stargazers: 83 · Issues: 1

composable-sft

A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings.

Language: Python · License: NOASSERTION · Stargazers: 68 · Issues: 5

MCSE

[NAACL 2022] MCSE: Multimodal Contrastive Learning of Sentence Embeddings

Language: Python · License: MIT · Stargazers: 52 · Issues: 3

pet

This repository contains the code for "How many data points is a prompt worth?"

Language: Python · License: Apache-2.0 · Stargazers: 49 · Issues: 0

tilt-transfer

Code to run the TILT transfer learning experiments

bayesian-mi

This code accompanies the paper "Bayesian Framework for Information-Theoretic Probing" published in EMNLP 2021.