Niklas Muennighoff's repositories

sgpt

SGPT: GPT Sentence Embeddings for Semantic Search

Language: Jupyter Notebook | License: MIT | 830 8 42

vilio

🥶Vilio: State-of-the-art VL models in PyTorch & PaddlePaddle

Language: Python | License: MIT | 88 3 9

FLAN

Provides a minimal implementation to extract FLAN datasets for further processing

Language: Python | License: Apache-2.0 | 10 10

ytclipcc

Create Captions for any YouTube Clip using Wav2Vec2

Language: Jupyter Notebook | License: MIT | 6 20

promptsource

Toolkit for creating, sharing and using natural language prompts.

Language: Python | License: Apache-2.0 | 4 10

kto

Language: Python | 1 20

matrixshapes

Language modelling task to infer shapes of matrices - One of the most difficult tasks for models like GPT-3, GPT-J

Language: Python | License: Apache-2.0 | 1 30

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0 | 100

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Language: Jupyter Notebook | License: Apache-2.0 | 000

bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.

License: Apache-2.0 | 000

CS231n

CS231n at Stanford University

Language: Jupyter Notebook | 020

CS50

Computer Science Course taught at Harvard

Language: C | 020

CS50xAI

AI-specialization of CS50

Language: Python | 020

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | 010

FlagEmbedding

Open-source Embeddings

Language: Python | License: MIT | 010

gritlm

Generative Representational Instruction Tuning

License: MIT | 000

licensed-pile

Repo to hold code and track issues for the collection of permissively licensed data

License: MIT | 000

lm-evaluation-harness

A framework for few-shot evaluation of autoregressive language models.

Language: Python | License: MIT | 010

megablocks

Language: Python | License: Apache-2.0 | 000

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | 010

mteb

Massive Text Embedding Benchmark - Internal Development Git

Language: Python | License: Apache-2.0 | 010

Muennighoff.github.io

Language: JavaScript | License: MIT | 020

open-instruct

Language: Python | License: Apache-2.0 | 010

open_lm

A repository for research on medium sized language models.

Language: Python | License: MIT | 000

OpenDevin

🐚 OpenDevin: Code Less, Make More

License: MIT | 000

prompt_semantics

This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”

Language: Python | License: MIT | 010

scripts-public

Language: Jupyter Notebook | 020

t-zero

Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)

Language: Python | License: Apache-2.0 | 010

transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | 010

udacity-dl

Udacity Deep-Learning Nanodegree 2020

Language: Jupyter Notebook | 020