Stephen Roller (stephenroller)

Company: @facebookresearch

Location: NYC

Home Page: http://stephenroller.com/

Twitter: @stephenroller

Stephen Roller's starred repositories

evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Language: Python | License: NOASSERTION | Stargazers: 13887 | Issues: 258 | Issues: 197

triton

Development repository for the Triton language and compiler

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 10775 | Issues: 104 | Issues: 781

ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

Language: Python | License: MIT | Stargazers: 10427 | Issues: 284 | Issues: 1544

FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.

Language: Python | License: Apache-2.0 | Stargazers: 9002 | Issues: 105 | Issues: 79

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 8553 | Issues: 151 | Issues: 500

tokenizers

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Language: Rust | License: Apache-2.0 | Stargazers: 8405 | Issues: 118 | Issues: 916

metaseq

Repo for external large-scale work

Language: Python | License: MIT | Stargazers: 6386 | Issues: 109 | Issues: 292

fairscale

PyTorch extensions for high performance and large scale training.

Language: Python | License: NOASSERTION | Stargazers: 2903 | Issues: 43 | Issues: 357

slurm

Slurm: A Highly Scalable Workload Manager

Language: C | License: NOASSERTION | Stargazers: 2334 | Issues: 125 | Issues: 0

The-NLP-Pandect

A comprehensive reference for all topics related to Natural Language Processing

Language: Python | License: CC0-1.0 | Stargazers: 1995 | Issues: 129 | Issues: 2

longformer

Longformer: The Long-Document Transformer

Language: Python | License: Apache-2.0 | Stargazers: 1973 | Issues: 41 | Issues: 227

mkdocstrings

:blue_book: Automatic documentation from sources, for MkDocs.

Language: Python | License: ISC | Stargazers: 1568 | Issues: 14 | Issues: 386

gpu-burn

Multi-GPU CUDA stress test

Language: C++ | License: BSD-2-Clause | Stargazers: 1156 | Issues: 18 | Issues: 67

bigscience

Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

Language: Shell | License: NOASSERTION | Stargazers: 937 | Issues: 36 | Issues: 19

filesystem_spec

A specification that python filesystems should adhere to.

Language: Python | License: BSD-3-Clause | Stargazers: 892 | Issues: 22 | Issues: 657

PrefixTuning

Prefix-Tuning: Optimizing Continuous Prompts for Generation

madgrad

MADGRAD Optimization Method

Language: Python | License: MIT | Stargazers: 797 | Issues: 18 | Issues: 10

MyST-Parser

An extended commonmark compliant parser, with bridges to docutils/sphinx

Language: Python | License: MIT | Stargazers: 689 | Issues: 24 | Issues: 417

hck

A sharp cut(1) clone.

Language: Rust | License: Unlicense | Stargazers: 678 | Issues: 7 | Issues: 28

ConvLab-2

ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems

Language: Python | License: Apache-2.0 | Stargazers: 442 | Issues: 21 | Issues: 119

openchat

OpenChat: Easy to use opensource chatting framework via neural networks

Language: Python | License: Apache-2.0 | Stargazers: 440 | Issues: 16 | Issues: 25

lambdaprompt

λprompt - A functional programming interface for building AI systems

Language: Python | License: MIT | Stargazers: 367 | Issues: 6 | Issues: 5

Mephisto

A suite of tools for managing crowdsourcing tasks from the inception through to data packaging for research use.

Language: Python | License: MIT | Stargazers: 294 | Issues: 16 | Issues: 254

cascades

Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference, and more.

Language: Python | License: Apache-2.0 | Stargazers: 181 | Issues: 11 | Issues: 0

ParlAI_SearchEngine

A search engine for ParlAI's BlenderBot project (and probably other ones as well)

Language: Python | License: CC-BY-4.0 | Stargazers: 132 | Issues: 4 | Issues: 11

simmc

With the aim of building next generation virtual assistants that can handle multimodal inputs and perform multimodal actions, we introduce two new datasets (both in the virtual shopping domain), the annotation schema, the core technical tasks, and the baseline models. The code for the baselines and the datasets will be opensourced.

Language: Python | License: NOASSERTION | Stargazers: 130 | Issues: 20 | Issues: 27

self_talk

Code and data for the paper: "Unsupervised Common Sense Question Answering with Self-Talk"

Language: Python | License: Apache-2.0 | Stargazers: 78 | Issues: 3 | Issues: 2

forked-pdb

Python pdb for multiple processes

Language: Python | License: Apache-2.0 | Stargazers: 28 | Issues: 5 | Issues: 0

dotfiles

My dotfiles

Language: Vim Script | Stargazers: 8 | Issues: 3 | Issues: 0