jsedoc's starred repositories

metaseq

Repo for external large-scale work

Language: Python | License: MIT | Stargazers: 6411 | Issues: 109 | Issues: 292

arxiv-latex-cleaner

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Language: Python | License: Apache-2.0 | Stargazers: 4908 | Issues: 30 | Issues: 49

coronavirus-data

This repository contains data on Coronavirus Disease 2019 (COVID-19) in New York City (NYC), from the NYC Department of Health and Mental Hygiene.

kani

kani (カニ) is a highly hackable microframework for chat-based language models with tool use/function calling. (NLP-OSS @ EMNLP 2023)

Language: Python | License: MIT | Stargazers: 535 | Issues: 9 | Issues: 13

ConvoKit

ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.

Language: Jupyter Notebook | License: MIT | Stargazers: 524 | Issues: 25 | Issues: 90

Seq2seqChatbots

A wrapper around tensor2tensor to flexibly train, interact, and generate data for neural chatbots.

Language: Python | License: MIT | Stargazers: 472 | Issues: 40 | Issues: 14

BARTScore

BARTScore: Evaluating Generated Text as Text Generation

Language: Python | License: Apache-2.0 | Stargazers: 301 | Issues: 7 | Issues: 44

Mephisto

A suite of tools for managing crowdsourcing tasks from the inception through to data packaging for research use.

Language: Python | License: MIT | Stargazers: 296 | Issues: 16 | Issues: 254

neural_chat

Code to support training, evaluating, and interacting with neural network dialog models, and training them with reinforcement learning. Code to deploy a web server that hosts the models live online is available at: https://github.com/asmadotgh/neural_chat_web

Language: Python | License: MIT | Stargazers: 176 | Issues: 7 | Issues: 5

bias-bench

ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models.

dataless-model-merging

Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)

Language: Python | License: Apache-2.0 | Stargazers: 75 | Issues: 8 | Issues: 6

ReviewRobot

Code for ReviewRobot: Explainable Paper Review Generation based on Knowledge Synthesis

Language: Python | License: MIT | Stargazers: 26 | Issues: 3 | Issues: 4

tree2code

tree2code: Learning Discrete Syntactic Codes for Structural Diverse Translation

Language: Python | License: GPL-3.0 | Stargazers: 26 | Issues: 4 | Issues: 4

circa

The Circa (meaning ‘approximately’) dataset aims to help machine learning systems interpret indirect answers to polar questions. The dataset contains pairs of yes/no questions and indirect answers, together with annotations for the interpretation of each answer. The data was collected in 10 different social conversation situations (e.g., food preferences of a friend).

Stargazers: 19 | Issues: 0 | Issues: 0

DialogCorpus

A large-scale dialog corpus for pre-training

Language: Python | License: Apache-2.0 | Stargazers: 7 | Issues: 4 | Issues: 2

fairwork

Server for the Fair Work Mechanical Turk script

Language: Python | License: MIT | Stargazers: 7 | Issues: 8 | Issues: 22

dialogue-pytorch

Repository for dialogue models that enhance response diversity or coherence, implemented in PyTorch.

Language: Python | Stargazers: 6 | Issues: 7 | Issues: 0

metaeval-simplification

Meta-evaluation of automatic metrics in Text Simplification

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 3 | Issues: 3 | Issues: 0

EASL

Efficient Annotation of Scalar Labels

Language: Python | Stargazers: 2 | Issues: 0 | Issues: 0

Qualtrics-Collaboration-API

An easy way for NYU students to collaborate on Qualtrics surveys!

Language: Python | Stargazers: 2 | Issues: 1 | Issues: 0