Yao Lu's repositories
Multi-XScience
Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles
random-prompt
Code and supplementary document
clifton
SSH connection manager
codespaces-jupyter
Explore machine learning and data science with Codespaces
CTranslate2
Fast inference engine for Transformer models
cvmfs-tutorial-hpc-best-practices
Contents for "Best Practices for CernVM-FS in HPC" tutorial
ELI5
Scripts and links to recreate the ELI5 dataset.
few-shot-learning
Few-shot learning with GPT-3
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).
hiersumm
Code for the paper "Hierarchical Transformers for Multi-Document Summarization" (ACL 2019)
lm-evaluation-harness
A framework for few-shot evaluation of language models.
Megatron-LM
Ongoing research training transformer models at scale
ml-engineering
Machine Learning Engineering Open Book
Multi-News
Large-scale multi-document summarization dataset and code
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Sentence-VAE
PyTorch implementation of "Generating Sentences from a Continuous Space" (Bowman et al., 2015) https://arxiv.org/abs/1511.06349
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
writing-code-for-nlp-research-emnlp2018
A companion repository for the "Writing code for NLP Research" Tutorial at EMNLP 2018
yaolu.github.io
Personal website