Nealcly

Nealcly's starred repositories

Llama-X

Open Academic Research on Improving LLaMA to SOTA LLM

Language: Python · Apache-2.0 · 158200

CTCWordBeamSearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.

Language: C++ · MIT · 54900

Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Language: Python · Apache-2.0 · 1794100

Megatron-LM

Ongoing research training transformer models at scale

Language: Python · NOASSERTION · 945300

CHEF

The source code of paper "CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking"

Language: Python · 6500

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python · Apache-2.0 · 2917900

HMT

Source code for ICLR 2023 spotlight paper "Hidden Markov Transformer for Simultaneous Machine Translation"

Language: Python · MIT · 2100

text-generation-inference

Large Language Model Text Generation Inference

Language: Python · Apache-2.0 · 841400

llama

Inference code for Llama models

Language: Python · NOASSERTION · 5428500

Automated-Fact-Checking-Resources

Links to conference/journal publications in automated fact-checking (resources for the TACL22/EMNLP23 paper).

MIT · 36900

detect-gpt

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

Language: Python · MIT · 34100

neural-Jacana

This is the code for neural-Jacana aligner, and the data for MultiMWA dataset.

Language: Python · 1900

genius

💡GENIUS – generating text using sketches! A strong text generation & data augmentation tool.

Language: Python · 17500

m2scorer

MaxMatch (M^2) Scorer - Evaluation program for grammatical error correction systems.

Language: Python · GPL-2.0 · 14400

kilogram

The KiloGram Tangrams dataset

Language: Jupyter Notebook · 5000

EditScorer

The code for EMNLP2022 paper "Improved grammatical error correction by ranking elementary edits"

Language: Python · 1800

following-instructions-human-feedback

115400

CoSDA-ML

CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP

Language: Python · 4900

Cross-Align

EMNLP2022 "Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment"

Language: Python · Apache-2.0 · 1500

NMLA-NAT

Code for NeurIPS 2022 Spotlight paper "Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation"

Language: Python · MIT · 2000

We use 14 datasets as OOD test data and evaluate 21 widely used models on 8 NLU tasks. Our findings confirm that OOD accuracy in NLP tasks deserves more attention, since significant performance decay relative to ID accuracy was found in all settings.

Language: Python · 11400

UniSumm

UNISUMM: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning

Language: Python · MIT · 6000