Nealcly

Nealcly's starred repositories

Selective_Context

Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40% memory and GPU time.

Language: Python · 20800

Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant sentences in problem descriptions. GSM-IC is constructed to evaluate the distractibility of language models.

4900

detect-pretrain-code

This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer.

Language: Python · Apache-2.0 · 18900

RemeMo

[EMNLP 2023] Once Upon a *Time* in *Graph*: Relative-Time Pretraining for Complex Temporal Reasoning

Language: Python · MIT · 1400

HalluQA

Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"

Language: Python · Apache-2.0 · 10100

gecdi

The repo of "Improving Seq2Seq Grammatical Error Correction via Decoding Interventions"

Language: Python · MIT · 2900

AutoDAN

The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models".

Language: Python · 15200

Knowledge-Constrained-Decoding

Official Code for EMNLP2023 Main Conference paper: "KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection"

Language: Python · 2500

ctc-copy

[EMNLP'23] Code for "Non-autoregressive Text Editing with Copy-aware Latent Alignments".

Language: Python · 1900

BCR

Measuring and Reducing Model Update Regression in Structured Prediction for NLP

Language: Python · 300

EDeR

A Dataset for Event Dependency Relation Extraction

Language: Python · 800

AnnoCons

A web-based platform for visualizing and annotating constituency trees.

Language: HTML · 600

RobustGEC

Code & Data for our Paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023)

Language: Python · MIT · 1400

Instruction-Tuning-Papers

Reading list of instruction-tuning papers. The trend starts from Natural-Instruction (ACL 2022), FLAN (ICLR 2022) and T0 (ICLR 2022).

73500

belebele

Repo for the Belebele dataset, a massively multilingual reading comprehension dataset.

Language: Python · NOASSERTION · 30400

self-speculative-decoding

Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**

Language: Jupyter Notebook · Apache-2.0 · 10800

Xwin-LM

Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment

Language: Python · 99900

EKD_Impacts_PKG

This is the repository for the paper "Merge Conflicts! Exploring the Impacts of External Distractors to Parametric Knowledge Graphs"

Language: Python · 500

DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

Language: Python · 36000

Baichuan2

A series of large language models developed by Baichuan Intelligent Technology

Language: Python · Apache-2.0 · 401700

LLMSpeculativeSampling

Fast inference from large language models via speculative decoding

Language: Python · 40600

Post-Instruction

Language: Python · 2100

LLM-Ref

Language: Python · 500

chatgpt-prompts-for-academic-writing

This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.

257500

synjax

Language: Python · Apache-2.0 · 23700

gpt_academic

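Provides a practical interactive interface for large language models such as GPT and GLM, with particular optimization for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons and function plugins; supports analysis and self-explanation of Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.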

Language: Python · GPL-3.0 · 6063800

tdc2023-starter-kit

This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.

Language: Python · MIT · 7600

tdc-starter-kit

Starter kit and data loading code for the Trojan Detection Challenge NeurIPS 2022 competition

Language: Jupyter Notebook · MIT · 3400

alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Language: Python · Apache-2.0 · 73100

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python · Apache-2.0 · 178400