Saibo-creator / Awesome-LLM-Constrained-Decoding

A curated list of papers related to constrained decoding of LLMs, along with their relevant code and resources.

Awesome-LLM-Constrained-Decoding

Towards reliable, controllable and more efficient generation with Large Language Models (LLMs)

Libraries

Library | Features
guidance-ai/guidance | CFG, Regex, JSON Schema, Token Forcing, compatible with Transformers, LLAMA-CPP
outlines-dev/outlines | CFG, Unicode support, Hugging Face ecosystem, VLLM support
sgl-project/sglang | Regex support, emphasis on LLM inference efficiency, compressed FSM
eth-sri/lmql | Regex support, various constraints, more powerful control flow
jxnl/instructor | Try-Reject-Repeat approach to ensure constraints are met
noamgat/lm-format-enforcer | Regex, JSON Schema, Beam Search etc.
epfl-dlab/transformers-CFG | CFG (EBNF Interface), Compatible with Transformers, Easy to extend for research
uiuc-focal-lab/syncode | CFG generation that supports builtin grammars like JSON, Python, Go, and more

Disclaimer:

  • The libraries listed above are not exhaustive and are subject to change.
  • The features listed are far from exhaustive; please check the respective repositories for more details.
  • The libraries are ordered by GitHub star count.
  • If you are the author of a library and would like to add or update the information, please open an issue or submit a pull request.
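The feature column above mentions two broad strategies: masking invalid tokens at every decoding step (the regex, FSM, and CFG entries) and generating freely, then validating and retrying (the Try-Reject-Repeat entry). The toy sketch below contrasts the two. It does not use the API of any listed library; the vocabulary, the random "model" `toy_logits`, and the digits-only constraint are all hypothetical stand-ins chosen for illustration.

```python
import random
import re

# Hypothetical toy vocabulary standing in for an LLM tokenizer.
VOCAB = ["0", "1", "7", "42", "a", "-", "<eos>"]


def toy_logits(prefix, rng):
    """Stand-in for a language model: one random score per token."""
    return [rng.random() for _ in VOCAB]


def allowed(prefix, token):
    """Constraint: the output must be a non-empty string of digits."""
    if token == "<eos>":
        return len(prefix) > 0  # may only stop once something was emitted
    return token.isdigit()


def masked_decode(rng, max_tokens=5):
    """Token-level constrained decoding: drop invalid tokens at each step."""
    out = ""
    for _ in range(max_tokens):
        scores = toy_logits(out, rng)
        # Keep only tokens the constraint allows, then pick the best one.
        candidates = [(s, t) for s, t in zip(scores, VOCAB) if allowed(out, t)]
        _, token = max(candidates)
        if token == "<eos>":
            break
        out += token
    return out


def unconstrained_decode(rng, max_tokens=5):
    """Greedy decoding with no constraint applied."""
    out = ""
    for _ in range(max_tokens):
        _, token = max(zip(toy_logits(out, rng), VOCAB))
        if token == "<eos>":
            break
        out += token
    return out


def try_reject_repeat(rng, pattern=r"\d+", max_attempts=1000):
    """Generate freely, validate the whole output, and retry on failure."""
    for attempt in range(1, max_attempts + 1):
        text = unconstrained_decode(rng)
        if re.fullmatch(pattern, text):
            return text, attempt
    raise RuntimeError("no valid output within max_attempts")


if __name__ == "__main__":
    print("masked:", masked_decode(random.Random(0)))
    print("retry: ", try_reject_repeat(random.Random(0)))
```

In this sketch, masking yields a valid output on every call, while the retry loop may need several full generations before one passes validation. The retry approach, however, needs no access to the model's token scores.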

Papers

Some papers are marked as newly added to this list; they are not necessarily newly published.

Date | Paper | Publication
2024-08 | Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models | Arxiv
2024-08 | FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking | Arxiv
2024-07 | Automata-based constraints for language model decoding | CoLM
2024-06 | Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access | ACL
2024-05 | Grammar-Aligned Decoding | Preprint
2024-03 | SynCode: LLM Generation with Grammar Augmentation | Arxiv
2024-03 | Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation | ICML
2024-02 | Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars | Arxiv
2024-02 | Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents | Arxiv
2023-12 | SGLang: Efficient Execution of Structured Language Model Programs | Preprint
2023-12 | Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | NeurIPS
2023-11 | Prompt Sketching for Large Language Models | Preprint
2023-11 | Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs | PADL
2023-10 | Don't Fine-Tune, Decode: Syntax Error-Free Tool Use via Constrained Decoding | Arxiv
2023-10 | Amortizing intractable inference in large language models | ICLR
2023-10 | KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection | EMNLP
2023-07 | Efficient Guided Generation for Large Language Models | Arxiv
2023-06 | Grammar Prompting for Domain-Specific Language Generation with Large Language Models | NeurIPS
2023-06 | Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning | EMNLP
2023-06 | Prompting Is Programming: A Query Language for Large Language Models | PLDI
2023-05 | Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing | EMNLP Findings
2023-04 | Tractable Control for Autoregressive Language Generation | ICML
2022-11 | Validating Large Language Models with ReLM | MLSys
2022-11 | CodePAD: Sequence-based Code Generation with Pushdown Automaton | ISSTA
2022-05 | Gradient-Based Constrained Sampling from Language Models | EMNLP
2022-01 | Synchromesh: Reliable code generation from pre-trained language models | ICLR
2021-12 | PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models | EMNLP
2021-12 | Constrained Language Models Yield Few-Shot Semantic Parsers | EMNLP
2021-12 | Controlled Text Generation as Continuous Optimization with Multiple Constraints | NeurIPS
2021-06 | NEUROLOGIC DECODING: (Un)supervised Neural Text Generation with Predicate Logic Constraints | NAACL
2019-05 | A General-Purpose Algorithm for Constrained Sequential Inference | CoNLL
2019-05 | Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting | NAACL
2018-09 | CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling | AAAI
2018-05 | Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation | NAACL
2018-04 | Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method | AAAI
2017-12 | Guided Open Vocabulary Image Captioning with Constrained Beam Search | EMNLP
2017-06 | Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search | ACL

Benchmark & Datasets & Evaluation

Date | Paper | Publication
2024-05 | COLLIE: Systematic Construction of Constrained Text Generation Tasks | ICLR
2023-12 | BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing | NeurIPS Track on Datasets and Benchmarks
2023-10 | Evaluating Large Language Models on Controlled Generation Tasks | Arxiv
2023-09 | Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data? | Arxiv
2020-12 | CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning | EMNLP Findings

Survey

Date | Paper | Publication
2024-04 | "We Need Structured Output": Towards User-centered Constraints on Large Language Model Output | Arxiv

Blog Posts

Many of these blog posts were written by the Outlines team. Many thanks to them for their great work! ❤️

Disclaimer

This list is not exhaustive and will be updated regularly. If you have any suggestions or want to add a paper, please feel free to open an issue or submit a pull request. We hope to include all relevant papers in this list.

Contributing

Contributions are welcome! Feel free to submit a pull request or open an issue. Please make sure to read the Contributing Guidelines before contributing.

License: MIT License