A curated list of LLM security research papers.
Papers are divided into three categories (motivation, possible datasets, and method design) and, within each table, sorted by release date in ascending order.
Motivation: Link
Text Anonymization: Link
Style Transfer: Link
Prompt attack: Link
Jailbreak attack: Link
De-identification and anonymization of medical reports: Link
Prompt Learning: Link
Text Generation: Link

Motivation

| Year | Title | Conference | Field | Venue | Paper Link | Code Link |
| --- | --- | --- | --- | --- | --- | --- |
| 2023 | Beyond Memorization: Violating Privacy Via Inference with Large Language Models | unknown | unknown | arXiv | Link | unknown |
| 2023 | DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models | NeurIPS | unknown | arXiv | Link | Link |

Text Anonymization

| Year | Title | Conference | Field | Venue | Paper Link | Code Link |
| --- | --- | --- | --- | --- | --- | --- |
| 2021 | ADePT: Auto-encoder based Differentially Private Text Transformation | EACL 2021 | DP (sentence level) | arXiv | Link | Link |
| 2022 | DP-VAE: Human-Readable Text Anonymization for Online Reviews with Differentially Private Variational Autoencoders | WWW’22 | DP (sentence level) | ACM | Link | unknown |
| 2022 | A Customized Text Sanitization Mechanism with Differential Privacy | ACL 2023 | DP (token level) | arXiv | Link | Link |
| 2023 | Reducing Privacy Risks in Online Self-Disclosures with Language Models | unknown | Fine-tuning | arXiv | Link | unknown |
| 2023 | DP-BART for Privatized Text Rewriting under Local Differential Privacy | ACL 2023 | DP (sentence level) | arXiv | Link | Link |
| 2023 | Locally Differentially Private Document Generation Using Zero Shot Prompting | EMNLP | DP (word level) | arXiv | Link | Link |
| 2023 | InferDPT: Privacy-preserving Inference for Black-box Large Language Models | unknown | DP (token level) | arXiv | Link | unknown |
| 2023 | Differentially Private Natural Language Models: Recent Advances and Future Directions | unknown | survey | arXiv | Link | unknown |
| 2023 | Defending Against Authorship Identification Attacks | unknown | survey | arXiv | Link | unknown |

Style Transfer

| Year | Title | Conference | Field | Venue | Paper Link | Code Link |
| --- | --- | --- | --- | --- | --- | --- |
| 2018 | Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer | NAACL | unknown | ACL | Link | Link |
| 2019 | Disentangled Representation Learning for Non-Parallel Text Style Transfer | ACL | Disentanglement | ACL | Link | Link |

Prompt attack

| Year | Title | Conference | Field | Venue | Paper Link | Code Link |
| --- | --- | --- | --- | --- | --- | --- |
| 2023 | Prompt Injection Attacks and Defenses in LLM-Integrated Applications | unknown | unknown | arXiv | Link | Link |
| 2023 | PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts | unknown | unknown | arXiv | Link | Link |

De-identification and anonymization of medical reports

| Year | Title | Conference | Field | Venue | Paper Link | Code Link |
| --- | --- | --- | --- | --- | --- | --- |
| 2023 | Are Chatbots Ready for Privacy-Sensitive Applications? An Investigation into Input Regurgitation and Prompt-Induced Sanitization | unknown | unknown | arXiv | Link | unknown |
| 2023 | DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | unknown | unknown | arXiv | Link | unknown |

Jailbreak attack

| Year | Title | Conference | Field | Venue | Paper Link | Code Link |
| --- | --- | --- | --- | --- | --- | --- |
| 2023 | On the Safety of Open-Sourced Large Language Models: Does Alignment Really Prevent Them From Being Misused? | unknown | Jailbreak | arXiv | Link | unknown |
| 2023 | “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models | unknown | unknown | arXiv | | unknown |
| 2023 | MasterKey: Automated Jailbreaking of Large Language Model Chatbots | unknown | unknown | arXiv | Link | unknown |
| 2023 | Universal and Transferable Adversarial Attacks on Aligned Language Models | unknown | unknown | arXiv | Link | unknown |
| 2023 | SneakyPrompt: Jailbreaking Text-to-image Generative Models | IEEE S&P 2024 | unknown | arXiv | Link | Link |
| 2024 | How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs | unknown | unknown | unknown | Link | unknown |

Prompt Learning

| Year | Title | Conference | Field | Venue | Paper Link | Code Link |
| --- | --- | --- | --- | --- | --- | --- |
| 2021 | The Power of Scale for Parameter-Efficient Prompt Tuning | unknown | unknown | arXiv | Link | unknown |
| 2023 | You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content | S&P | unknown | arXiv | Link | Link |

Text Generation

| Year | Title | Conference | Type | Venue | Paper Link | Code Link |
| --- | --- | --- | --- | --- | --- | --- |
| 2021 | Pretrained Language Models for Text Generation: A Survey | ACM | survey | arXiv | Link | unknown |
| 2021 | Automatic text summarization: A comprehensive survey | Expert Syst. Appl. (2021) | survey | ScienceDirect | Link | unknown |
| 2022 | ParaDetox: Detoxification with Parallel Data | ACL | paper | ACL | Link | unknown |
| 2023 | A systematic survey on automated text generation tools and techniques: application, evaluation, and challenges | Multimed. Tools Appl. | survey | Springer | Link | unknown |