Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations
https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-2e2023.pdf
ChatGPT's training data can be exposed via a "divergence attack"
https://stackdiary.com/chatgpts-training-data-can-be-exposed-via-a-divergence-attack/
Extracting Training Data from ChatGPT
https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html
OWASP Top 10 for LLM Applications
https://owasp.org/www-project-top-10-for-large-language-model-applications/
Model Confusion: Weaponizing ML Models for Red Teams and Bounty Hunters
https://5stars217.github.io/2023-08-08-red-teaming-with-ml-models/#injecting-malware-into-a-keras--tensorflow-model-architecture
Damn Vulnerable LLM Agent
https://github.com/WithSecureLabs/damn-vulnerable-llm-agent