ethz-spylab

SPY Lab's repositories

Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.

Language: Python · Apache-2.0 · 100 4 7

A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.

Language: Jupyter Notebook · MIT · 48 5 1

Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"

Language: Python · Apache-2.0 · 39 3 6

Certified robustness "for free" using off-the-shelf diffusion models and classifiers

Language: Python · MIT · 34 2 4

Language: Python · MIT · 27 3 1

Language: Python · MIT · 27 0 0

Code used to run the platform for the LLM CTF colocated with SaTML 2024

Language: Python · MIT · 23 9 55

Code for the paper "Evading Black-box Classifiers Without Breaking Eggs" [SaTML 2024]

Language: Python · MIT · 19 2 1

Data for "Quantifying Memorization Across Neural Language Models"

Apache-2.0 · 7 0 2

Datasets for the SaTML 2023 competition on training data extraction

Apache-2.0 · 5 0 1

Official code for "Evaluations of Machine Learning Privacy Defenses are Misleading" (https://arxiv.org/abs/2404.17399)

Language: Jupyter Notebook · 3 0 0

Language: Python · 1 0 0

0 0 0

Playing around with the CC3M data

Language: Python · 0 1 0

Language: Python · 0 0 0

Library for training machine learning models with privacy for training data

Language: Python · Apache-2.0 · 0 0 0

Language: Python · 0 1 0