SPY Lab (ethz-spylab)

Location: Switzerland

Home Page: https://spylab.ai

SPY Lab's repositories

rlhf_trojan_competition

Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.

Language: Python, License: Apache-2.0, Stargazers: 100, Issues: 4, Issues: 7

agentdojo

A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.

Language: Jupyter Notebook, License: MIT, Stargazers: 48, Issues: 5, Issues: 1

rlhf-poisoning

Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"

Language: Python, License: Apache-2.0, Stargazers: 39, Issues: 3, Issues: 6

diffusion_denoised_smoothing

Certified robustness "for free" using off-the-shelf diffusion models and classifiers

Language: Python, License: MIT, Stargazers: 34, Issues: 2, Issues: 4
Language: Python, License: MIT, Stargazers: 27, Issues: 0, Issues: 0

satml-llm-ctf

Code used to run the platform for the LLM CTF colocated with SaTML 2024

Language: Python, License: MIT, Stargazers: 23, Issues: 9, Issues: 55

realistic-adv-examples

Code for the paper "Evading Black-box Classifiers Without Breaking Eggs" [SaTML 2024]

Language: Python, License: MIT, Stargazers: 19, Issues: 2, Issues: 1

lm_memorization_data

Data for "Quantifying Memorization Across Neural Language Models"

License: Apache-2.0, Stargazers: 7, Issues: 0, Issues: 2

lm-extraction-benchmark-data

Datasets for the SaTML 2023 competition on training data extraction

License: Apache-2.0, Stargazers: 5, Issues: 0, Issues: 1

misleading-privacy-evals

Official code for "Evaluations of Machine Learning Privacy Defenses are Misleading" (https://arxiv.org/abs/2404.17399)

Language: Jupyter Notebook, Stargazers: 3, Issues: 0, Issues: 0
Language: Python, Stargazers: 1, Issues: 0, Issues: 0
Stargazers: 0, Issues: 0, Issues: 0

data-decay

Playing around with the CC3M data

Language: Python, Stargazers: 0, Issues: 1, Issues: 0
Language: Python, Stargazers: 0, Issues: 0, Issues: 0

privacy

Library for training machine learning models with privacy for training data

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0
Language: Python, Stargazers: 0, Issues: 1, Issues: 0