Tianshuo Cong's starred repositories
JailbreakEval
A collection of automated evaluators for assessing jailbreak attempts.
do-not-answer
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
Adi-Red-Scene
Local Discriminative Regions for Scene Recognition (ACMMM 2018)
AI-Security-and-Privacy-Events
A curated list of academic events on AI Security & Privacy
lm-evaluation-harness
A framework for few-shot evaluation of language models.
Awesome-LM-SSP
A reading list for large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
PyTorch_CIFAR10
Pretrained TorchVision models on the CIFAR10 dataset (with weights)
TransferAttackEval
Revisiting Transferable Adversarial Images (arXiv)
Targeted-Transfer
Simple yet effective targeted transferable attack (NeurIPS 2021)
pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs and TPUs with zero code changes.
lightning-hydra-template
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡