Tianshuo Cong (tianshuocong)

Company: Tsinghua University

Location: Beijing, China

Home Page: https://tianshuocong.github.io/

Tianshuo Cong's starred repositories

JailbreakEval

A collection of automated evaluators for assessing jailbreak attempts.

Language: Python · License: MIT · Stargazers: 30 · Issues: 0

LAS-AT

Code for LAS-AT: Adversarial Training with Learnable Attack Strategy (CVPR2022)

Language: Python · Stargazers: 101 · Issues: 0

VLGuard

[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.

Language: Python · Stargazers: 22 · Issues: 0

do-not-answer

Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 123 · Issues: 0

hh-rlhf

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

License: MIT · Stargazers: 1500 · Issues: 0

Adi-Red-Scene

Local Discriminative Regions for Scene Recognition (ACMMM 2018)

Language: Python · License: MIT · Stargazers: 22 · Issues: 0

AI-Security-and-Privacy-Events

A curated list of academic events on AI Security & Privacy

License: MIT · Stargazers: 119 · Issues: 0

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python · License: MIT · Stargazers: 5807 · Issues: 0

Awesome-LM-SSP

A reading list for large-model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).

License: Apache-2.0 · Stargazers: 572 · Issues: 0

LLMs-Finetuning-Safety

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

Language: Python · License: MIT · Stargazers: 202 · Issues: 0

FigStep

Jailbreaking Large Vision-language Models via Typographic Visual Prompts

Language: Python · License: MIT · Stargazers: 59 · Issues: 0

baadd

Code for Backdoor Attacks Against Dataset Distillation

Language: Python · License: Apache-2.0 · Stargazers: 30 · Issues: 0

cinic-10

A drop-in replacement for CIFAR-10.

Language: Jupyter Notebook · License: MIT · Stargazers: 232 · Issues: 0

TePA

[S&P'24] Test-Time Poisoning Attacks Against Test-Time Adaptation Models

Language: Python · Stargazers: 14 · Issues: 0

Lion

Code for "Lion: Adversarial Distillation of Proprietary Large Language Models (EMNLP 2023)"

Language: Python · License: MIT · Stargazers: 193 · Issues: 0

MART

Modular Adversarial Robustness Toolkit

Language: Python · License: BSD-3-Clause · Stargazers: 16 · Issues: 0

PyTorch_CIFAR10

Pretrained TorchVision models on CIFAR10 dataset (with weights)

Language: Python · License: MIT · Stargazers: 621 · Issues: 0

TransferAttackEval

Revisiting Transferable Adversarial Images (arXiv)

Language: Python · Stargazers: 110 · Issues: 0

Targeted-Transfer

A simple yet effective targeted transferable attack (NeurIPS 2021)

Language: Python · License: MIT · Stargazers: 47 · Issues: 0

DUA

The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization (CVPR 2022)

Language: Python · Stargazers: 54 · Issues: 0

ML-Doctor

Code for ML Doctor

Language: Python · License: Apache-2.0 · Stargazers: 78 · Issues: 0

pytorch-lightning

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

Language: Python · License: Apache-2.0 · Stargazers: 27479 · Issues: 0

lightning-hydra-template

PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡

Language: Python · Stargazers: 3849 · Issues: 0