AI Secure's repositories
DecodingTrust
A Comprehensive Assessment of Trustworthiness in GPT Models
Certified-Robustness-SoK-Oldver
This repo tracks popular provable training and verification approaches for robust neural networks, including leaderboards on popular datasets and a categorization of papers.
FLBenchmark-toolkit
Federated Learning Framework Benchmark (UniFed)
Robustness-Against-Backdoor-Attacks
RAB: Provable Robustness Against Backdoor Attacks
semantic-randomized-smoothing
[CCS 2021] TSS: Transformation-specific smoothing for robustness certification
adversarial-glue
[NeurIPS 2021] "Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models" by Boxin Wang*, Chejian Xu*, Shuohang Wang, Zhe Gan, Yu Cheng, Jianfeng Gao, Ahmed Hassan Awadallah, Bo Li.
DPFL-Robustness
[CCS 2023] Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks
Certified-Fairness
[NeurIPS 2022] Code for "Certifying Some Distributional Fairness with Subpopulation Decomposition"
helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).
hf-blog
Public repo for HF blog posts