teapotliid's starred repositories
PrivLM-Bench
Code for ACL 2024 paper: PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models.
Analytic-continual-learning
This repository hosts the analytic continual learning series, including Analytic Class-Incremental Learning (ACIL), Gaussian Kernel Embedded Analytic Learning (GKEAL), Dual-Stream Analytic Learning (DS-AL), etc.
sleeper-agents-paper
Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training".
OPPO-Ontology
Privacy Policy Analysis Ontology to Analyze the Depth of the Data Practices Defined in the Privacy Policy Text.
LLM-Multistep-Jailbreak
Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT
Persona_leakage_and_defense_in_GPT-2
Code for NAACL 2022 paper "You Don’t Know My Favorite Color: Preventing Dialogue Representations from Revealing Speakers’ Private Personas"
MATH4432_Tutorial
MATH4432 Tutorial