Fabien Roger's repositories
Countergen
A framework for generating counterfactual datasets, evaluating NLP models, and editing models to reduce bias
llm-attacks
Universal and Transferable Attacks on Aligned Language Models
Distributed-Text-Search
Brute-force approximate-match search, parallelized using MPI, OpenMP, and CUDA.
alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Language: Jupyter Notebook, License: Apache-2.0
circuit-breakers
Improving Alignment and Robustness with Circuit Breakers
Language: Jupyter Notebook
Language: Python
Language: Python, License: MIT
Language: Jupyter Notebook
Language: Python
Language: HTML
ODIN-Extension
A simple and effective method for detecting out-of-distribution images in neural networks.
Language: Python, License: NOASSERTION
othello_playground
Emergent world representations: Exploring a sequence model trained on a synthetic task
Language: Jupyter Notebook, License: MIT
Language: Python, License: MIT
Language: HTML
Language: TypeScript, License: MIT