There are 0 repositories under the interpretability-jam topic.
🧠 Starter templates for doing interpretability research
🦠 DeepDecipher: An open source API to MLP neurons
Mechanistic interpretability tutorials, results, and a research log, written as I learn from publicly available research and my own experimentation.
This Alignment Jam Hackathon project explores whether the concept of "logit lens" applies to the encoder and decoder layers in Whisper, an end-to-end speech recognition model.