There are 2 repositories under the sparse-autoencoder topic.
A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.
This repository collects all relevant resources about interpretability in LLMs
Implementation of the stacked denoising autoencoder in TensorFlow
PyTorch implementations of various types of autoencoders
TensorFlow Examples
Experiments with Adversarial Autoencoders using Keras
Sparse autoencoder and regular MNIST classification with mini-batches
Multi-Layer Sparse Autoencoders
Repository for the Deep Propensity Network - Sparse Autoencoder (DPN-SA), which calculates propensity scores using a sparse autoencoder
Answers the question "How to do patching on all available SAEs on GPT-2?". This is the official implementation of the paper "Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small"
Explore visualization tools for understanding Transformer-based large language models (LLMs)
Collection of autoencoder models in TensorFlow
A tiny easily hackable implementation of a feature dashboard.
Semi-supervised learning for digit recognition using a sparse autoencoder
A resource repository of sparse autoencoders for large language models
Steering GPT2-EMGSD to be less biased and generating stereotyped text with vanilla GPT2, without fine-tuning or prompt engineering
Neural Network Architecture | ISI Kolkata
This repository contains Python code for Autoencoder, Sparse Autoencoder, HMM, Expectation-Maximization, Sum-Product Algorithm, ANN, Disparity Map, and PCA.
Implements a sparse autoencoder on the Bot-IoT dataset for dimensionality reduction, followed by computation of reconstruction error, F1 score, recall, accuracy, weights, and threshold, amongst other metrics
Folder contains implementations of multilayer feedforward networks, autoencoders, sparse autoencoders and many..
Interpret and control dense embeddings via sparse autoencoders.
Sparse autoencoder based on the Unsupervised Feature Learning and Deep Learning tutorial from Stanford University