Wes Gurnee (wesg52)

Company: MIT

Location: Cambridge, MA

Home Page: https://www.wesg.me/

Twitter: @wesg52

Wes Gurnee's starred repositories

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 12696 | Issues: 101 | Issues: 511

kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.

Language: Python | License: Apache-2.0 | Stargazers: 9866 | Issues: 108 | Issues: 1958

leafmap

A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment

Language: Python | License: MIT | Stargazers: 3184 | Issues: 59 | Issues: 259

awesome-neural-geometry

A curated collection of resources and research related to the geometry of representations in the brain, deep networks, and beyond

representation-engineering

Representation Engineering: A Top-Down Approach to AI Transparency

Language: Jupyter Notebook | License: MIT | Stargazers: 699 | Issues: 28 | Issues: 46

pyvene

Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions

Language: Python | License: Apache-2.0 | Stargazers: 605 | Issues: 9 | Issues: 60

redun

Yet another redundant workflow engine

Language: Python | License: Apache-2.0 | Stargazers: 516 | Issues: 15 | Issues: 54

torchlens

Package for extracting and mapping the results of every single tensor operation in a PyTorch model in one line of code.

Language: Python | License: GPL-3.0 | Stargazers: 461 | Issues: 6 | Issues: 18

cola

Compositional Linear Algebra

Language: Python | License: Apache-2.0 | Stargazers: 401 | Issues: 4 | Issues: 43

SAELens

Training Sparse Autoencoders on Language Models

Language: Jupyter Notebook | License: MIT | Stargazers: 383 | Issues: 8 | Issues: 92

nnsight

The nnsight package enables interpreting and manipulating the internals of deep learned models.

Language: Jupyter Notebook | License: MIT | Stargazers: 367 | Issues: 4 | Issues: 74

inseq

Interpretability for sequence generation models 🐛 🔍

Language: Python | License: Apache-2.0 | Stargazers: 362 | Issues: 10 | Issues: 82

Awesome-Interpretability-in-Large-Language-Models

This repository collects all relevant resources about interpretability in LLMs

world-models

Extracting spatial and temporal world models from LLMs

Language: Jupyter Notebook | License: MIT | Stargazers: 233 | Issues: 6 | Issues: 4
Language: Jupyter Notebook | License: MIT | Stargazers: 174 | Issues: 0 | Issues: 17

sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability

Language: Python | License: MIT | Stargazers: 173 | Issues: 4 | Issues: 41

sae_vis

Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).

Language: HTML | License: MIT | Stargazers: 137 | Issues: 7 | Issues: 20

Awesome-LLM-Interpretability

A curated list of LLM interpretability material: tutorials, libraries, surveys, papers, blogs, etc.

sleeper-agents-paper

Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training".

devinterp

Tools for studying developmental interpretability in neural networks.

modeldiff

ModelDiff: A Framework for Comparing Learning Algorithms

Language: Jupyter Notebook | License: MIT | Stargazers: 52 | Issues: 4 | Issues: 1

sparse-probing-paper

Sparse probing paper full code.

Language: Jupyter Notebook | License: MIT | Stargazers: 47 | Issues: 2 | Issues: 2

universal-neurons

Universal Neurons in GPT2 Language Models

Language: Jupyter Notebook | License: MIT | Stargazers: 25 | Issues: 3 | Issues: 2

elk-generalization

Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard

Language: Python | License: MIT | Stargazers: 24 | Issues: 2 | Issues: 1

edge-attribution-patching

Code for my NeurIPS 2023 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery"

Language: Jupyter Notebook | Stargazers: 22 | Issues: 2 | Issues: 1
Language: Jupyter Notebook | Stargazers: 7 | Issues: 3 | Issues: 0