firstuserhere

firstuserhere's repositories

firstuserhere_test.github.io

Language: HTML · 300

gpt4Vadvanced

Testing GPT-4 Vision on Advanced examination questions (2023) across physics, chemistry, and mathematics

Language: JavaScript · 300

metalearning

Repository and GitHub Pages site for my work on the mechanistic analysis of out-of-context meta-learning in LLMs

Language: SCSS · License: MIT · 200

awesome-mech-interp

A curated list of resources on mechanistic interpretability

1 10

basic-scripts

A bunch of basic scripts, hacked together but working, that may be useful to me

Language: Jupyter Notebook · 1 10

firstuserhere.github.io

This is my website

Language: HTML · License: MIT · 100

multimodal-mechinterp

Basic mechanistic interpretability analysis for some multimodal models

1 10

outofcontextnotes

This repository holds my notes and thoughts (always WIP) from work on the "out-of-context meta-learning" project.

Language: Ruby · License: MIT · 100

replications

My attempts at replicating results from papers

100

running

Language: HTML · 100

activationsteering

010

aisc_oocl_experiments

Experiments trying to elicit out-of-context learning when training a transformer on a simple task

000

ComPromptMized

ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications

000

countdowns

Language: JavaScript · 000

GPU-Puzzles

Solve puzzles. Learn CUDA.

Language: Jupyter Notebook · License: MIT · 000

hydra_effect_replication

000

Improved-worldmodels

Critiques of the preprint, suggestions for improvement, and tests on counterfactual examples

Language: Jupyter Notebook · License: MIT · 000

lit

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework-agnostic interface.

License: Apache-2.0 · 000

LLaVA-mechinterp

000

miras-sudoku-solution

Fork of a possible solution for testing

000

nanogenmo

National Novel Generation Month, 2023 edition.

000

practiceCUDA

010

sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability

Language: Python · License: MIT · 000

SPARta

LLM experiments done during SERI MATS, focusing on activation steering and interpreting activation spaces

000

transformer-debugger

My fork of the original Transformer Debugger library by OpenAI

Language: Python · License: MIT · 000

transformerperspectives

Looking at data through the perspective of different components of a transformer model

Language: Ruby · License: MIT · 000

visualize-SAE

Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).

License: MIT · 000

ViT-Prisma

ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs).

License: NOASSERTION · 000

weak-to-strong

License: MIT · 000

Whisper-mechinterp

Mechanistic Interpretability for Whisper

000