firstuserhere

firstuserhere's repositories

firstuserhere_test.github.io

Language: HTML

gpt4Vadvanced

Testing GPT-4 Vision on Advanced examination questions (2023) across physics, chemistry, and mathematics

Language: JavaScript

metalearning

This is the repository and GitHub Pages deployment for my work on the mechanistic analysis of out-of-context meta-learning in LLMs

Language: SCSS | License: MIT

awesome-mech-interp

An awesome curated list of resources dedicated to mechanistic interpretability

basic-scripts

A bunch of basic scripts, hacked together but working, that may be useful to me

Language: Jupyter Notebook

firstuserhere.github.io

This is my website

Language: HTML | License: MIT

multimodal-mechinterp

Basic mech interp analysis for some multimodal models

neurips-workshops-2024

I wanted to have the NeurIPS workshops organized neatly, so I created this page

outofcontextnotes

This repository holds my notes and thoughts (always WIP) while doing work on the "out of context meta learning" project.

Language: Ruby | License: MIT

running

Language: HTML

activationsteering

aisc_oocl_experiments

Experiments trying to elicit out-of-context learning when training a transformer on a simple task

Language: Python

ComPromptMized

ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications

countdowns

Language: JavaScript

GPU-Puzzles

Solve puzzles. Learn CUDA.

Language: Jupyter Notebook | License: MIT

hydra_effect_replication

Improved-worldmodels

Critiques of the pre-print, suggestions for improvement, and testing on counterfactual examples

Language: Jupyter Notebook | License: MIT

lit

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.

License: Apache-2.0

LLaVA-mechinterp

miras-sudoku-solution

Fork of a possible solution for testing

nanogenmo

National Novel Generation Month, 2023 edition.

practiceCUDA

sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability

Language: Python | License: MIT

SPARta

LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces

transformer-debugger

My fork of the original Transformer Debugger library by OpenAI

Language: Python | License: MIT

transformerperspectives

Looking at data through the perspective of different components of a transformer model

Language: Ruby | License: MIT

visualize-SAE

Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).

Language: HTML | License: MIT

ViT-Prisma

ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs).

License: NOASSERTION

weak-to-strong

Language: Python | License: MIT

Whisper-mechinterp

Mechanistic Interpretability for Whisper
