Aidan Ewart (Baidicoot)

Location: UK

Home Page: https://bayesteezian.net/


Organizations
piwars-rgs

Aidan Ewart's repositories

easy-sae-training

Easy training for Sparse Linear Autoencoders (https://arxiv.org/abs/2309.08600) with data from TransformerLens models.

Language: Python, Stargazers: 6, Issues: 0, Issues: 0

mini

typed successor to rpncalc

Language: Haskell, License: GPL-3.0, Stargazers: 3, Issues: 2, Issues: 0
Language: Jupyter Notebook, Stargazers: 1, Issues: 1, Issues: 0

sparse_coding

Work on sparse coding, replicating and extending the sparse coding approach to taking transformer features out of superposition.

Language: Jupyter Notebook, Stargazers: 1, Issues: 0, Issues: 0

analysis_lean

A formalization of my analysis course in Lean.

Language: Lean, Stargazers: 0, Issues: 0, Issues: 0
Language: Jupyter Notebook, Stargazers: 0, Issues: 0, Issues: 0
Stargazers: 0, Issues: 0, Issues: 0

automated-interpretability-mistral

Getting OpenAI autointerp to work with locally-run (finetuned) Mistral-7B instances.

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

bijou

Another compiler for a functional programming language, this time hopefully using LLVM in the backend.

Language: Haskell, Stargazers: 0, Issues: 0, Issues: 0

group_projects

uni group projects homework (probably mostly bad TeX files)

Language: TeX, Stargazers: 0, Issues: 0, Issues: 0
Language: TeX, Stargazers: 0, Issues: 1, Issues: 0

latent-adverserial-training

Experiments with LAT using activation addition vectors.

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

lynn

An implementation of a linear type theory with uniqueness types (I think in a similar style to McBride's work, literature is hard)

Language: Haskell, Stargazers: 0, Issues: 0, Issues: 0
Language: Jupyter Notebook, Stargazers: 0, Issues: 0, Issues: 0

mechanistic-unlearning

Machine Unlearning via pruning/circuit discovery

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

othello_world_ppo

Emergent world representations: Exploring a sequence model trained on a synthetic task

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Polygraph

RLHF Mechanistic Interpretability and Deception

License: MIT, Stargazers: 0, Issues: 0, Issues: 0
Language: Python, Stargazers: 0, Issues: 0, Issues: 0
Language: Python, Stargazers: 0, Issues: 0, Issues: 0

rlaif-jailbreaking

Self-improving PAIR using RLAIF and MCTS.

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

sae-alternatives

evaluating alternatives to boring linear sparse autoencoders for latent disentanglement

Stargazers: 0, Issues: 0, Issues: 0

scratch-transformer

oh no im doing ml

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

sdl-steering

A collection of experiments trying to evaluate how useful sparse dictionary learning (SDL) methods are for model steering (i.e. identifying 'important components of feature representations').

Language: Python, License: GPL-3.0, Stargazers: 0, Issues: 0, Issues: 0

set-theory-prover

For an AQA A-Level computer science NEA project

Language: Haskell, Stargazers: 0, Issues: 0, Issues: 0

soviet-language

Forth, but all functions are global to the entire internet

Language: JavaScript, Stargazers: 0, Issues: 3, Issues: 0

sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

switchcraft-stuff

Random stuff probably for switchcraft.

Language: Lua, Stargazers: 0, Issues: 0, Issues: 0
Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0
Language: Python, Stargazers: 0, Issues: 0, Issues: 0
Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0