Michael J Clark's repositories
viz_torch_optim
Videos of deep learning optimizers moving on 3D problem-landscapes
awesome-interpretability
Awesome tools for interpreting and manipulating the internals of deep neural networks.
adapters_can_monitor_lies
Inspired by the circuit breakers paper. honesty > harmless
cookiecutter-data-science
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
detect_bs_text
Can we measure how good a text is by how much an LLM learns from it?
open_pref_eval
Hackable, simple LLM evals on preference datasets
scrape_r_rational
Scraping book recommendations from Reddit's r/rational
side-by-side
Visual comparison of different translations of itemized texts; e.g. poems, bibles, etc.
activation_store
Store transformer activations on disk
eliciting_suppressed_knowledge
Probing suppressed activations gives improvements on TruthfulQA
lie_elicitation_prompts
Research dataset. We use prompts to get LLMs to lie, using system prompts and multi-shot examples.
repr-preference-optimization
Align inner states, not actions, for better generalization? [wip]
abliterator
Abliterator with baukit, not transformerlens
chatGPTBox
Integrating ChatGPT into your browser deeply, everything you need is here
circuit-breakers
Improving Alignment and Robustness with Circuit Breakers
coconut
Training Large Language Models to Reason in a Continuous Latent Space
GENIES
Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains
machiavelli_as_ds
turn "MACHIAVELLI Benchmark" into a normal dataset of trajectory nodes, choices, and labels
OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
optuna-dashboard
Real-time Web Dashboard for Optuna.
rag_search_cite
Hackable frontend for LLM assisted searching with citations
SimPO
SimPO: Simple Preference Optimization with a Reference-Free Reward
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
word_level_diff_writing_assistant
Spell check with an LLM and quickly verify it with a word-level diff
xbsjsonedit
A basic editor for xBrowserSync json backup files