JasonGross / guarantees-based-mechanistic-interpretability

inline relevant fields of HookedTransformer rather than nesting it in config

JasonGross opened this issue

I think we made the wrong decision (or rather, I gave bad design advice) when making HookedTransformerConfig a field of each of the experiments' configs. On reflection, I think the downsides outweigh the upsides.

Upsides:

  1. uniform way of adding CLI arguments (the utility functions can be replaced by a lookup table of the argparse arguments corresponding to each HookedTransformerConfig field)
  2. enforced uniform naming scheme
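
For reference, the lookup-table replacement mentioned in upside 1 could look roughly like the sketch below. The table name `HOOKED_TRANSFORMER_ARGS`, the helper `add_hooked_transformer_args`, and the defaults are hypothetical and not taken from this repo; only the field names (`d_model`, `n_layers`, `n_heads`, `n_ctx`) are real HookedTransformerConfig fields.

```python
import argparse

# Hypothetical lookup table: HookedTransformerConfig field name -> argparse kwargs.
# Defaults and help strings are illustrative assumptions only.
HOOKED_TRANSFORMER_ARGS = {
    "d_model": dict(type=int, default=32, help="residual stream width"),
    "n_layers": dict(type=int, default=1, help="number of transformer blocks"),
    "n_heads": dict(type=int, default=1, help="attention heads per block"),
    "n_ctx": dict(type=int, default=2, help="context window size"),
}


def add_hooked_transformer_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Register one CLI flag per table entry, e.g. d_model -> --d-model."""
    for field, kwargs in HOOKED_TRANSFORMER_ARGS.items():
        parser.add_argument(f"--{field.replace('_', '-')}", dest=field, **kwargs)
    return parser


parser = add_hooked_transformer_args(argparse.ArgumentParser())
args = parser.parse_args(["--d-model", "64"])
```

Every experiment would share the same flags and names, which is exactly the uniformity the two upsides describe.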

Downsides:

  1. upgrading HookedTransformer invalidates all of our config hashes / wandb models (see also #52 and #40 (comment))
  2. we have to introduce kludges when we want more control over renaming and defaulting arguments, e.g., when we want a config to be parameterized by sequence length rather than context window size, or to use a prime p rather than d_vocab_out, etc.
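
A minimal sketch of what inlining could look like, assuming a modular-arithmetic style experiment. The class name `ModularArithmeticConfig`, its defaults, and the mapping from `seq_len` and `p` to `n_ctx` and `d_vocab`/`d_vocab_out` are hypothetical. The hash covers only the experiment's own fields, so it would not change when the upstream config class gains or renames fields.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModularArithmeticConfig:  # hypothetical experiment config
    p: int = 113        # prime modulus, exposed directly instead of d_vocab_out
    seq_len: int = 2    # sequence length, exposed directly instead of n_ctx
    d_model: int = 32
    n_heads: int = 1
    n_layers: int = 1

    def config_hash(self) -> str:
        """Stable hash over this experiment's own fields only."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()[:16]

    def transformer_kwargs(self) -> dict:
        """Derive the model kwargs at construction time (mapping is an assumption)."""
        return dict(
            n_layers=self.n_layers,
            d_model=self.d_model,
            n_heads=self.n_heads,
            d_head=self.d_model // self.n_heads,
            n_ctx=self.seq_len,
            d_vocab=self.p,
            d_vocab_out=self.p,
        )


cfg = ModularArithmeticConfig(p=7, seq_len=3)
kwargs = cfg.transformer_kwargs()
```

The resulting dict is intended to be passed along as `HookedTransformerConfig(**kwargs)` when the model is built, so the nested config never becomes part of what gets hashed or stored.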