stanford-crfm / mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.

revisit OnlineBenchmarkTrainer

dlwh opened this issue

Describe the solution you'd like
OnlineBenchmarkTrainer overrides methods that aren't designed to be overridden, so it can break whenever the upstream implementation changes.

OnlineBenchmarkTrainer overrides two such methods:

  • _maybe_log_save_evaluate
  • _get_train_sampler

With #114 we can just about get rid of _get_train_sampler (though it depends on transformers master for now).

But there's more work to be done to get rid of _maybe_log_save_evaluate. I think it should be possible to replace it with callbacks, but that needs research.
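As a rough illustration of the callback route, here is a minimal sketch. It assumes the override's job is to request extra evaluation at chosen steps and record the resulting metrics, which may not match what OnlineBenchmarkTrainer actually does. `OnlineEvalCallback` and `eval_every` are hypothetical names; only `TrainerCallback` and its `on_step_end` / `on_evaluate` hooks come from transformers.

```python
from types import SimpleNamespace

try:
    from transformers import TrainerCallback
except ImportError:
    # Fallback so the sketch can be exercised without transformers installed.
    TrainerCallback = object


class OnlineEvalCallback(TrainerCallback):
    """Hypothetical callback: request evaluation every `eval_every` steps
    and keep the metrics each evaluation reports."""

    def __init__(self, eval_every):
        self.eval_every = eval_every
        self.history = []  # list of (global_step, metrics) pairs

    def on_step_end(self, args, state, control, **kwargs):
        # Ask the Trainer to evaluate instead of overriding its internals.
        if state.global_step > 0 and state.global_step % self.eval_every == 0:
            control.should_evaluate = True
        return control

    def on_evaluate(self, args, state, control, metrics=None, **kwargs):
        # Called after each evaluation; store a copy of the metrics.
        self.history.append((state.global_step, dict(metrics or {})))


# Stand-ins for TrainerState / TrainerControl, just to drive the hooks by hand.
demo_state = SimpleNamespace(global_step=4)
demo_control = SimpleNamespace(should_evaluate=False)
demo_callback = OnlineEvalCallback(eval_every=2)
demo_callback.on_step_end(None, demo_state, demo_control)
```

With a real run, the callback would be passed in as `Trainer(..., callbacks=[OnlineEvalCallback(eval_every=500)])`, which keeps the logic outside the private `_maybe_log_save_evaluate` method.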

Additional context
Related to #121 , #114

ok, i think we still need this class, though we've removed most of the cruft-y bits