Can adaptive computation time (ACT) and self-consistency bootstrap continual learning?
- Does ACT affect generalization capability?
- Composition, Modularity, Rule vs. Exemplar-Based
- What is the effect of task diversity?
- Curriculum learning
- "meta-grokking"
- Meta-learning?
- Escaping local minima in the energy landscape w/ noise injection?
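
A toy sketch of the noise-injection question, not a proposed method: gradient descent on a one-dimensional tilted double well, with and without annealed Gaussian noise. The energy function, the step size, the linear temperature schedule, and the names `energy`, `grad`, `descend`, and `escape_fraction` are all assumptions made up for illustration; nothing here is tied to ACT or to any particular model.

```python
import random


def energy(x):
    # Hypothetical tilted double well: a shallow minimum near x = 0.96
    # and a deeper one near x = -1.04, separated by a barrier near x = 0.08.
    return x**4 - 2 * x**2 + 0.3 * x


def grad(x):
    # Analytic derivative of energy(x).
    return 4 * x**3 - 4 * x + 0.3


def descend(x0, steps=4000, lr=0.01, temp0=0.0, rng=None):
    """Gradient descent with injected Gaussian noise.

    The noise temperature is annealed linearly from temp0 to zero
    (an assumed schedule); temp0 = 0 gives plain gradient descent.
    """
    if rng is None:
        rng = random.Random(0)
    x = x0
    for t in range(steps):
        temp = temp0 * (1 - t / steps)
        noise = rng.gauss(0.0, 1.0) * (2 * lr * temp) ** 0.5
        x = x - lr * grad(x) + noise
    return x


def escape_fraction(n_chains=50, x0=1.0, temp0=1.0, seed=0):
    """Fraction of noisy runs that end in the deeper well (x < 0)."""
    rng = random.Random(seed)
    finals = [descend(x0, temp0=temp0, rng=rng) for _ in range(n_chains)]
    return sum(x < 0 for x in finals) / n_chains


# Plain descent started at x = 1.0 settles in the shallow well.
x_plain = descend(1.0)
print("plain GD:", x_plain, energy(x_plain))

# Noisy descent from the same start; reports how often it reaches the deeper well.
print("escaped fraction:", escape_fraction())
```

In this toy setting the interesting knobs are the initial temperature and how fast it is annealed: too little noise never crosses the barrier, while noise that stays high never settles into either well.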