length-generalization

There are 0 repositories under the length-generalization topic.

zdxdsw / inductive_counting_with_LMs
This work provides extensive empirical results on training LMs to count. We find that while traditional RNNs trivially achieve inductive counting, Transformers have to rely on positional embeddings to count out-of-domain. Modern RNNs (e.g., RWKV, Mamba) also largely underperform traditional RNNs in generalizing counting inductively.
inductive-biases inductive-counting language-model-architectures length-generalization
Language: Jupyter Notebook 3
