jamesmf / cclm

Composable Character Level Models

Make preprocessor aware of downsampling

jamesmf opened this issue · comments

Transformer layers are costly with respect to input length (self-attention scales quadratically with sequence length), which is a particular problem for character-level models. One option is to reduce the sequence length with strided convolutions or strided pooling before the transformer layer(s), then upsample afterward.
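To see why padding matters here, a rough sketch of the length arithmetic (illustrative only, not code from this repo): a strided layer with "same" padding divides the length by the stride (rounding up), and upsampling multiplies it back, so lengths that are not a multiple of the stride come back changed.

```python
def downsampled_len(n: int, stride: int) -> int:
    """Length after a stride-`stride` conv/pool with 'same' padding (ceil division)."""
    return -(-n // stride)

def upsampled_len(n: int, stride: int) -> int:
    """Length after upsampling by the same factor."""
    return n * stride

stride = 4
for n in (96, 100, 102):
    m = upsampled_len(downsampled_len(n, stride), stride)
    print(n, "->", m)  # 96 -> 96, 100 -> 100, but 102 -> 104: shape mismatch
```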

To make this pattern work smoothly across multiple components, the Preprocessor can be made aware of the downsample_factor and ensure that inputs are padded appropriately, so that upsampling restores the original (padded) shape.
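The padding rule itself is simple: round the sequence length up to the nearest multiple of the downsample factor. A minimal sketch (the helper name is hypothetical, not an existing cclm function):

```python
def pad_to_multiple(length: int, downsample_factor: int) -> int:
    """Round `length` up to the nearest multiple of `downsample_factor`,
    so downsample-then-upsample returns the same (padded) length."""
    remainder = length % downsample_factor
    if remainder == 0:
        return length
    return length + downsample_factor - remainder
```

For example, with a downsample factor of 4, a 102-character input would be padded to 104 before the strided layers.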

As part of this implementation, the Preprocessor should also make the length argument to string_to_array optional.