CUNY-CL / yoyodyne

Small-vocabulary sequence-to-sequence generation with optional feature conditioning


Add self attention encoder

Adamits opened this issue

With the decoupling of encoders and decoders, we have added a Linear encoder, which simply embeds the inputs and passes them along. We should also add a SelfAttention encoder, which encodes the embeddings with a self-attention layer (and no positional encoding).

This contextualizes the embeddings by representing each one as a linear combination of all the embeddings (itself included), weighted by its attention to each.
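
A minimal PyTorch sketch of what such an encoder could look like; the class name `SelfAttentionEncoder` and the constructor arguments (`vocab_size`, `embedding_size`, `num_heads`, `pad_idx`) are hypothetical, not yoyodyne's actual interface:

```python
# Hypothetical sketch, not yoyodyne's actual API: embed the input
# symbols and contextualize them with a single self-attention layer,
# with no positional encoding.
import torch
from torch import nn


class SelfAttentionEncoder(nn.Module):
    """Embeds symbols and applies one self-attention layer."""

    def __init__(
        self,
        vocab_size: int,
        embedding_size: int,
        num_heads: int = 4,
        pad_idx: int = 0,
    ):
        super().__init__()
        self.embedding = nn.Embedding(
            vocab_size, embedding_size, padding_idx=pad_idx
        )
        self.attention = nn.MultiheadAttention(
            embedding_size, num_heads, batch_first=True
        )
        self.pad_idx = pad_idx

    def forward(self, symbols: torch.Tensor) -> torch.Tensor:
        # symbols: (batch_size, seq_len) tensor of symbol indices.
        embedded = self.embedding(symbols)
        # Mask padding positions so they receive no attention weight.
        pad_mask = symbols == self.pad_idx
        # Self-attention: queries, keys, and values are all the embeddings,
        # so each output is an attention-weighted combination of them.
        contextualized, _ = self.attention(
            embedded, embedded, embedded, key_padding_mask=pad_mask
        )
        return contextualized
```

Since there is no positional encoding, such a layer is permutation-equivariant: it contextualizes each symbol against the others but carries no information about their order.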

+1. Makes sense.