Choco31415 / Attention_Network_With_Keras

An example attention network with simple dataset.

What is the role of Dense(x,yt-1)?

binaryOmaire opened this issue · comments

What are x and y(t-1) in Dense(x, y(t-1))?

The goal of an attention layer is to select the important parts of the input to consider when generating each output. The input data itself affects how important each part is, so x is fed into the context computation.

Additionally, the network's context should vary over time. If x were the only input, the context would be static. For that reason, this tutorial has an RNN layer on top of the attention mechanism, whose output is (somewhat confusingly) also called y. y(t-1) is the previous RNN output, and feeding it into the context computation gives the network a time-varying signal, letting it shift its attention over time.
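As a rough sketch of what that computation does, here is a NumPy mock-up of one attention step (made-up sizes and random weights for illustration only, not the tutorial's actual Keras code): y(t-1) is concatenated with every input timestep, small dense layers turn each pair into an energy, and a softmax over the energies gives the weights for the context vector.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)

Tx, n_a, n_s = 5, 4, 3            # hypothetical sizes: input steps, input dim, RNN dim
x = rng.normal(size=(Tx, n_a))    # input sequence, one vector per timestep
y_prev = rng.normal(size=n_s)     # previous RNN output, i.e. y(t-1)

# Illustrative dense-layer weights (in the tutorial these are learned).
W1 = rng.normal(size=(n_a + n_s, 8))
W2 = rng.normal(size=(8, 1))

# "Dense(x, y(t-1))": pair y(t-1) with each timestep of x ...
concat = np.concatenate([x, np.tile(y_prev, (Tx, 1))], axis=1)  # (Tx, n_a + n_s)
# ... and map each pair to a scalar energy.
energies = np.tanh(concat @ W1) @ W2                            # (Tx, 1)

alphas = softmax(energies.ravel())  # attention weights over the Tx timesteps
context = alphas @ x                # weighted sum of inputs, shape (n_a,)
```

Because y(t-1) changes at every decoding step, the energies (and hence the weights alphas) change too, which is exactly the time-varying behavior described above.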

Does this answer the question?