airsplay / R2R-EnvDrop

PyTorch Code of NAACL 2019 paper "Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout"


About angle feature size

HanqingWangAI opened this issue · comments

Hi @airsplay, I noticed that the angle feature size is 4 in both the paper and the default parameter setting, but 128 in run/agent.bash.
Is this a special design? :)

Hi, @qweas120.

We follow the original implementation of the panoramic view (speaker-follower) here. The paper describes a 4-dimensional vector for simplicity:

... a 4-dimensional orientation feature [sin ψ; cos ψ; sin θ; cos θ] ...

but the code uses a 128-dim embedding:

    # The extra 128 dims hold the tiled orientation feature for each
    # adjacent location; they are filled in after this allocation.
    embedding = np.zeros((len(adj_loc_list), feature_dim + 128), np.float32)

It was a trick to balance the energy of the two features (i.e., the ResNet feature and the positional feature) by repeating the smaller one.
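For concreteness, a minimal sketch of this repetition trick. The function and variable names here are illustrative, not the repo's exact code; the point is tiling the 4-dim orientation feature to 128 dims so its energy is comparable to the ResNet feature:

    import numpy as np

    # Build the 4-dim orientation feature [sin psi, cos psi, sin theta,
    # cos theta] and tile it 32 times to reach 128 dims.
    def angle_feature(heading, elevation, angle_feat_size=128):
        base = np.array([np.sin(heading), np.cos(heading),
                         np.sin(elevation), np.cos(elevation)], np.float32)
        return np.tile(base, angle_feat_size // 4)  # shape: (128,)

    feature_dim = 2048
    resnet_feat = np.random.rand(feature_dim).astype(np.float32)  # placeholder
    full_feat = np.concatenate([resnet_feat, angle_feature(0.5, 0.1)])
    assert full_feat.shape == (feature_dim + 128,)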

By the way, in our EMNLP 2019 paper, we used Layer Normalization (LayerNorm) to address the imbalanced energy of the ResNet feature and the positional feature.
Instead of concatenating the two features,

f' = [f; p]

we add the LayerNorm of each projected feature (following the practice in the Transformer, with some differences):

f' = LayerNorm(A f) + LayerNorm(B p)

Please kindly check Eqn. 1 in Sec. 2.1. We have not tested the LayerNorm technique on VLN, so I am not sure whether it is applicable here as well.
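For readers who want to try it, a minimal PyTorch sketch of this fusion. The module name and all dimensions (2048-dim visual feature f, 128-dim positional feature p, 512-dim hidden size) are assumptions for illustration, not the paper's exact settings:

    import torch
    import torch.nn as nn

    class FeatureFusion(nn.Module):
        """Sketch of f' = LayerNorm(A f) + LayerNorm(B p)."""
        def __init__(self, visual_dim=2048, pos_dim=128, hidden_dim=512):
            super().__init__()
            self.A = nn.Linear(visual_dim, hidden_dim)  # projects f
            self.B = nn.Linear(pos_dim, hidden_dim)     # projects p
            self.ln_f = nn.LayerNorm(hidden_dim)
            self.ln_p = nn.LayerNorm(hidden_dim)

        def forward(self, f, p):
            # Add the two normalized projections instead of concatenating.
            return self.ln_f(self.A(f)) + self.ln_p(self.B(p))

    fusion = FeatureFusion()
    f = torch.randn(1, 2048)
    p = torch.randn(1, 128)
    fused = fusion(f, p)  # shape: (1, 512)

Adding the two normalized projections keeps both feature streams on the same scale regardless of their raw magnitudes, which is the point of the trick.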

@airsplay Wonderful reply. Thanks.