graykode / ALBERT-Pytorch

PyTorch implementation of ALBERT (A Lite BERT for Self-supervised Learning of Language Representations)

Home Page: https://arxiv.org/pdf/1909.11942.pdf



Number of Transformer layers

IwasakiYuuki opened this issue

Thank you for the PyTorch implementation of ALBERT.
I have a question about the model construction.

h = self.transformer(input_ids, segment_ids, input_mask)

Why is the number of transformer layers one?
Is this correct?

Please see the following loop in the model code:

for _ in range(self.n_layers):

I hope this code helps you. Thanks!
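
For context, the loop above is the core of ALBERT's cross-layer parameter sharing: only one Transformer block is instantiated, but its weights are reused n_layers times in the forward pass. Below is a minimal sketch of this pattern (illustrative names and default sizes only, not the repository's actual code):

import torch
import torch.nn as nn

class SharedLayerEncoder(nn.Module):
    """One Transformer block applied repeatedly, ALBERT-style."""
    def __init__(self, hidden_dim=256, n_heads=4, n_layers=12):
        super().__init__()
        self.n_layers = n_layers
        # A single encoder block; its parameters are shared by every layer.
        self.block = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=n_heads, batch_first=True)

    def forward(self, h):
        # The same block (same weights) is applied n_layers times,
        # mirroring the `for _ in range(self.n_layers)` loop above.
        for _ in range(self.n_layers):
            h = self.block(h)
        return h

x = torch.randn(2, 16, 256)    # (batch, seq_len, hidden_dim)
encoder = SharedLayerEncoder()
print(encoder(x).shape)        # torch.Size([2, 16, 256])

So the encoder still has a depth of n_layers; it simply does not allocate separate parameters for each layer, which is what makes ALBERT "lite" compared to BERT.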

Sorry, I missed it...
Thank you for answering my question.