salesforce / CodeGen

CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.


why dropouts are 0 for codegen-350M-mono?

wasiahmad opened this issue


I noticed in the config file that:

"attn_pdrop": 0.0
"embd_pdrop": 0.0
"resid_pdrop": 0.0

Is codegen pretrained with dropout 0? @enijkamp
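
For anyone who wants to check this themselves, here is a rough sketch of listing the dropout fields from a config. It assumes only the three values quoted above; the JSON string is a hand-copied fragment rather than the full config file, and `dropout_settings` is a hypothetical helper name, not part of any library.

```python
import json

# Hand-copied fragment with only the three fields quoted above;
# the real config file contains more keys than this.
CONFIG_FRAGMENT = """
{
  "attn_pdrop": 0.0,
  "embd_pdrop": 0.0,
  "resid_pdrop": 0.0
}
"""


def dropout_settings(config: dict) -> dict:
    """Return every entry whose key ends in '_pdrop' (hypothetical helper)."""
    return {key: value for key, value in config.items() if key.endswith("_pdrop")}


config = json.loads(CONFIG_FRAGMENT)
settings = dropout_settings(config)

# True only if every dropout probability found in the fragment is zero.
all_disabled = all(p == 0.0 for p in settings.values())
print(settings, all_disabled)
```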


The model was trained without dropout regularization, hence the dropout of 0.0 in the converted PyTorch forward pass.
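
To illustrate why a dropout probability of 0.0 amounts to no dropout at all, here is a minimal pure-Python sketch of standard inverted dropout. It is only an illustration under that assumption, not the CodeGen or PyTorch implementation, and the `dropout` function and sample values are made up for the example.

```python
import random


def dropout(values, p, training=True, rng=None):
    """Inverted dropout over a list of floats (illustrative sketch only)."""
    if not 0.0 <= p < 1.0:
        # This sketch excludes p == 1.0 to avoid dividing by zero below.
        raise ValueError("p must be in [0, 1)")
    if not training:
        # Dropout is only active during training.
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - p
    # Keep each value with probability `keep` and rescale it by 1 / keep.
    # With p == 0.0, keep == 1.0: rng.random() is always < 1.0, so every
    # value is kept and divided by 1.0, i.e. returned unchanged.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [0.5, -1.25, 2.0, 3.5]

unchanged = dropout(activations, p=0.0)                        # training, p = 0
halved = dropout(activations, p=0.5, rng=random.Random(0))     # training, p = 0.5
eval_mode = dropout(activations, p=0.5, training=False)        # inference

print(unchanged == activations)
print(halved)
```

With `p=0.0` the layer is an identity in both training and inference, so the converted model behaves the same either way.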

@enijkamp What was the reason for not using any dropout? I'd like to learn whether there is any insight behind it. Thanks!