salesforce / CodeGen

CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

Why are the dropouts 0 for codegen-350M-mono?

wasiahmad opened this issue

Hi,

I noticed in the config file (https://huggingface.co/Salesforce/codegen-350M-mono/blob/main/config.json) that:

"attn_pdrop": 0.0
"embd_pdrop": 0.0
"resid_pdrop": 0.0

Was CodeGen pretrained with dropout set to 0? @enijkamp
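For anyone who wants to check these values in a local copy of the config, here is a minimal sketch using only the Python standard library. The JSON string is a hand-copied fragment holding just the three fields quoted above, not the full config file, and `dropout_fields` is a hypothetical helper name, not part of any library.

```python
import json

# Hand-copied fragment: only the three fields quoted in this issue,
# not the full config.json from the model repository.
CONFIG_FRAGMENT = """
{
  "attn_pdrop": 0.0,
  "embd_pdrop": 0.0,
  "resid_pdrop": 0.0
}
"""


def dropout_fields(config: dict) -> dict:
    """Return every key ending in '_pdrop' with its value (hypothetical helper)."""
    return {key: value for key, value in config.items() if key.endswith("_pdrop")}


config = json.loads(CONFIG_FRAGMENT)
pdrops = dropout_fields(config)

# True only if every dropout probability found in the fragment is exactly 0.0.
all_zero = all(p == 0.0 for p in pdrops.values())
print(pdrops, all_zero)
```

To run it against a real download, replace `CONFIG_FRAGMENT` with the contents of the saved `config.json`.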

Yes.

The model was trained without dropout regularization, hence the dropout of 0.0 in the converted PyTorch forward pass.
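To illustrate why a probability of 0.0 simply disables the layer, here is a small sketch of standard inverted dropout in plain Python. It is an illustration of the general technique only, not CodeGen's code or PyTorch's implementation, and the `dropout` function below is a hypothetical stand-in written for this example.

```python
import random


def dropout(values, p, training=True, rng=None):
    """Inverted dropout on a list of floats.

    Each element is zeroed with probability p and the survivors are
    scaled by 1 / (1 - p). Outside training the input passes through.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("this sketch requires 0 <= p < 1")
    if not training:
        return list(values)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    # rng.random() is in [0, 1), so with p == 0.0 every element is kept
    # and scale == 1.0: the output equals the input.
    return [v * scale if rng.random() >= p else 0.0 for v in values]


x = [0.5, -1.25, 2.0, 3.0]
unchanged = dropout(x, p=0.0)
print(unchanged == x)
```

With `p=0.0` no element is ever dropped and the scale factor is 1, so the layer is a no-op in both training and evaluation.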

@enijkamp What was the reason for not using any dropout? I'd like to learn if there is any insight behind it. Thanks!