Error when converting to the transformers format
trueto opened this issue
The original English models convert without problems.
Conversion script: https://github.com/huggingface/transformers/blob/electra/src/transformers/convert_electra_original_tf_checkpoint_to_pytorch.py
Error message:

```
Traceback (most recent call last):
  File "packages/transformers/model_electra/modeling_electra.py", line 102, in load_tf_weights_in_electra
    assert pointer.shape == array.shape, original_name
AssertionError: ('electra/embeddings/LayerNorm/beta', torch.Size([128]), (768,))
```
ELECTRA support in HuggingFace Transformers is currently still in a beta state.
For now, you can convert the weights using the JSON config files provided in #1.
Command:

```bash
python convert_electra_original_tf_checkpoint_to_pytorch.py --tf_checkpoint_path base_model --config_file ./electra_config.json --pytorch_dump_path base.bin --discriminator_or_generator discriminator
```
The `electra_config.json` for the small model is:

```json
{
"attention_probs_dropout_prob": 0.1,
"directionality": "bidi",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 256,
"initializer_range": 0.02,
"intermediate_size": 1024,
"max_position_embeddings": 512,
"num_attention_heads": 4,
"num_hidden_layers": 12,
"pooler_fc_size": 768,
"pooler_num_attention_heads": 12,
"pooler_num_fc_layers": 3,
"pooler_size_per_head": 128,
"pooler_type": "first_token_transform",
"vocab_size": 21128,
"embedding_size": 128
}
```
and for the base model:

```json
{
"attention_probs_dropout_prob": 0.1,
"directionality": "bidi",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pooler_fc_size": 768,
"pooler_num_attention_heads": 12,
"pooler_num_fc_layers": 3,
"pooler_size_per_head": 128,
"pooler_type": "first_token_transform",
"vocab_size": 21128,
"embedding_size": 768
}
```
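Not part of the original replies, but as a hedged illustration of how the converted checkpoint might be sanity-checked: the sketch below assumes a transformers release that ships `ElectraConfig`/`ElectraModel`, and that the converted weights, the config above, and the vocab file have been placed into one local directory (the directory name and file renames are hypothetical, not from this thread).

```python
# Hedged sketch, not from the original thread. Assumes a transformers version with
# ELECTRA support and the following (hypothetical) local layout:
#   chinese-electra-base/config.json        (the base-model JSON above)
#   chinese-electra-base/pytorch_model.bin  (renamed from base.bin)
#   chinese-electra-base/vocab.txt          (the model's WordPiece vocab)
import torch
from transformers import BertTokenizer, ElectraConfig, ElectraModel

model_dir = "chinese-electra-base"  # hypothetical local directory

config = ElectraConfig.from_pretrained(model_dir)
model = ElectraModel.from_pretrained(model_dir, config=config)
model.eval()

# The Chinese models share BERT's WordPiece vocab (vocab_size 21128), so BertTokenizer works.
tokenizer = BertTokenizer.from_pretrained(model_dir)

inputs = tokenizer("今天天气不错", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Last hidden states: (1, seq_len, hidden_size); hidden_size is 768 for the base config above.
print(outputs[0].shape)
```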
Once HuggingFace officially supports ELECTRA, we will update the corresponding weights and configuration files. Thanks.
Thanks, I've got it running successfully and have integrated it into my toolkit:
https://github.com/trueto/transformers_sklearn