Plugging-in BERT
putama opened this issue
Hi,
We're running your code and tried to see whether we can plug in BERT by simply adding the following to the config JSON file, but we ran into an issue:
```jsonnet
"dataset_reader": {
    "type": "spider_bert",
    "tables_file": dataset_path + "tables.json",
    "dataset_path": dataset_path + "database",
    "lazy": false,
    "keep_if_unparsable": false,
    "loading_limit": -1,
    "question_token_indexers": {
        "tokens": {
            "type": "bert-pretrained",
            "pretrained_model": "bert-base-uncased",
            "do_lowercase": true,
            "max_pieces": 128
        }
    }
},
"question_embedder": {
    "token_embedders": {
        "tokens": {
            "type": "bert-pretrained",
            "pretrained_model": "bert-base-uncased",
            "top_layer_only": true,
            "requires_grad": true
        }
    }
}
```
We also modified the DatasetReader to use BertBasicWordSplitter() instead.
However, training does not converge: the exact match scores stay at zero on both train and dev. Did we make a mistake in our configuration, or is the model architecture itself not really suitable for BERT?
Hey,
Sorry, I'm not really sure what the issue is here, but the model should definitely be able to train with BERT. It's probably a matter of optimization; maybe try a different optimizer such as bert_adam, and play with different learning-rate settings?
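As a rough sketch, swapping the optimizer in the config's trainer section could look like this (the learning-rate and warmup values below are illustrative guesses, not tuned settings; `bert_adam` is the optimizer name AllenNLP registers for BertAdam from pytorch-pretrained-bert):

```jsonnet
"trainer": {
    // Illustrative only: BERT fine-tuning typically uses a much smaller
    // learning rate than training from scratch.
    "optimizer": {
        "type": "bert_adam",
        "lr": 1e-5,
        // warmup is a fraction of total steps; BertAdam also accepts
        // t_total and weight_decay if you want finer control.
        "warmup": 0.1
    }
}
```

It may also help to use a separate, smaller learning rate for the BERT parameters than for the rest of the model (e.g. via AllenNLP's `parameter_groups`), since the encoder is pretrained while the decoder is not.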