Plugging-in BERT
putama opened this issue
Hi,
We're running your code and tried to see whether we can plug in BERT by simply adding the following to the config JSON file, but we ran into an issue:
```jsonnet
"dataset_reader": {
    "type": "spider_bert",
    "tables_file": dataset_path + "tables.json",
    "dataset_path": dataset_path + "database",
    "lazy": false,
    "keep_if_unparsable": false,
    "loading_limit": -1,
    "question_token_indexers": {
        "tokens": {
            "type": "bert-pretrained",
            "pretrained_model": "bert-base-uncased",
            "do_lowercase": true,
            "max_pieces": 128
        }
    }
},
"question_embedder": {
    "token_embedders": {
        "tokens": {
            "type": "bert-pretrained",
            "pretrained_model": "bert-base-uncased",
            "top_layer_only": true,
            "requires_grad": true
        }
    }
}
```
We also modified the DatasetReader to use BertBasicWordSplitter() instead.
However, training does not converge: the exact match scores stay at zero on both train and dev. Did we make a mistake in our configuration, or is the model architecture itself not really suitable for BERT?
Hey,
Sorry, I'm not really sure what the issue is here, but the model should definitely be able to train with BERT. It's probably a matter of optimization; maybe try a different optimizer such as bert_adam, and play with different learning-rate settings?
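As a rough sketch, swapping the optimizer in the config's trainer section could look like this (the learning-rate and warmup values below are illustrative guesses, not tuned settings; `bert_adam` is the optimizer name AllenNLP registers for BertAdam from pytorch-pretrained-bert):

```jsonnet
"trainer": {
    // Illustrative only: BERT fine-tuning typically uses a much smaller
    // learning rate than training from scratch.
    "optimizer": {
        "type": "bert_adam",
        "lr": 1e-5,
        // warmup is a fraction of total steps; BertAdam also accepts
        // t_total and weight_decay if you want finer control.
        "warmup": 0.1
    }
}
```

It may also help to use a separate, smaller learning rate for the BERT parameters than for the rest of the model (e.g. via AllenNLP's `parameter_groups`), since the encoder is pretrained while the decoder is not.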