Hugging Face Transformers BERT model not working on Mac M1
bksaini078 opened this issue · comments
While fine-tuning the Transformers model, i.e. transformers.TFDistilBertModel.from_pretrained(pretrained_weights),
I got this error message.
Can someone please help resolve this issue? Or has anyone been able to run the Transformer BERT models on a Mac M1?
Reference code:
import tensorflow as tf
import transformers
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

def BERT_model(max_len, pretrained_weights):
    '''BERT model creation with pretrained weights.
    max_len: input sequence length'''
    # parameter declaration
    learning_rate = 2e-5
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    bert = transformers.TFDistilBertModel.from_pretrained(pretrained_weights)
    # declaring inputs; BERT takes input_ids and attention_mask as input
    input_ids = Input(shape=(max_len,), dtype=tf.int32, name='input_ids')
    attention_mask = Input(shape=(max_len,), dtype=tf.int32, name='attention_mask')
    distillbert = bert(input_ids, attention_mask=attention_mask)
    # take the hidden state of the first ([CLS]) token
    x = distillbert[0][:, 0, :]
    x = tf.keras.layers.Dropout(0.2)(x)
    x = tf.keras.layers.Dense(64)(x)
    x = tf.keras.layers.Dense(32)(x)
    output = tf.keras.layers.Dense(2, activation='sigmoid')(x)
    model = Model(inputs=[input_ids, attention_mask], outputs=[output])
    # compiling model
    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = BERT_model(max_len, pretrained_weights)
# x_train must supply both input_ids and attention_mask
model.fit(x_train, y_train, batch_size=8, epochs=3, validation_split=0.2, verbose=1)
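As an aside, the distillbert[0][:, 0, :] line pulls the hidden state of the first ([CLS]) token out of DistilBERT's last hidden state, which has shape (batch_size, sequence_length, hidden_size). A minimal NumPy sketch of that slicing, with made-up small dimensions rather than DistilBERT's real 768-dim output:

```python
import numpy as np

# Stand-in for DistilBERT's last hidden state:
# shape (batch_size, sequence_length, hidden_size)
batch_size, seq_len, hidden = 2, 4, 3
last_hidden_state = np.arange(batch_size * seq_len * hidden, dtype=np.float32).reshape(
    batch_size, seq_len, hidden
)

# Same slice as distillbert[0][:, 0, :] in the model:
# every batch item, token position 0 (the [CLS] token), all hidden dims.
cls_embedding = last_hidden_state[:, 0, :]

print(cls_embedding.shape)  # (2, 3)
```

The result is one fixed-size vector per example, which is why it can feed straight into the Dense classification head.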
Change the final layers to:

x = tf.keras.layers.Dropout(0.2)(x)
x = tf.keras.layers.Dense(64)(x)
x = tf.keras.layers.Dense(32)(x)
x = tf.keras.layers.Dense(2, activation='sigmoid')(x)
output = tf.keras.layers.Dropout(0)(x)

There seems to be a problem when the last layer has an activation function, so I add a Dropout layer that does nothing (rate 0) after it, so the activation is no longer on the final layer.
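For what it's worth, a Dropout layer with rate 0 is the identity function, so this workaround should not change the model's predictions; it only moves the activation off the final layer. A quick NumPy check of that claim, using a hand-rolled sigmoid and inverted dropout purely for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dropout(x, rate, rng):
    # Inverted dropout as applied at training time; with rate=0
    # the mask keeps every unit and the 1/(1-rate) scale is 1.
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
logits = np.array([[0.5, -1.0], [2.0, 0.0]])
activated = sigmoid(logits)

# rate=0 dropout leaves the sigmoid outputs unchanged.
assert np.allclose(dropout(activated, 0.0, rng), activated)
```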
Thank you for your reply.
I tried the proposed approach. Unfortunately, it shows the same error message.
Did you run the BERT model successfully on your end?
Hey there @bksaini078, were you able to load BERT from tensorflow-hub? If so, would you mind showing how you did that? I'm unable to load the BERT model using hub.KerasLayer (see #276).