apple / tensorflow_macos

TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.

Hugging Face Transformers BERT model not working

bksaini078 opened this issue

While fine-tuning a transformers model, i.e. transformers.TFDistilBertModel.from_pretrained(pretrained_weights), I got this error message:
[image: screenshot of the error message]
Can someone please help me resolve this issue? Or has anyone been able to run the Transformer BERT models on an M1 Mac?

Reference code:

import tensorflow as tf
import transformers
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

def BERT_model(max_len, pretrained_weights):
    '''Create a DistilBERT classifier from pretrained weights.
    max_len: input sequence length
    pretrained_weights: name or path of the pretrained checkpoint'''
    # parameter declaration
    learning_rate = 2e-5
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

    bert = transformers.TFDistilBertModel.from_pretrained(pretrained_weights)

    # declaring inputs; BERT takes input_ids and attention_mask as input
    input_ids = Input(shape=(max_len,), dtype=tf.int32, name='input_ids')
    attention_mask = Input(shape=(max_len,), dtype=tf.int32, name='attention_mask')

    distillbert = bert(input_ids, attention_mask=attention_mask)
    # hidden state of the first token, used as the sequence representation
    x = distillbert[0][:, 0, :]
    x = tf.keras.layers.Dropout(0.2)(x)
    x = tf.keras.layers.Dense(64)(x)
    x = tf.keras.layers.Dense(32)(x)

    output = tf.keras.layers.Dense(2, activation='sigmoid')(x)

    model = Model(inputs=[input_ids, attention_mask], outputs=[output])
    # compiling model
    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
    return model

# max_len, pretrained_weights, x_train and y_train are defined elsewhere (not shown in this post)
model = BERT_model(max_len, pretrained_weights)
model.fit(x_train, y_train, batch_size=8, epochs=3, validation_split=0.2, verbose=1)
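
For context, a model with two Input layers like the one above expects x_train to carry both input_ids and attention_mask, each of shape (num_examples, max_len). The post does not show how x_train was built, so the following is only a minimal pure-Python sketch of the idea. pad_and_mask is a hypothetical helper, and pad_id=0 is an assumption that should be checked against the tokenizer in use. In practice the Hugging Face tokenizer can produce both arrays itself.

```python
def pad_and_mask(token_id_lists, max_len, pad_id=0):
    """Pad or truncate each id list to max_len and build the matching mask.

    pad_id=0 is an assumption; check the pad token id of your tokenizer.
    """
    input_ids, attention_mask = [], []
    for ids in token_id_lists:
        ids = list(ids)[:max_len]              # truncate sequences that are too long
        n_pad = max_len - len(ids)             # number of padding positions needed
        input_ids.append(ids + [pad_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)  # 1 = real token, 0 = padding
    return input_ids, attention_mask

# Made-up token ids, for illustration only
example_ids, example_mask = pad_and_mask([[101, 7592, 102], [101, 102]], max_len=4)
print(example_ids)
print(example_mask)
```

The two lists could then be converted to int32 arrays and passed as x_train=[input_ids, attention_mask], in the same order as the model's inputs.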

Change the last layers to:
x = tf.keras.layers.Dropout(0.2)(x)
x = tf.keras.layers.Dense(64)(x)
x = tf.keras.layers.Dense(32)(x)
x = tf.keras.layers.Dense(2, activation='sigmoid')(x)
output = tf.keras.layers.Dropout(0)(x)

There is a problem when the last layer has an activation function, so I add a Dropout layer with rate 0, which does nothing, as the last layer.
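
To illustrate why a Dropout layer with rate 0 leaves its input untouched, here is a small pure-Python sketch of inverted dropout. This is an assumption-laden illustration of the general technique, not the Keras implementation; inverted_dropout is a hypothetical name.

```python
import random

def inverted_dropout(values, rate, rng=random):
    """Zero each value with probability `rate` and scale survivors by 1 / (1 - rate).

    Illustrative only; not how tf.keras.layers.Dropout is implemented internally.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    scale = 1.0 / (1.0 - rate)
    # rng.random() is in [0, 1), so with rate 0 every value is kept and scale is 1
    return [v * scale if rng.random() >= rate else 0.0 for v in values]

activations = [0.25, 0.5, 0.75]
print(inverted_dropout(activations, rate=0.0))
```

With rate 0 nothing is dropped and the scale factor is 1, so the layer only changes which op sits last in the graph, not the values it outputs.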

Thank you for your reply,
I tried the proposed approach. Unfortunately, it is showing the same error message.
Did you run the BERT model successfully on your end?

Hey there @bksaini078, were you able to load BERT from tensorflow-hub? If so, would you mind showing how you did that? I'm unable to load the BERT model using hub.KerasLayer (see #276).