pradeepdev-1995 / BERT-models-finetuning

BERT (Bidirectional Encoder Representations from Transformers) is a transformer-based method of learning language representations. It is a bidirectional transformer model pre-trained on a large corpus with two objectives: masked language modeling and next sentence prediction.

BERT Fine Tuning

Pre-trained language representations have been shown to improve many downstream NLP tasks such as question answering and natural language inference. There are two main strategies for applying pre-trained representations to these tasks:

1 - The feature-based approach, which uses the pre-trained representations as additional features for the downstream task (a minimal sketch follows this list).

2 - The fine-tuning approach, which trains on the downstream task by fine-tuning the pre-trained parameters.
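
The snippet below is a minimal sketch of the feature-based approach, not code from this repo: BERT stays frozen, its [CLS] embeddings are extracted as features, and a separate scikit-learn classifier is trained on them. The toy texts and labels are placeholders.

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.eval()

texts = ["the service was great", "terrible delivery experience", "average product"]  # toy data
labels = [0, 1, 2]

with torch.no_grad():
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # Use the [CLS] token of the last hidden layer as a fixed feature vector.
    features = bert(**enc).last_hidden_state[:, 0, :].numpy()

# Downstream classifier trained on frozen BERT features.
clf = LogisticRegression(max_iter=1000).fit(features, labels)
```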

In this repo, I work through fine-tuning different pre-trained BERT models on a downstream task: multi-class text classification.

These are the steps I am going to follow (a hedged end-to-end sketch appears after the list):

1 - Load a state-of-the-art pre-trained BERT model
2 - Load the multi-class text classification dataset
3 - Fine-tune the loaded pre-trained BERT model on the multi-class text classification dataset
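
The following is a hedged sketch of those three steps with the Hugging Face transformers Trainer API; the toy texts, labels, and hyperparameters are placeholders, not the repo's actual dataset or settings.

```python
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# 1 - Load a pre-trained BERT checkpoint with a fresh classification head.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# 2 - Load the multi-class dataset (toy in-memory stand-in here).
texts = ["refund my order", "how do I reset my password", "great support team"]
labels = [0, 1, 2]
enc = tokenizer(texts, padding=True, truncation=True)

class TextDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

train_dataset = TextDataset(enc, labels)

# 3 - Fine-tune all parameters on the classification objective.
args = TrainingArguments(output_dir="bert-finetuned", num_train_epochs=3,
                         per_device_train_batch_size=8, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=train_dataset).train()
```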

Pre-trained models used

The following Hugging Face pre-trained language models are used for this purpose (a loading sketch follows the list):

1 - bert-base
2 - bert-large
3 - albert
4 - roberta
5 - distilbert
6 - xlnet
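
The listed models can all be swapped in through the Auto* classes. The checkpoint names below are the usual Hugging Face Hub identifiers and are assumed here, not taken from the repo's code.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINTS = {
    "bert-base": "bert-base-uncased",
    "bert-large": "bert-large-uncased",
    "albert": "albert-base-v2",
    "roberta": "roberta-base",
    "distilbert": "distilbert-base-uncased",
    "xlnet": "xlnet-base-cased",
}

def load_model(key, num_labels):
    """Load the tokenizer and classification model for any checkpoint above."""
    name = CHECKPOINTS[key]
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=num_labels)
    return tokenizer, model

tokenizer, model = load_model("distilbert", num_labels=3)
```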

This repo also contains multi-class text classification using the Keras deep learning framework with BERT embeddings (a hedged sketch follows the list).

1 - keras-bert
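
As a rough illustration of a Keras classifier on top of BERT embeddings, the sketch below uses transformers' TensorFlow classes as a stand-in for the keras-bert package, whose checkpoint paths are repo-specific; the sequence length and label count are placeholders.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = TFAutoModel.from_pretrained("bert-base-uncased")

input_ids = tf.keras.Input(shape=(128,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(128,), dtype=tf.int32, name="attention_mask")

# The [CLS] embedding from the last hidden state feeds a small Keras classification head.
cls_embedding = bert(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0, :]
outputs = tf.keras.layers.Dense(3, activation="softmax")(cls_embedding)

model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(2e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```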

This repo also includes some useful transformer wrapper libraries for fine-tuning (a Ktrain sketch follows the list).

1 - FARM
2 - Ktrain
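
A hedged sketch with ktrain, one of the wrapper libraries above, is shown below; the toy arrays, class names, and hyperparameters are placeholders for the repo's actual dataset and settings.

```python
import ktrain
from ktrain import text

x_train = ["refund my order", "reset my password", "great support team"]
y_train = ["billing", "account", "feedback"]

# texts_from_array handles BERT-specific preprocessing when preprocess_mode='bert'.
trn, val, preproc = text.texts_from_array(
    x_train=x_train, y_train=y_train,
    x_test=x_train, y_test=y_train,        # toy data reused as a validation split
    class_names=["billing", "account", "feedback"],
    preprocess_mode="bert", maxlen=64)

model = text.text_classifier("bert", train_data=trn, preproc=preproc)
learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=8)
learner.fit_onecycle(2e-5, 3)              # one-cycle LR schedule for a few epochs
```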
