For my project, I used AG News, one of the datasets available through the Hugging Face datasets library. The BERT model is fine-tuned on this dataset.
AG News (AG's News Corpus) is a subset of AG's corpus of news articles, constructed by assembling the title and description fields of articles from the four largest classes: World, Sports, Business, and Sci/Tech.
AG News contains 30,000 training and 1,900 test samples per class, for a total of 120,000 training and 7,600 test samples across the four classes.
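As a quick sanity check on these numbers, here is a minimal sketch of how the dataset could be loaded and its size verified. The helper names (`expected_split_sizes`, `load_ag_news`) are hypothetical and only for illustration. The dataset id "ag_news" and the label order are assumptions taken from the Hugging Face dataset card, and may need adjusting if the dataset has moved.

```python
# Label ids in the order assumed from the Hugging Face "ag_news" dataset card.
CLASS_NAMES = ["World", "Sports", "Business", "Sci/Tech"]

# Per-class sample counts stated in the description above.
TRAIN_PER_CLASS = 30_000
TEST_PER_CLASS = 1_900


def expected_split_sizes(n_classes=len(CLASS_NAMES)):
    """Return the total train/test sizes implied by the per-class counts."""
    return {
        "train": TRAIN_PER_CLASS * n_classes,
        "test": TEST_PER_CLASS * n_classes,
    }


def load_ag_news(dataset_id="ag_news"):
    """Download AG News from the Hugging Face Hub (needs network access).

    The dataset id is an assumption; adjust it if the Hub name differs.
    """
    # Imported lazily so the helpers above can be used without `datasets`.
    from datasets import load_dataset

    return load_dataset(dataset_id)
```

In a notebook, calling `load_ag_news()` and comparing `dataset["train"].num_rows` and `dataset["test"].num_rows` against `expected_split_sizes()` would confirm that the downloaded copy matches the description.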
The project aims to build, train, and fine-tune a BERT model for text classification on the AG News dataset.
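The fine-tuning step could look roughly like the sketch below, which uses ktrain's Transformer wrapper. This is not necessarily the exact setup used in the project: the checkpoint name "bert-base-uncased", the sequence length, batch size, learning rate, and epoch count are all assumed values, and `split_columns` and `fine_tune_bert` are hypothetical helper names.

```python
CLASS_NAMES = ["World", "Sports", "Business", "Sci/Tech"]


def split_columns(records):
    """Turn an iterable of {'text': ..., 'label': ...} rows into two parallel lists."""
    texts, labels = [], []
    for row in records:
        texts.append(row["text"])
        labels.append(row["label"])
    return texts, labels


def fine_tune_bert(train_records, test_records,
                   model_name="bert-base-uncased",  # assumed checkpoint
                   maxlen=128, batch_size=16, lr=2e-5, epochs=1):
    """Fine-tune a BERT classifier with ktrain (all hyperparameters are assumptions)."""
    # Imported lazily: ktrain pulls in TensorFlow, which is slow to load.
    import ktrain
    from ktrain import text

    x_train, y_train = split_columns(train_records)
    x_test, y_test = split_columns(test_records)

    # Wrap the pretrained checkpoint and tokenize both splits.
    t = text.Transformer(model_name, maxlen=maxlen, class_names=CLASS_NAMES)
    trn = t.preprocess_train(x_train, y_train)
    val = t.preprocess_test(x_test, y_test)

    # Build the classifier and train with the one-cycle learning rate policy.
    model = t.get_classifier()
    learner = ktrain.get_learner(model, train_data=trn, val_data=val,
                                 batch_size=batch_size)
    learner.fit_onecycle(lr, epochs)
    return learner, t
```

Rows from a Hugging Face dataset split iterate as dictionaries, so `fine_tune_bert(dataset["train"], dataset["test"])` should work directly. Training on the full 120,000 samples is slow without a GPU, so a small subset is a sensible first run.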
Language: Python
Libraries: ktrain, transformers, datasets, numpy, pandas, tensorflow, timeit
Environment: Jupyter Notebook