dvdbisong / automl-toxicity-classification

Google Cloud AutoML Natural Language for Toxicity Classification

Google Cloud AutoML Natural Language for Text Classification

Building a Language Toxicity Classification Model

Title: Google Cloud AutoML Natural Language for Text Classification
Author: Ekaba Bisong (Google Developer Expert in Machine Learning; Google Certified Professional Data Engineer)
Website: #

Google Cloud AutoML for Natural Language provides a platform for designing and developing custom language models for language recognition use cases. This project uses it to build an end-to-end toxicity classification model that identifies obscene text. Under the hood, AutoML applies neural architecture search and transfer learning to find a suitable network architecture and hyperparameter settings that improve the model's performance.

About the Dataset

The data used in this project comes from the Toxic Comment Classification Challenge on Kaggle, run by Jigsaw and Google. It is reduced to a balanced sample of 16,000 toxic and 16,000 non-toxic comments, which serve as the inputs for building the model on AutoML NLP.

The dataset is hosted on Kaggle and can be accessed at Toxic Comment Classification Challenge.
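
The balanced sampling described above could be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code. It assumes the column names of the Kaggle `train.csv` (`comment_text` plus six 0/1 label columns), treats a comment as toxic if any of those labels is set, and writes a two-column `text,label` CSV. The helper names, the label strings `toxic` and `non_toxic`, and the output layout are all assumptions; check the AutoML import documentation for the exact format it expects.

```python
import csv
import random

# Label columns in the Kaggle train.csv.
LABEL_COLUMNS = ["toxic", "severe_toxic", "obscene",
                 "threat", "insult", "identity_hate"]


def is_toxic(row):
    """Assumption: a comment is toxic if any label column equals '1'."""
    return any(row.get(col) == "1" for col in LABEL_COLUMNS)


def load_rows(path):
    """Read a CSV with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def balanced_sample(rows, n_per_class, seed=0):
    """Draw n_per_class toxic and n_per_class non-toxic comments.

    Returns a shuffled list of (text, label) pairs.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    toxic = [r for r in rows if is_toxic(r)]
    clean = [r for r in rows if not is_toxic(r)]
    if len(toxic) < n_per_class or len(clean) < n_per_class:
        raise ValueError("not enough rows in one of the classes")
    pairs = [(r["comment_text"], "toxic")
             for r in rng.sample(toxic, n_per_class)]
    pairs += [(r["comment_text"], "non_toxic")
              for r in rng.sample(clean, n_per_class)]
    rng.shuffle(pairs)
    return pairs


def write_pairs(pairs, path):
    """Write (text, label) pairs as a header-less two-column CSV."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(pairs)


# Hypothetical usage, with file names chosen for illustration:
#   rows = load_rows("train.csv")
#   write_pairs(balanced_sample(rows, 16000), "toxicity_automl.csv")
```

Sampling the same number of rows from each class keeps the two labels evenly represented, which matches the 16,000 / 16,000 split mentioned above.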

License: MIT License

Languages: Jupyter Notebook 100.0%