darshan-gandhi / New-Titles-Headlines-Analysis-

New-Titles-Headlines-Analysis

Problem Statement:

Given news titles and headlines along with some other features, predict the sentiment of each news title and headline.

Steps:

#IMPORTING ALL THE NECESSARY LIBRARIES

#INSTALLING ALL THE NLTK DEPENDENCIES NEEDED FOR OUR PROGRAM

#CREATING AN OBJECT FOR STEMMING

#CREATING AN OBJECT FOR LEMMATIZATION

#CREATING A SET OF ALL STOP WORDS

#READING THE TRAIN AND TEST FILES

#OBTAINING THE INFORMATION ABOUT THE VARIOUS COLUMNS OF THE DATASET

#UNDERSTANDING THE SUMMARY STATISTICS OF THE DATASET, SUCH AS THE MEAN, MEDIAN AND MODE

#CREATING A TABLE TO UNDERSTAND WHICH COLUMNS HAVE NULL VALUES IN THEM

#ARRANGING THE VALUES IN DESCENDING ORDER TO GET A FAIR IDEA OF THE COLUMN WITH THE MOST NULL VALUES

#FINDING THE MODE OF THE SOURCE COLUMN
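The null-value and mode steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the rows are made-up stand-ins for the training file, and filling missing `Source` values with the mode is an assumption about why the mode is computed.

```python
from collections import Counter

# Hypothetical rows standing in for the training file (values are invented).
rows = [
    {"Title": "Markets rally", "Headline": "Stocks climb on earnings", "Source": "Reuters"},
    {"Title": "Rates held", "Headline": None, "Source": None},
    {"Title": "Oil slips", "Headline": "Crude falls again", "Source": "Reuters"},
    {"Title": "Tech surges", "Headline": "Chipmakers lead gains", "Source": None},
]

# Count the missing values in every column.
null_counts = Counter()
for row in rows:
    for column, value in row.items():
        null_counts[column] += value is None

# Sort descending so the column with the most nulls comes first.
null_table = sorted(null_counts.items(), key=lambda item: item[1], reverse=True)
print(null_table)  # [('Source', 2), ('Headline', 1), ('Title', 0)]

# Mode of the Source column, ignoring missing entries.
known_sources = [row["Source"] for row in rows if row["Source"] is not None]
source_mode = Counter(known_sources).most_common(1)[0][0]

# Assumption: missing sources are filled with the mode.
for row in rows:
    if row["Source"] is None:
        row["Source"] = source_mode
```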

#DATA PRE-PROCESSING

#CLEANING THE TITLE COLUMN OF THE TRAINING DATASET

#CLEANING THE HEADLINE COLUMN OF THE TRAINING DATASET
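A cleaning pass of this kind might look like the sketch below. It is only an approximation under stated assumptions: `STOP_WORDS` is a tiny hand-written set standing in for the full stop word set the notebook creates, `clean_text` is a hypothetical helper name, and the stemming and lemmatization steps are left out so the example runs without any downloads.

```python
import re

# Assumption: a small stand-in for the notebook's full stop word set.
STOP_WORDS = {"a", "an", "the", "of", "on", "in", "to", "and", "is", "for"}

def clean_text(text):
    """Lowercase the text, keep letters only, and drop stop words."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", text)
    tokens = letters_only.lower().split()
    return " ".join(token for token in tokens if token not in STOP_WORDS)

print(clean_text("Stocks rally on the news of a rate cut!"))
# stocks rally news rate cut
```

The same function would be applied to both the Title and the Headline columns.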

#CATEGORICAL TO NUMERICAL CONVERSION OF THE COLUMN TOPIC

#SPLITTING THE DATE AND TIME COLUMNS IN ORDER TO OBTAIN THE HOUR AND DAY FROM THEM

#MAPPING THE DAYS OF THE WEEK TO NUMERIC VALUES
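The three feature steps above could be sketched as below, using only the standard library. The topic names, the timestamp format, the Monday = 0 numbering, and the helper names `encode_topics` and `split_timestamp` are all assumptions made for illustration; the README does not specify them.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
# Assumption: days are numbered from Monday = 0 to Sunday = 6.
DAY_TO_NUMBER = {name: number for number, name in enumerate(DAY_NAMES)}

def encode_topics(topics):
    """Map each distinct topic to an integer, in alphabetical order."""
    mapping = {topic: code for code, topic in enumerate(sorted(set(topics)))}
    return [mapping[topic] for topic in topics], mapping

def split_timestamp(stamp, fmt="%Y-%m-%d %H:%M:%S"):
    """Return (hour, numeric day of week) for one timestamp string."""
    parsed = datetime.strptime(stamp, fmt)
    day_name = DAY_NAMES[parsed.weekday()]
    return parsed.hour, DAY_TO_NUMBER[day_name]

codes, mapping = encode_topics(["economy", "obama", "economy", "microsoft"])
print(codes)                                    # [0, 2, 0, 1]
print(split_timestamp("2016-01-01 09:15:00"))   # (9, 4), a Friday
```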

#APPLYING VARIOUS ALGORITHMS TO CHECK THE EFFICIENCY

#DISPLAYING THE RESULTS OBTAINED FOR THE MAE METRIC BY CARRYING OUT VARIOUS ALGORITHMS

#FINDING THE MINIMUM VALUE OF THE MAE FROM ALL THE AVAILABLE VALUES FOR TITLE

#DISPLAYING THE RESULTS OBTAINED FOR THE MAE METRIC BY CARRYING OUT VARIOUS ALGORITHMS

#GETTING THE NEWS ID, THE PREDICTED TITLE SENTIMENT, AND THE PREDICTED HEADLINE SENTIMENT
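The model comparison steps (computing the MAE for each algorithm and picking the minimum) can be illustrated with the sketch below. The README does not name the algorithms used, so `model_a`, `model_b` and `model_c` are placeholders, and the sentiment values are invented purely to make the example runnable.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical true title sentiments and per-model predictions.
actual = [0.10, -0.20, 0.00, 0.30]
predictions = {
    "model_a": [0.00, -0.10, 0.10, 0.20],
    "model_b": [0.10, -0.25, 0.05, 0.25],
    "model_c": [0.30, 0.00, -0.20, 0.10],
}

# MAE for every model, then the model with the smallest error.
mae_by_model = {
    name: mean_absolute_error(actual, preds)
    for name, preds in predictions.items()
}
best_model = min(mae_by_model, key=mae_by_model.get)

for name, mae in mae_by_model.items():
    print(f"{name}: MAE = {mae:.4f}")
print("lowest MAE:", best_model)  # lowest MAE: model_b
```

The same comparison would be repeated for the headline sentiment, and the best model for each target would then produce the final predictions.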

Evaluation criteria:

Mean Absolute Error (MAE)

Points (0-100) (Combined = 0.4 * title_sentiment + 0.6 * headline_sentiment)

Overall: 90.96
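The weighting in the Points formula can be written out as a short sketch. The sub-scores passed in below are hypothetical values chosen for illustration; the README reports only the overall result, and it does not state how each MAE is converted into a 0-100 score, so that conversion is not shown.

```python
def combined_score(title_score, headline_score):
    """Weighted score on the 0-100 scale: 0.4 * title + 0.6 * headline."""
    for score in (title_score, headline_score):
        if not 0.0 <= score <= 100.0:
            raise ValueError("scores must lie between 0 and 100")
    return 0.4 * title_score + 0.6 * headline_score

# Hypothetical sub-scores, not the author's actual values.
print(round(combined_score(90.0, 92.0), 2))  # 91.2
```

Because the headline carries the larger weight, an improvement in headline sentiment MAE moves the combined score more than the same improvement for titles.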

Languages

Language: Jupyter Notebook 100.0%