ManinderpreetPuri / Text-analysis-SAS-Enterprise-Miner-and-Rstudio

I worked as a social marketing analyst in a consulting company to uncover the impacts of online advertising and communication with customers. The aim of the study is to educate the marketing teams of their clients (in diverse industries) to market their products and/or services on social media to maximise customers’ involvement (positive interest and sharing). The company is interested in finding out the relationship between the keywords, comments, sentiments and whether there is a relationship in different topic categories such as entertainment, technology, sports, etc. that are of interest to different clients in various industries.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Text-analysis-SAS-Enterprise-Miner-and-Rstudio

Sentiment-Analysis

I worked as a social marketing analyst in a consulting company to uncover the impacts of online advertising and communication with customers. The aim of the study is to educate the marketing teams of their clients (in diverse industries) to market their products and/or services on social media to maximise customers’ involvement (positive interest and sharing). The company is interested in finding out the relationship between the keywords, comments, sentiments and whether there is a relationship in different topic categories such as entertainment, technology, sports, etc. that are of interest to different clients in various industries.

The consulting company collected information on articles that were shared by people on social media. The dataset contains approximately 12273 articles and a large number (with the total of 17) of features were extracted from the HTML code of the article, including the headline and the abstract of each article. (The description of the dataset is provided as an appendix.) Some of the features depend on characteristics of the service used, which could be analysed based on the meta-data provided: articles have the meta-data, such as keywords, article domain type and the total number of comments, etc. The article domain categories are: ‘Lifestyle’, ‘Scitech’, ‘Entertainment’, ‘Sports’, and ‘World’. In addition, several natural language processing features were also extracted

As a data analytics team member for the consultancy firm, I carried out a number of data analytics tasks for the consulting company using the data collected. The company is interested in identifying for each article domain: • Investigated the impact of the article properties on number of comments; • Used the SAS Enterprise Minerfor text analysis to identify key featuresin the articles and analyse their contribution towards low and high number of comments.

To achieve the above, I carried out the following data analytics tasks:

Task 1: Explored the impact of article properties Explored the data and investigated what properties of the article correlate with the high number of comments of the article on social media. • Explored the dataset to understand and manage five channels from the five types of data channels (lifestyle, entertainment, scitech, sports and world) and the associating data. In each article domain column, the value of 1 represents that the data in the 3 row is of the corresponding article domain. • Copied the separate datasets for each article domain to different Excel sheets. Sort and filter by each data channel to separate in R Studio. • In each data channel, identified the articles with a high number of comments (with the threshold of top 20% in the dataset). • Investigated the following properties and explained how they could have affected the high number of comments. o Number of words in headline o Number of words in abstract o Number of words in content o Number of keywords in the meta data o Was the article published on the weekend

Task 2: Used SAS Enterprise Miner for keyword analysis • Used the SAS Enterprise Miner to extract the keywords from the abstract in each article domain. • What are the highly used (top 5) topics in each category? • Are there common topics which span across data channels and relate to a high number of comments and a low number of comments? Used the whole dataset in the SAS Enterprise Miner to identify the relationship. Provided explanations to support my argument.

About

I worked as a social marketing analyst in a consulting company to uncover the impacts of online advertising and communication with customers. The aim of the study is to educate the marketing teams of their clients (in diverse industries) to market their products and/or services on social media to maximise customers’ involvement (positive interest and sharing). The company is interested in finding out the relationship between the keywords, comments, sentiments and whether there is a relationship in different topic categories such as entertainment, technology, sports, etc. that are of interest to different clients in various industries.


Languages

Language:R 100.0%