nicholas-dinicola / Forecasting-stock-prices-based-on-semantic-analysis-of-business-news-and-social-media-posts

Forecasting stock prices based on semantic analysis of business news and social media posts

Dissertation - MSc Business Analytics & Big Data

It is well known that stock prices are highly sensitive to news of many kinds, from global events such as epidemic outbreaks or military conflicts to what people post about specific companies on social networks. Previous research has studied methods for predicting stock price movements from sentiment analysis of business news and social media posts about a particular company, alongside its historical stock price data. In contrast, this project will also implement the Universal Sentence Encoder, a state-of-the-art algorithm recently developed by the Google Research team, which encodes input text into high-dimensional sentence-level embedding vectors that will be used as predictors of future stock price movements. Accordingly, both sentiment analysis and the Universal Sentence Encoder will be applied, and their outputs will be used as exogenous variables in machine learning models. The models will then be compared with each other to find the best one, with the aim of achieving higher accuracy, better performance and a more robust model than either using only the historical stock price data as an endogenous variable or applying other natural language processing methods based on word embeddings.
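As a rough illustration of how text-derived vectors could sit next to price history as exogenous inputs, the sketch below averages the vectors for each day and appends them to lagged closing prices. It is not taken from the project's notebooks. The function names, the toy two-dimensional vectors (stand-ins for Universal Sentence Encoder embeddings or sentiment scores), the use of the previous day's text to avoid look-ahead, and the zero vector for days without text are all assumptions made for this example.

```python
from statistics import fmean


def daily_text_features(vectors_by_day):
    """Average all text vectors of a day into one feature vector (hypothetical helper)."""
    features = {}
    for day, vectors in vectors_by_day.items():
        # zip(*vectors) groups the values of each dimension across that day's posts
        features[day] = [fmean(dim) for dim in zip(*vectors)]
    return features


def build_design_matrix(closes, text_features, n_lags=2):
    """Combine lagged closes (endogenous) with the previous day's text vector (exogenous).

    closes: list of (day, close) tuples sorted by day.
    text_features: dict mapping day -> feature vector, all of equal length.
    Returns (rows, targets), where each row is lags followed by text features.
    """
    dim = len(next(iter(text_features.values())))
    rows, targets = [], []
    for i in range(n_lags, len(closes)):
        prev_day = closes[i - 1][0]
        lags = [closes[i - k][1] for k in range(1, n_lags + 1)]
        # Assumption: days with no news or posts get a zero vector
        exog = text_features.get(prev_day, [0.0] * dim)
        rows.append(lags + exog)
        targets.append(closes[i][1])
    return rows, targets


# Toy data; real vectors would come from an embedding or sentiment model
closes = [
    ("2020-01-01", 100.0),
    ("2020-01-02", 101.0),
    ("2020-01-03", 99.0),
    ("2020-01-06", 102.0),
]
vectors_by_day = {
    "2020-01-02": [[0.5, -0.5], [0.25, 0.25]],
    "2020-01-03": [[1.0, 0.0]],
}

X, y = build_design_matrix(closes, daily_text_features(vectors_by_day), n_lags=2)
```

The resulting rows and targets can be passed to any regression model. Dropping the text columns from each row gives the price-only baseline that the comparison described above would be measured against.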

About

License: MIT License


Languages

Language: Jupyter Notebook 100.0%