Springboard Capstone Project 2
In the financial services and banking industry, vast resources are dedicated to poring over, analyzing, and attempting to quantify qualitative data from news and company reports. The problem is compounded as the news cycle shortens and reporting requirements for public companies become more onerous. In this project I attempt to demonstrate the viability of combining word embeddings, a natural language processing technique, with deep learning methods to predict stock price volatility from SEC 8-K documents after a company experiences a major event.
This project could be useful to hedge funds, banks, corporate finance offices, and anyone else involved in trading securities on public markets.
Data Collection & Preprocessing
The notebook demonstrates the workflow, while the scripts were run on Google Cloud to scrape the SEC EDGAR database and download financial data.
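To make the link between filings and financial data concrete, here is a minimal sketch of how a price move around a filing date might be turned into a class label. Everything in it is an assumption for illustration, not the project's actual labeling scheme: the function name `label_price_move`, the 1% threshold, the three labels, and the choice of comparing the last close before the event with the first close after it.

```python
from datetime import date


def label_price_move(prices, event_day, threshold=0.01):
    """Label the price move around event_day as "up", "down", or "flat".

    prices: list of (date, close) tuples sorted by date, ascending.
    threshold: hypothetical cutoff for a meaningful move (0.01 = 1%).
    Returns None when there is no close on both sides of the event.
    """
    before = [close for day, close in prices if day < event_day]
    after = [close for day, close in prices if day > event_day]
    if not before or not after:
        return None

    # Relative change from the last close before the event
    # to the first close after it.
    change = (after[0] - before[-1]) / before[-1]
    if change > threshold:
        return "up"
    if change < -threshold:
        return "down"
    return "flat"


# Hypothetical closing prices around a filing made on 2020-01-03.
sample_prices = [
    (date(2020, 1, 2), 100.0),
    (date(2020, 1, 3), 101.0),
    (date(2020, 1, 6), 105.0),
]
sample_label = label_price_move(sample_prices, date(2020, 1, 3))
```

A real pipeline would probably also adjust for overall market movement over the same window, which this sketch leaves out.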
Text Preprocessing
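As a rough illustration of what text preprocessing for an embedding-based model typically involves, the sketch below strips markup from a filing, tokenizes it, and converts it to a fixed-length sequence of integer ids. The helper names (`tokenize`, `build_vocab`, `encode`), the vocabulary size, and the padding scheme are assumptions made for this example, not necessarily the steps used in the project.

```python
import re
from collections import Counter


def tokenize(text):
    """Drop HTML tags, lowercase, and keep alphabetic tokens only."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.findall(r"[a-z]+", text.lower())


def build_vocab(docs, max_words=10000):
    """Map the most frequent words to integer ids.

    Ids 0 and 1 are reserved for padding and unknown words.
    """
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab, max_len=8):
    """Convert text to a list of ids, truncated or zero-padded to max_len."""
    ids = [vocab.get(tok, 1) for tok in tokenize(text)][:max_len]
    return ids + [0] * (max_len - len(ids))


# Hypothetical snippet standing in for the text of an 8-K filing.
sample_tokens = tokenize("<p>Item 5.02 Departure of Directors</p>")
sample_vocab = build_vocab(["the company the board"])
sample_ids = encode("The merger", sample_vocab, max_len=4)
```

Fixed-length id sequences like `sample_ids` are the usual input to an embedding layer, which is why padding and an unknown-word id are included here.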
Machine Learning (MLP, CNN, RNN, CNN-RNN models)
The top-performing model achieved 64% accuracy on the test data. This suggests that word embeddings of SEC filings could be a useful signal for anticipating stock price movements after major company events.
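For readers unfamiliar with word embeddings, the sketch below shows the basic idea of turning a tokenized filing into numeric model input: each token is looked up in an embedding table, giving a sequence of vectors that a CNN or RNN could consume, and the sequence can be averaged into a single vector of the kind an MLP could consume. The toy two-dimensional embeddings and the function names are hypothetical; the project's actual embeddings and model inputs may differ.

```python
def embed_tokens(tokens, embeddings, dim):
    """Look up each token's vector; unknown tokens map to a zero vector."""
    zero = [0.0] * dim
    return [embeddings.get(tok, zero) for tok in tokens]


def mean_pool(vectors, dim):
    """Average a sequence of vectors into one fixed-size document vector."""
    if not vectors:
        return [0.0] * dim
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


# Toy embedding table (hypothetical values, two dimensions for readability).
toy_embeddings = {"merger": [1.0, 0.0], "loss": [0.0, 1.0]}

sequence = embed_tokens(["merger", "loss"], toy_embeddings, dim=2)
doc_vector = mean_pool(sequence, dim=2)
```

Mean pooling discards word order, which is one reason sequence models such as CNNs and RNNs are often tried alongside a plain MLP.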
The full writeup is available as a PDF file, and a summary blog post is available on Medium.