A solution to the Optiver Realized Volatility Kaggle competition.
In this competition, 10 minutes of order book data are given and the task is to predict the realized volatility over the following 10 minutes. The realized volatility, $\sigma$, is the square root of the sum of squared log returns $r_{t-1,t}$ over all consecutive book updates:

$$\sigma = \sqrt{\sum_{t} r_{t-1,t}^{2}}$$
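For order book data, the weighted average price (WAP) is used as the price of the stock. The WAP takes the top-level price and volume information into account, weighting each side's best price by the size on the opposite side:

$$\mathrm{WAP} = \frac{\mathrm{BidPrice}_1 \cdot \mathrm{AskSize}_1 + \mathrm{AskPrice}_1 \cdot \mathrm{BidSize}_1}{\mathrm{BidSize}_1 + \mathrm{AskSize}_1}$$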
Returns are widely used in finance; however, log returns are preferred whenever mathematical modeling is required. Calling $S_t$ the WAP of the stock at time $t$, we can define the log return between $t_1$ and $t_2$ as:

$$r_{t_1,t_2} = \log\left(\frac{S_{t_2}}{S_{t_1}}\right)$$
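The three quantities above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used in this solution: the function names and the sample book rows are hypothetical, and the WAP is assumed to be the standard top-of-book form in which the best bid price is weighted by the best ask size and vice versa.

```python
import math


def wap(bid_price, ask_price, bid_size, ask_size):
    """Top-of-book weighted average price (assumed cross-weighted form)."""
    return (bid_price * ask_size + ask_price * bid_size) / (bid_size + ask_size)


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(p2 / p1) for p1, p2 in zip(prices, prices[1:])]


def realized_volatility(prices):
    """Square root of the sum of squared log returns."""
    return math.sqrt(sum(r * r for r in log_returns(prices)))


# Hypothetical book updates: (bid_price1, ask_price1, bid_size1, ask_size1)
book = [
    (1.000, 1.002, 100, 200),
    (1.001, 1.003, 150, 150),
    (0.999, 1.001, 300, 100),
]

waps = [wap(*row) for row in book]
vol = realized_volatility(waps)
print(waps, vol)
```

Three book updates give two log returns, so the realized volatility here is built from two squared terms; a constant WAP series would give a volatility of exactly zero.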
The root mean squared percentage error (RMSPE) on unseen data is 0.22046, which is better than 33% of the 3965 competitors.
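For reference, RMSPE can be computed as in the sketch below. The function name and the sample values are hypothetical, and the definition assumed here is the usual one: the square root of the mean squared relative error, with the true value in the denominator.

```python
import math


def rmspe(y_true, y_pred):
    """Root mean squared percentage error; y_true values must be non-zero."""
    squared_rel_errors = [((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_rel_errors) / len(squared_rel_errors))


# Hypothetical target and predicted volatilities
targets = [0.004, 0.002]
predictions = [0.005, 0.002]
score = rmspe(targets, predictions)
print(score)
```

Because each error is divided by the true value, the metric penalizes a given absolute miss more heavily on low-volatility targets than on high-volatility ones.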