In the RL algorithms, why is the training-set label at t+2 while the validation-set label is at t+1?

Question

In the RL algorithms, why is the training-set label at t+2 while the validation-set label is at t+1?

xtyangjie opened this issue 6 years ago · comments

As asked in the title: in base/env/market.py, when sequences are generated from the original dataset, the training-set label is taken at time t+2 (date_index + 1) relative to x, which spans date_index - seq_len to date_index as a half-open interval, so the last row of x is at date_index - 1. For the validation set, the label is taken at t+1 (date_index) instead.

Note that there is a small bug in master in the instruments_y assignment statement; I fixed it here according to the code in the dev branch.

```python
if date_index < self.bound_index:
    # Get y, y is not at date index, but plus 1. (Training Set)
    instruments_y = scaled_frame.iloc[date_index + 1]['close']
else:
    # Get y, y is at date index. (Test Set)
    instruments_y = scaled_frame.iloc[date_index]['close']  # changed from date_index + 1 to date_index, following the dev branch
```
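To make the offsets concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function name `window_and_label`, the toy `closes` list, and the slicing of x are assumptions based only on the description above (x covering the half-open interval from date_index - seq_len to date_index).

```python
def window_and_label(closes, date_index, seq_len, bound_index):
    """Return (x, y, steps_ahead) for one sample (hypothetical helper)."""
    # Assumption: x covers [date_index - seq_len, date_index), so its last
    # row sits at position date_index - 1 (call that time t).
    x = closes[date_index - seq_len:date_index]
    last_x_index = date_index - 1

    if date_index < bound_index:
        y_index = date_index + 1  # training branch
    else:
        y_index = date_index      # test branch

    # steps_ahead is how far the label lies beyond the last row of x.
    return x, closes[y_index], y_index - last_x_index


closes = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
train = window_and_label(closes, date_index=3, seq_len=3, bound_index=5)
test = window_and_label(closes, date_index=6, seq_len=3, bound_index=5)
print(train)  # ([10.0, 11.0, 12.0], 14.0, 2)
print(test)   # ([13.0, 14.0, 15.0], 16.0, 1)
```

Under these assumptions the training label is two steps past the last row of x (t+2), while the test label is one step past it (t+1), which is the mismatch this question is about.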

Thanks for your answer :)

yang jie · Answer 1 · Fri Jun 29 2018 12:23:19 GMT+0800 (China Standard Time)

In addition, I sent an email to @Ceruleanacg earlier, but it may be better to post the question here to keep a record. It may also help the owners collect information about my question.

The other question from that email is posted separately in the next issue.

Shuyu · Answer 2 · Tue Jul 03 2018 08:35:09 GMT+0800 (China Standard Time)

We discussed this on WeChat.