st-tech / zr-obp

Open Bandit Pipeline: a python library for bandit algorithms and off-policy evaluation

context vector dimensions are different for data sampled with random policy and those sampled with Bernoulli TS

haanvid opened this issue · comments

Context vector dimensions are different between the data sampled with the uniform random policy and the data sampled with Bernoulli TS:
the context vectors for the data sampled with the Bernoulli TS policy have dimension 22,
whereas the context vectors for the data sampled with the random policy have dimension 20.

I was not able to find any description of what the 2 dimensions missing from the random-policy context vectors represent.
What should I do to match their dimensions?

###############################################################################
Code that I ran:
###############################################################################
from obp.dataset import OpenBanditDataset

dataset_random = OpenBanditDataset(behavior_policy="random", campaign="all")
dataset_bts = OpenBanditDataset(behavior_policy="bts",    campaign="all")
bandit_feedback_random = dataset_random.obtain_batch_bandit_feedback()
bandit_feedback_bts = dataset_bts.obtain_batch_bandit_feedback()

bandit_feedback_random['context'].shape
>> (10000, 20)
bandit_feedback_bts['context'].shape
>> (10000, 22)
###############################################################################

If you don't like the default feature preprocessing for some reason, you can implement your own preprocessing by overriding the "pre_process" function of obp.dataset.OpenBanditDataset.
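As a sketch of what such an override could do, the two datasets can be one-hot encoded over the union of categories seen in both, so their context dimensions match. The helper below is hypothetical (it is not part of OBP's API), and the column names `user_features0`–`user_features3` follow the CSV fields mentioned later in this thread:

```python
import pandas as pd

def align_contexts(df_random: pd.DataFrame, df_bts: pd.DataFrame):
    """Hypothetical helper: one-hot encode the user features over the union
    of categories seen in BOTH datasets, so the context dimensions match."""
    feature_cols = [f"user_features{i}" for i in range(4)]
    # Build the dummy columns on the concatenation of both datasets ...
    combined = pd.concat(
        [df_random[feature_cols], df_bts[feature_cols]], keys=["rand", "bts"]
    )
    dummies = pd.get_dummies(combined, drop_first=True)
    # ... then split back, so both contexts share the same columns.
    return dummies.loc["rand"].to_numpy(), dummies.loc["bts"].to_numpy()

# Toy example: category "c" only appears in the BTS data.
df_r = pd.DataFrame({f"user_features{i}": ["a", "b", "a"] for i in range(4)})
df_b = pd.DataFrame({f"user_features{i}": ["a", "b", "c"] for i in range(4)})
ctx_r, ctx_b = align_contexts(df_r, df_b)
assert ctx_r.shape[1] == ctx_b.shape[1]  # dimensions now match
```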

You can still run an OPE experiment even if their context dimensions are different (this may not hold for an OPL experiment, though).

Actually, what I wanted to ask were 2 questions:

  1. Does the nth dimension (for an arbitrary n) of the context of data sampled with the random policy and of data sampled with the Bernoulli TS policy describe the same user feature?
  2. Are some of the user features only observed in the data sampled with Bernoulli TS (BTS), given that its context has 2 more dimensions?

I looked into the code where the data preprocessing is done.
The CSV files for both the random and BTS policies have 4 "user_features" (user_features0, user_features1, user_features2, user_features3) for each user,
and the context dimension is computed as:
(# of context dimensions) = [# of distinct categories of user_features0 in the loaded samples] + [# of categories of user_features1] + [# of categories of user_features2] + [# of categories of user_features3] - 4
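The "- 4" in that formula is consistent with one-hot encoding where the first category of each feature is dropped. A self-contained sketch of the arithmetic (toy data standing in for the actual OBP preprocessing, which I have not verified line by line):

```python
import pandas as pd

# Toy user features with 3, 2, 4, and 2 categories respectively.
df = pd.DataFrame({
    "user_features0": ["a", "b", "c", "a"],
    "user_features1": ["x", "y", "x", "y"],
    "user_features2": ["p", "q", "r", "s"],
    "user_features3": ["u", "v", "u", "v"],
})

# drop_first=True removes one dummy column per feature,
# giving (3 + 2 + 4 + 2) - 4 = 7 context dimensions.
context = pd.get_dummies(df, drop_first=True)
print(context.shape)  # (4, 7)
```

So the context dimension depends on which categories happen to appear in the loaded samples, which is why it can differ between datasets.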

At this point I thought that the number of context dimensions for the two policies could be the same if I loaded all the data instead of 10,000 samples per policy: if there are enough users, the number of distinct categories for each feature should be the same. Otherwise, some user-feature categories are unseen in the dataset sampled with the other policy.

So I tried loading all the available data and found that the number of context dimensions still does not match:

from obp.dataset import OpenBanditDataset

dataset_bts = OpenBanditDataset(behavior_policy="bts", campaign="all", data_path='./open_bandit_dataset')
bandit_feedback_bts = dataset_bts.obtain_batch_bandit_feedback()
bandit_feedback_bts['context'].shape
>> (12357200, 27)

dataset_random = OpenBanditDataset(behavior_policy="random", campaign="all", data_path='./open_bandit_dataset')
bandit_feedback_random = dataset_random.obtain_batch_bandit_feedback()
bandit_feedback_random['context'].shape
>> (1374327, 26)

So I'm guessing that the answers to the 2 questions above are:
no in general for the first question,
and yes for the 2nd.

Is that correct?

Thank you for clarifying, and you are correct. The random dataset is about 10x smaller than the BTS data and may not observe some feature categories. You may want to check this by using, for example, the np.unique function.
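For example (synthetic arrays standing in for one raw user-feature column from each dataset):

```python
import numpy as np

# Stand-ins for one raw user-feature column from each dataset.
feat_random = np.array(["a", "b", "a", "b"])
feat_bts = np.array(["a", "b", "c", "a", "b", "c"])

cats_random = np.unique(feat_random)
cats_bts = np.unique(feat_bts)

# Categories present in the BTS data but unseen under the random policy
# account for the extra context dimensions.
unseen = np.setdiff1d(cats_bts, cats_random)
print(unseen)  # ['c']
```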

Oh, I see.
Thanks for the reply!