Summary for BwK and Pure Exploration
johnantonn opened this issue
Pure exploration:
- https://arxiv.org/pdf/0802.2655.pdf
- https://arxiv.org/pdf/1803.04665.pdf
- https://arxiv.org/pdf/1602.04589.pdf
- http://proceedings.mlr.press/v28/karnin13.pdf
- https://arxiv.org/pdf/1605.08671.pdf
- https://arxiv.org/pdf/1502.07943.pdf
- https://arxiv.org/pdf/1407.4443.pdf
- https://arxiv.org/pdf/1605.09004.pdf
BwK:
- https://dl.acm.org/doi/pdf/10.1145/3164539
- https://conferences.computer.org/focs/2019/pdfs/FOCS2019-7pBwCpNH4Mz2L4MJWVl6Xp/449NV6Undj5bRx1xZqVv04/5eGtcNbsnkv7O89B5SOnxw.pdf
- https://arxiv.org/pdf/1811.11881.pdf
- https://arxiv.org/pdf/2006.10459.pdf
- http://proceedings.mlr.press/v84/grover18b/grover18b.pdf
- http://grail.cs.washington.edu/projects/bandit/banditpaper.pdf
- https://arxiv.org/pdf/2002.00253.pdf
- https://arxiv.org/pdf/2102.06385.pdf
- Bandits with one knapsack
Videos:
People/Groups:
Many publications make the odd-looking assumption that the reward distributions are supported on [0, 1]. It isn't clear to me whether this is a minimal (normalizing) assumption or a hard requirement, i.e. whether the derived results carry over to arbitrary reward distributions.
https://arxiv.org/pdf/1805.05071.pdf
The support of a real-valued function f is defined as the set of all x such that f(x) ≠ 0. So what does this mean for the multi-armed bandit setting, and for our problem instance where the reward of an arm is its AUC score? I think this is very important for the assumptions and results on which we're basing our solution.