python-adaptive / adaptive

:chart_with_upwards_trend: Adaptive: parallel active learning of mathematical functions

Home Page: http://adaptive.readthedocs.io/

Learner1D reports finite loss before bounds are done

basnijholt opened this issue · comments

import adaptive

# Three points clustered near the right edge of the domain; the left bound
# x = 0 has not been evaluated yet.
data = {
    0.19130434782608696: 2428.6145000000006,
    0.1826086956521739: 2410.7965000000004,
    0.2: 2449.1395,
}

learner = adaptive.Learner1D(
    None,  # no function needed, we feed the data manually
    bounds=(0, 0.2),
    loss_per_interval=adaptive.learner.learner1D.triangle_loss,
)
for x, y in data.items():
    learner.tell(x, y)

learner.loss()

prints 0.0015347736506519973.

A typical runner goal, runner = adaptive.Runner(learner, goal=lambda l: l.loss() < 0.01), would therefore finish after only two points.

I think we should report an infinite loss until the boundary points are included.

@akhmerov, @jbweston, what do you think?
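
Concretely, something along these lines (just a sketch; the helper is hypothetical, and the check could equally live inside Learner1D.loss itself):

import numpy as np

def loss_or_inf(learner):
    # Report an infinite loss while either end of learner.bounds is still
    # missing from learner.data, so a Runner goal cannot fire early.
    x_left, x_right = learner.bounds
    if x_left not in learner.data or x_right not in learner.data:
        return np.inf
    return learner.loss()

# e.g. adaptive.Runner(learner, goal=lambda l: loss_or_inf(l) < 0.01)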

I definitely agree that the loss should be larger for the case you showed.

Would you agree that the reason why a loss of 1e-3 seems "wrong" is that the points are not well-distributed on the interval [0, 0.2]? If so, perhaps we could include an extra factor into the loss to penalize this.

As you point out, reporting an infinite loss until the boundary points are included could be the pragmatic choice, at least for the case where people are just using a Runner.

My only concern would be the case where someone starts from a bunch of data that does not necessarily include the boundaries.
It would be a little strange if your eyes are telling you that you already have a pretty good sampling, but adaptive is reporting a loss of infinity.

"someone starts from a bunch of data that does not necessarily include the boundaries"

IIRC in this case the learner will typically return the boundaries the next time you ask, so maybe this is not too egregious.
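
For example, continuing the snippet at the top of the issue (again assuming Learner1D keeps preferring missing bounds, which I have not double-checked):

xs, _ = learner.ask(1)
print(xs)  # expected to contain the still-unevaluated left bound, x = 0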

I agree with Joe's concern about the counter-intuitive loss behavior. We could instead add to the loss something like interval_size / (x_max - x_min) - 1.
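
Something like the following sketch, here done at the goal level rather than inside the learner (the helper names are made up, and this is only one possible reading of the interval_size term):

import numpy as np

def coverage_penalty(learner):
    # Illustrative only: penalize the largest stretch of the domain with no
    # evaluated points, expressed as a fraction of (x_max - x_min).
    x_min, x_max = learner.bounds
    xs = np.sort(np.array(list(learner.data), dtype=float))
    gaps = np.diff(np.concatenate(([x_min], xs, [x_max])))
    return gaps.max() / (x_max - x_min)

def penalized_loss(learner):
    return learner.loss() + coverage_penalty(learner)

# e.g. adaptive.Runner(learner, goal=lambda l: penalized_loss(l) < 0.01)

With the data from this issue the penalty is roughly 0.91 (the whole stretch from 0 to ~0.18 is unsampled), so the runner would keep going instead of stopping at a loss of ~0.0015.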