Yelp / MOE

A global, black box optimization engine for real world metric optimization.

[Bug] python/python_version/optimization.py

yf275 opened this issue

A bug exists in python/python_version/optimization.py at lines 655 and 656:

shaped_point = point.reshape(self._num_points, self.domain.dim)
self.objective_function.current_point = shaped_point

where point is a numpy.ndarray of shape (self.domain.dim, ) and self._num_points is defined to be 1. Line 656 triggers line 233 in python/python_version/log_likelihood.py:

current_point = hyperparameters

which in turn invokes set_hyperparameters() at line 223 of the same file, and then set_hyperparameters() at line 63 of python/python_version/covariance.py. That call sets self._hyperparameters to shaped_point, so its shape becomes (1, self.domain.dim). Note that self.domain.dim is equal to (1 + # of columns of input data). The function set_hyperparameters() in covariance.py also defines:

self._lengths_sq = numpy.copy(self._hyperparameters[1:])

Because self._hyperparameters has only one row, the slice [1:] is empty: self._lengths_sq is equal to [], and a ValueError is raised when line 99 of the same file is executed:

temp /= self._lengths_sq
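The failure described above can be reproduced in isolation with NumPy alone. This is a minimal sketch, not MOE's code: dim, flat, shaped, and temp are hypothetical stand-ins, and the sizes (3 hyperparameters, 2 spatial dimensions) are assumed for illustration.

```python
import numpy as np

dim = 3  # assumed: 1 leading hyperparameter + 2 length-scale entries

flat = np.array([1.0, 0.5, 2.0])   # shape (dim,)
shaped = flat.reshape(1, dim)      # shape (1, dim), as produced by the reshape

# Slicing [1:] acts on axis 0, which has a single row, so nothing is left.
lengths_sq = np.copy(shaped[1:])
empty_shape = lengths_sq.shape     # (0, dim)

# Stand-in for the per-dimension vector that gets divided in place.
temp = np.array([0.3, 0.7])
try:
    temp /= lengths_sq
    raised = False
except ValueError:
    # The shapes (2,) and (0, 3) cannot be broadcast together.
    raised = True

print(empty_shape, raised)
```

The slice is empty because it runs over the leading axis of length 1 rather than over the hyperparameter entries.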

I propose that we delete line 655 of python/python_version/optimization.py and change line 656 of the same file to:

self.objective_function.current_point = numpy.copy(point)

This will keep the shape of self._hyperparameters as (1 + # of columns of input data, ) and fix the bug.
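For comparison, here is a small sketch of the same slice when the hyperparameter vector stays one-dimensional. The values and the name temp are made up for illustration, and this only shows the array shapes involved, not the behavior of MOE itself.

```python
import numpy as np

point = np.array([1.0, 0.5, 2.0])   # shape (3,): assumed 1 + 2 input columns

hyperparameters = np.copy(point)    # stays 1-D
lengths_sq = np.copy(hyperparameters[1:])   # shape (2,): the trailing entries

# Stand-in for the per-dimension vector divided in place.
temp = np.array([0.3, 0.7])
temp /= lengths_sq                  # shapes (2,) and (2,) match

print(lengths_sq.shape, temp)
```

With a 1-D vector, [1:] drops only the first entry, so the divisor has one value per input column.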

I do not think self.domain.dim in optimization.py is the dimension of the problem when you are using it to optimize hyperparameters; in that case it is the dimension of the hyperparameter space. Note that the optimization interface is independent of everything else, so domain is not necessarily the domain of the global optimization objective. It could be the domain of the hyperparameter space or whatever else is being optimized.

This is not a bug.

Sorry I've been gone for so long! See: #448 for details.

I agree with @jialeiwang here. @yf275, do you have a stacktrace from a failure? As jialei pointed out, the terms "domain" and "point" are overloaded. In the expected improvement setting, they're the physical space you're optimizing in. In the hyperparameter/log likelihood setting, they're the hyperparameter space (e.g., 1 + spatial_dim).

iirc, numpy.copy(point) on the line you referenced doesn't work b/c sometimes we have to flatten the point to match some of scipy's optimizers' expected inputs. I believe COBYLA has this issue.
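To make the flattening point concrete, here is a rough sketch of converting between a flat vector (the form an optimizer may hand to the objective) and a (num_points, dim) array. The helpers to_shaped and to_flat are hypothetical names for illustration and are not part of MOE or scipy.

```python
import numpy as np


def to_shaped(flat_point, num_points, dim):
    """Hypothetical helper: view a flat vector as a (num_points, dim) array."""
    return np.asarray(flat_point).reshape(num_points, dim)


def to_flat(shaped_point):
    """Hypothetical helper: collapse a (num_points, dim) array to 1-D."""
    return np.asarray(shaped_point).reshape(-1)


flat = np.arange(6.0)                         # 2 points in 3 dimensions, flattened
shaped = to_shaped(flat, num_points=2, dim=3)
round_trip = to_flat(shaped)

# With num_points = 1 the result keeps a leading axis of length 1,
# which is the (1, dim) shape discussed in this thread.
single = to_shaped(np.array([1.0, 0.5, 2.0]), num_points=1, dim=3)

print(shaped.shape, single.shape)
```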