facebook / Ax

Adaptive Experimentation Platform

Home Page: https://ax.dev


BUG: MBM sums up MOO outputs when given a single objective acquisition function

saitcakmak opened this issue · comments

If the generation strategy uses MBM with a single objective acquisition function on an MOO problem, the outputs are simply summed together in the acquisition function using a ScalarizedPosteriorTransform.
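
Concretely, a ScalarizedPosteriorTransform with weights [1., 1.] collapses the two posterior means into their sum, so the acquisition function ends up optimizing a + b rather than treating the problem as multi-objective. A pure-Python sketch of the scalarization it applies (the real transform operates on the model posterior, not raw values):

```python
def scalarize(outputs, weights):
    """Collapse multiple objective values into one via a weighted sum,
    mimicking what ScalarizedPosteriorTransform does to posterior means."""
    return sum(w * y for w, y in zip(weights, outputs))

# With the implicit weights Ax derives for two maximized objectives:
print(scalarize([0.25, 0.5], weights=[1.0, 1.0]))  # 0.75 -- the metrics are simply summed
```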

Discovered while investigating #2514

Repro:

Notebook for Meta employees: N5489742

Set up the problem using AxClient

import random

from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient, ObjectiveProperties
from botorch.acquisition.monte_carlo import qNoisyExpectedImprovement


generation_strategy = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=2,
            min_trials_observed=1,
        ),
        GenerationStep(
            model=Models.BOTORCH_MODULAR,
            num_trials=-1,
            model_kwargs={
                "botorch_acqf_class": qNoisyExpectedImprovement,
            },
        ),
    ]
)

ax_client = AxClient(generation_strategy=generation_strategy)

ax_client.create_experiment(
    name="test_experiment",
    parameters=[
        {
            "name": "x1",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x2",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    objectives={
        "a": ObjectiveProperties(
            minimize=False,
        ),
        "b": ObjectiveProperties(
            minimize=False,
        ),
    },
)

def evaluate(parameters):
    return {"a": (random.random(), 0.0), "b": (random.random(), 0.0)}


for i in range(5):
    parameterization, trial_index = ax_client.get_next_trial()
    ax_client.complete_trial(
        trial_index=trial_index, raw_data=evaluate(parameterization)
    )

This runs fine and generates candidates.

Investigate arguments to acquisition function

from unittest import mock

with mock.patch.object(
    qNoisyExpectedImprovement, "__init__", side_effect=Exception
) as mock_acqf:
    try:
        parameterization, trial_index = ax_client.get_next_trial()
    except Exception:
        pass  # expected: the patched __init__ raises, but its call args are recorded

The patched __init__ raises an exception during candidate generation; ignore it and check the kwargs recorded on the mock.

mock_acqf.call_args.kwargs["posterior_transform"]

This is a ScalarizedPosteriorTransform with weights tensor([1., 1.], dtype=torch.float64).

We can check the optimization config to verify that this is not an experiment setup issue.

ax_client.experiment.optimization_config
# MultiObjectiveOptimizationConfig(objective=MultiObjective(objectives=[Objective(metric_name="a", minimize=False), Objective(metric_name="b", minimize=False)]), outcome_constraints=[], objective_thresholds=[])

Expected behavior

We can't do MOO with a single-objective acquisition function. We should not be silently scalarizing the outputs; this should raise an informative error instead.
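
One possible shape for the fix is a validation step before the transform is constructed. This is a hypothetical sketch, not the actual Ax code path; UnsupportedError, the _is_mo marker, and the helper name are all illustrative:

```python
class UnsupportedError(Exception):
    """Raised when a configuration cannot be handled correctly."""

def check_acqf_supports_moo(acqf_class, num_objectives):
    """Reject single-objective acquisition functions on MOO problems
    instead of silently scalarizing (hypothetical helper)."""
    # Assumption: multi-objective acqfs advertise themselves via a marker
    # attribute; real code would check the class hierarchy instead.
    is_moo_acqf = getattr(acqf_class, "_is_mo", False)
    if num_objectives > 1 and not is_moo_acqf:
        raise UnsupportedError(
            f"{acqf_class.__name__} is a single-objective acquisition function, "
            f"but the optimization config has {num_objectives} objectives. "
            "Use a multi-objective acquisition function, or specify a "
            "ScalarizedObjective with explicit weights."
        )

class qNoisyExpectedImprovement:  # stand-in for the BoTorch class
    pass

try:
    check_acqf_supports_moo(qNoisyExpectedImprovement, num_objectives=2)
except UnsupportedError as e:
    print(e)
```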

Actually, to my (very limited) knowledge, isn't this how MOBO is supposed to work? If you look at the BoTorch documentation for MOBO, especially where the model is initialized, you find:

def initialize_model(train_x, train_obj):
    # define models for objective and constraint
    train_x = normalize(train_x, problem.bounds)
    models = []
    for i in range(train_obj.shape[-1]):
        train_y = train_obj[..., i : i + 1]
        train_yvar = torch.full_like(train_y, NOISE_SE[i] ** 2)
        models.append(
            FixedNoiseGP(
                train_x, train_y, train_yvar, outcome_transform=Standardize(m=1)
            )
        )
    model = ModelListGP(*models)
    mll = SumMarginalLogLikelihood(model.likelihood, model)
    return mll, model

(In our case we are discussing SingleTaskGPs, but this does not change the nature of the problem, I think.) If this tracks, I would think that a ModelList containing two SingleTaskGPs would be the way to do MOBO as well, no?

Edit: This is the case for qNParEGO at least, which uses Chebyshev scalarization, as far as I can see.

So, the part about constructing a multi-output surrogate model is correct. That should indeed happen. The issue is scalarizing the outputs from the model, using an arbitrary sum. We do support ScalarizedObjective, which is the intentional way of doing this, where the user can specify the scalarization weights.
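
The distinction is just who picks the weights. A pure-Python sketch of the semantics (ScalarizedObjective is the Ax construct that lets the user specify metrics and weights explicitly; the helper below is only an illustration of the math):

```python
def weighted_objective(values, weights):
    """Weighted-sum scalarization. With ScalarizedObjective the user
    chooses the weights; the bug instead applies implicit equal weights."""
    return sum(w * v for w, v in zip(weights, values))

metric_values = [2.0, 8.0]
implicit = weighted_objective(metric_values, [1.0, 1.0])    # what the bug does: 10.0
intended = weighted_objective(metric_values, [0.75, 0.25])  # user-chosen trade-off: 3.5
print(implicit, intended)
```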

qNParEGO is also different here, since it is a multi-objective acquisition function that internally uses a scalarization. It is defined to work with multiple objectives and the behavior is by design.
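
For contrast, the augmented Chebyshev scalarization used by ParEGO-style methods takes the worst weighted objective plus a small weighted-sum term, with randomly sampled weights, which is why scalarizing there is by design rather than an accident. A simplified sketch under a maximization convention, assuming already-normalized objective values:

```python
def augmented_chebyshev(values, weights, alpha=0.05):
    """Augmented Chebyshev scalarization (maximization convention):
    the worst weighted objective plus a small weighted-sum tiebreaker."""
    weighted = [w * y for w, y in zip(weights, values)]
    return min(weighted) + alpha * sum(weighted)

# Unlike a plain sum, this is dominated by the worst objective:
print(augmented_chebyshev([0.5, 1.0], weights=[0.5, 0.5]))
```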

The issue is doing this silently using arbitrary weights (well, they're just 1 for maximization and -1 for minimization) with acquisition functions that are not designed for multi-objective optimization.

The same issue happens with the legacy models as well. It is a problem with the way we extract the objective weights from the optimization config in TorchModelBridge._get_transformed_model_gen_args, which just converts the optimization config into a flat array of weights. The model has no idea whether those weights represent a scalarized objective or multiple separate objectives.
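
A simplified, hypothetical sketch of that extraction step (the real logic lives in TorchModelBridge._get_transformed_model_gen_args and its helpers; the function below is only illustrative):

```python
def extract_objective_weights(objectives, outcomes):
    """Flatten an optimization config into a weight array:
    +1 for maximized metrics, -1 for minimized, 0 for other outcomes.
    After this point, downstream code cannot tell a user-specified
    scalarization apart from a genuine multi-objective config."""
    weights = [0.0] * len(outcomes)
    for metric_name, minimize in objectives:
        weights[outcomes.index(metric_name)] = -1.0 if minimize else 1.0
    return weights

# The repro's config ("a" and "b", both maximized) becomes:
print(extract_objective_weights([("a", False), ("b", False)], ["a", "b"]))  # [1.0, 1.0]
```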

For legacy single-objective models, these get passed through _get_acquisition_function -> get_objective_weights_transform, which treats them as scalarization weights and uses them to construct a GenericMCObjective (which hides the original weights inside a local closure).
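
A sketch of that wrapping, with the torch tensor math replaced by plain Python; the closure is why downstream code can no longer inspect the weights:

```python
def get_objective_weights_transform(weights):
    """Return a callable that scalarizes samples with fixed weights,
    roughly what the legacy path wraps in a GenericMCObjective."""
    def objective(samples):
        # the weights live only inside this closure from here on
        return sum(w * s for w, s in zip(weights, samples))
    return objective

obj = get_objective_weights_transform([1.0, -1.0])
print(obj([0.5, 0.25]))  # 0.25 -- samples scalarized with now-hidden weights
```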

For MBM with single-objective acquisition functions, these are passed through Acquisition.get_botorch_objective_and_transform, which uses them to construct an objective or posterior transform depending on whether there are constraints and the type of acquisition function being used.