facebookresearch / theseus

A library for differentiable nonlinear optimization

Vectorization can give incorrect results if similar costs depend on values not declared as optim/aux vars

luisenp opened this issue · comments

πŸ› Bug

Steps to Reproduce

Here is a simple repro:

import theseus as th
import torch
from theseus.core.vectorizer import Vectorize


class BadCost(th.CostFunction):
    def __init__(self, x, flag=False, name=None):
        super().__init__(th.ScaleCostWeight(1.0), name=name)
        self.x = x
        self.flag = flag
        self.register_optim_vars(["x"])

    def error(self) -> torch.Tensor:
        return (
            torch.ones_like(self.x.tensor)
            if self.flag
            else torch.zeros_like(self.x.tensor)
        )

    def jacobians(self):
        raise NotImplementedError

    def dim(self) -> int:
        return self.x.dof()

    def _copy_impl(self, new_name=None):
        return BadCost(self.x.copy(), flag=self.flag, name=new_name)  # type: ignore


o = th.Objective()
z = th.Vector(1, name="z")
o.add(BadCost(z, flag=True))
o.add(BadCost(z, flag=False))
o.update()  # to resolve batch size for vectorization
Vectorize(o)
print(o.error())

Expected behavior

The above prints tensor([[1., 1.]]), but it should print tensor([[1., 0.]]). This happens because vectorization groups similar cost functions and evaluates their errors jointly, so values that are not registered as optimization or auxiliary variables (like flag above) are read from only one representative cost in the group.

Oh. So in the meantime, is there any approach that makes the flag work correctly? I use a similar flag in many cases, but I hadn't noticed this issue.

For now, the two easiest workarounds, although neither is ideal, are:

  • If you are not time constrained, turn vectorization off when you create the TheseusLayer, or
  • Wrap the flag as an auxiliary variable, so that it is also vectorized. For example (a usage sketch follows the snippet below),
class NoLongerBadCost(th.CostFunction):
    def __init__(self, x: th.Vector, flag: th.Variable, name=None):
        super().__init__(th.ScaleCostWeight(1.0), name=name)
        self.x = x
        self.flag = flag
        self.register_optim_vars(["x"])
        self.register_aux_vars(["flag"])

    def error(self) -> torch.Tensor:
        # flag is now an auxiliary variable holding a boolean tensor, so the
        # vectorizer treats it as per-cost data rather than shared Python state
        return torch.where(
            self.flag.tensor,
            torch.ones_like(self.x.tensor),
            torch.zeros_like(self.x.tensor),
        )

    def jacobians(self):
        raise NotImplementedError

    def dim(self) -> int:
        return self.x.dof()

    def _copy_impl(self, new_name=None):
        return NoLongerBadCost(self.x.copy(), self.flag.copy(), name=new_name)  # type: ignore
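
Here is a minimal usage sketch of this workaround, assuming the imports from the repro above and a boolean flag tensor (the variable and cost names are illustrative):

flag_true = th.Variable(torch.tensor([[True]]), name="flag_true")
flag_false = th.Variable(torch.tensor([[False]]), name="flag_false")

o = th.Objective()
z = th.Vector(1, name="z")
o.add(NoLongerBadCost(z, flag_true, name="cost_true"))
o.add(NoLongerBadCost(z, flag_false, name="cost_false"))
o.update()  # to resolve batch size for vectorization
Vectorize(o)
print(o.error())  # expected: tensor([[1., 0.]])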

I'm hoping I can add a fix before the end of the half, although, looking at the code above, I'm now realizing there might not be a one-size-fits-all solution, since the correct logic probably depends on each particular cost function. I'll need to think about this.

Thanks for your reply. Regarding the first point, how do I turn vectorization off?

You can do

layer = th.TheseusLayer(optimizer, vectorize=False)

If you are using an optimizer w/o a TheseusLayer (not the recommended way), then it's off by default.
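
For completeness, here is a construction-only sketch with vectorization disabled, reusing the BadCost class from the repro above (GaussNewton is just an arbitrary optimizer choice here, and forward() would still require real jacobians, so this only shows the setup):

objective = th.Objective()
objective.add(BadCost(th.Vector(1, name="z"), flag=True))
optimizer = th.GaussNewton(objective)
# vectorize=False disables cost function vectorization in the layer
layer = th.TheseusLayer(optimizer, vectorize=False)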