facebookresearch / theseus

A library for differentiable nonlinear optimization


singular matrix when using torch.linalg.cholesky()

yimingli1998 opened this issue · comments

Hi, I am using theseus to solve a simple optimization problem and it raises an error: "linalg.cholesky: (Batch element 0): The factorization could not be completed because the input is not positive-definite (the leading minor of order 2 is not positive-definite)."

The output solution is the same as the initial solution. Looking through the code, I found this happens because the matrix AtA is singular. However, I don't know how to fix it. Here is my code:

import torch
import theseus as th


def cost_fn(optim_vars, aux_vars=None):
    # square and sum all N entries, returning a single scalar error of dimension 1
    energy = optim_vars[0].tensor.square().sum()
    return energy.unsqueeze(-1).unsqueeze(-1)

if __name__ == "__main__":

    # define the optimization variable (a 10-dimensional vector)
    N = 10
    optim_vars = th.Vector(N, name="params")
    # build the objective from a single cost function
    cfn = th.AutoDiffCostFunction(
        [optim_vars], cost_fn, 1, name="test"
    )
    objective = th.Objective()
    objective.add(cfn)
    optimizer = th.GaussNewton(
        objective,
        max_iterations=15,
        step_size=0.1,
        # linear_solver_kwargs={'check_singular': True},
    )

    theseus_optim = th.TheseusLayer(optimizer)
    theseus_inputs = {
        "params": torch.ones_like(optim_vars.tensor),
    }
    with torch.no_grad():
        updated_inputs, info = theseus_optim.forward(
            theseus_inputs,
            optimizer_kwargs={"track_best_solution": True, "verbose": True},
        )
    print("Best solution:", info.best_solution)

Is there anything wrong with my code?

Hi @yimingli1998. Note that given cost functions cf_i, theseus automatically optimizes

torch.sum(cf_0.error() ** 2 + ... + cf_N.error() ** 2, dim=1)

so you don't need to square and sum as you are doing inside cost_fn(); doing that collapses everything into an underdetermined system (you have 1 equation with 10 unknowns).

I think what you want is to simply return the N=10 dimensional error, defined as return optim_vars[0].tensor.clone().

BTW, this system is linear, so it will converge in a single iteration. You could use th.LinearOptimizer instead of th.GaussNewton. Is this what you are trying to solve?
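For reference, a minimal sketch of that fix (note that the cost dimension passed to th.AutoDiffCostFunction must now be N instead of 1):

import torch
import theseus as th

N = 10
optim_vars = th.Vector(N, name="params")

def cost_fn(optim_vars, aux_vars):
    # return the raw N-dimensional residual; theseus squares and sums it internally
    return optim_vars[0].tensor.clone()

cfn = th.AutoDiffCostFunction([optim_vars], cost_fn, N, name="test")
objective = th.Objective()
objective.add(cfn)
# since this problem is linear, a single linear solve converges immediately
optimizer = th.LinearOptimizer(objective, th.CholeskyDenseSolver)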

Thank you for your reply! Yes, I defined the error as optim_vars[0].tensor.clone() and it works well. Now I can define more complex cost functions and use the Gauss-Newton algorithm for optimization.

@yimingli1998 Hello, I am also using the Gauss-Newton algorithm for optimization, and I have run into the same non-positive-definiteness problem. Could you share how you wrote your cost function and how you ensure the system stays positive-definite?

Hi @gzspkgypt. The cost function is problem dependent, so we would need more information to help you write your cost function.

If you are observing non-positive definiteness, you can try the Levenberg-Marquardt optimizer and set a non-zero damping, as in the sketch below.
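A minimal sketch (here objective and inputs are placeholders for your own problem setup; the damping value is passed to the optimizer at forward time via optimizer_kwargs):

import theseus as th

optimizer = th.LevenbergMarquardt(
    objective,
    max_iterations=10,
    step_size=0.3,
)
layer = th.TheseusLayer(optimizer)
# a non-zero damping regularizes the Gauss-Newton system before factorization,
# keeping it positive-definite even when AtA alone is singular
values, info = layer.forward(
    inputs,
    optimizer_kwargs={"damping": 0.1, "track_best_solution": True, "verbose": True},
)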

@luisenp Thank you very much for your reply. The code below is the optimization part, extracted from my project so that it can be run on its own. Could you help me see why it reports this error?

import torch
import theseus as th

def Optimal(variables, point, state, bicycle_traj):
    device = torch.device('cuda:0')
    control_variables = th.Vector(dof=100, name="control_variables")
    ego_motion = th.Variable(torch.empty(1, 50, 4), name="ego_motion")
    current_state = th.Variable(torch.empty(1, 4), name="current_state")
    traj = th.Variable(torch.empty(1, 50, 4), name="traj")

    gains = {
            "smooth": 4.0,
            "reference_fn": 3.0
    }
    weights = {
        k: th.ScaleCostWeight(th.Variable(torch.tensor(v), name=f'gain_{k}'))
        for k, v in gains.items()
    }
    objective = th.Objective()
    objective = build_cost_function(objective, control_variables, ego_motion, current_state, traj, weights)
    optimizer = th.GaussNewton(objective,
                               th.CholeskyDenseSolver,
                               max_iterations=10,
                               step_size=0.3,
                               rel_err_tolerance=1e-3)
    layer = th.TheseusLayer(optimizer, vectorize=False)

    layer.to(device=device)
    planner_inputs = {
            "control_variables": variables.to(device=device, dtype=torch.float32),
            "ego_motion": point.to(device=device, dtype=torch.float32),
            "current_state": state.to(device=device, dtype=torch.float32),
            "traj": bicycle_traj.to(device=device, dtype=torch.float32),
    }

    with torch.no_grad():
        control, info = layer.forward(planner_inputs, optimizer_kwargs={'track_best_solution': True, "verbose": True})
        a = info.best_solution['control_variables'].tolist()[0]
        # a = control['control_variables'].tolist()[0]
        return a

def build_cost_function(objective, control_variables, ego_motion, current_state, traj, weights, vectorize=True):

    smooth_function = th.AutoDiffCostFunction(
        [control_variables], smooth_fn, 50, weights['smooth'],
        autograd_vectorize=vectorize, name="smooth"
    )
    objective.add(smooth_function)

    reference_x_fn = th.AutoDiffCostFunction(
        [control_variables], ref_fn, 50, weights['reference_fn'],
        aux_vars=[ego_motion, current_state, traj], autograd_vectorize=vectorize,
        name="reference_x_fn"
    )
    objective.add(reference_x_fn)
    return objective

def ref_fn(optim_vars, aux_vars):
    # distance between the planned (x, y) points and the reference trajectory
    planning_point = optim_vars[0].tensor.view(-1, 50, 2)
    traj = aux_vars[2].tensor[:, :, :2]
    ref_error = torch.norm(planning_point - traj, dim=2)
    return ref_error

def smooth_fn(optim_vars, aux_vars):
    # split the flattened (x, y) decision variables; only batch element 0 is used
    space_path_s = optim_vars[0].tensor.view(-1, 50, 2)[:, :, 0][0]     # x
    space_path_l = optim_vars[0].tensor.view(-1, 50, 2)[:, :, 1][0]     # y

    # first and second finite differences along the path
    ds = torch.diff(space_path_s, n=1, dim=-1)
    dl = torch.diff(space_path_l, n=1, dim=-1)
    d2s = torch.diff(space_path_s, n=2)
    d2l = torch.diff(space_path_l, n=2)

    # trim the first differences to match the length of the second differences
    ds = ds[:-1]
    dl = dl[:-1]

    # discrete curvature |x'y'' - y'x''| / (x'^2 + y'^2)^1.5, scaled by 5
    numerator = torch.abs(d2l * ds - dl * d2s)
    denominator = (ds ** 2 + dl ** 2) ** 1.5
    smooth_err = (numerator / denominator) * 5
    smooth_err = torch.unsqueeze(smooth_err, 0)

    # pad with two zeros so the error has the expected dimension of 50
    rows = smooth_err.size(0)
    zeros = torch.zeros((rows, 2), dtype=smooth_err.dtype, device=smooth_err.device)
    smooth_err = torch.cat((zeros, smooth_err), dim=1)

    return smooth_err

if __name__ == '__main__':

    start = [0.0, 0.0, 0.0, 10.0]
    x = [0.0, 1.03, 2.09, 3.18, 4.297, 5.447, 6.627, 7.837, 9.077, 10.347, 11.647, 12.977, 14.337, 15.727, 17.113, 18.5, 19.886, 21.272, 22.659, 24.045, 25.431, 26.818, 28.204, 29.59, 30.977, 32.363, 33.749, 35.136, 36.522, 37.908, 39.295, 40.697, 42.099, 43.501, 44.903, 46.305, 47.707, 49.109, 50.511, 51.913, 53.315, 54.717, 56.119, 57.521, 58.923, 60.325, 61.727, 63.129, 64.531, 65.933, 67.331]

    y = [0.0, 0.0, 0.0, 0.0, 0.084, 0.084, 0.084, 0.084, 0.084, 0.084, 0.084, 0.084, 0.084, 0.084, -0.125, -0.334, -0.543, -0.752, -0.961, -1.17, -1.379, -1.588, -1.797, -2.006, -2.215, -2.424, -2.633, -2.842, -3.051, -3.26, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.469, -3.573]

    acc = [0.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 1.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

    theta = [0.0, 0.0, 0.0, 0.0, 0.075, -0.075, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.15, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.15, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.075]

    yaw = [0.0, 0.0, 0.0, 0.0, 0.075, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.15, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.075]

    v = [10.0, 10.3, 10.6, 10.9, 11.2, 11.5, 11.8, 12.1, 12.4, 12.7, 13.0, 13.3, 13.6, 13.9, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0]

    traj = torch.tensor([[[ 0.0000,  0.0000,  0.0000, 10.0000],
         [ 1.0000,  0.0000,  0.0000, 10.0000],
         [ 2.0300,  0.0000,  0.0000, 10.3000],
         [ 3.0900,  0.0000,  0.0000, 10.6000],
         [ 4.1800,  0.0000,  0.0000, 10.9000],
         [ 5.2960,  0.0941,  0.0842, 11.2000],
         [ 6.4460,  0.0916,  6.2809, 11.5000],
         [ 7.6260,  0.0889,  6.2809, 11.8000],
         [ 8.8360,  0.0862,  6.2809, 12.1000],
         [10.0760,  0.0834,  6.2809, 12.4000],
         [11.3460,  0.0805,  6.2809, 12.7000],
         [12.6460,  0.0776,  6.2809, 13.0000],
         [13.9760,  0.0746,  6.2809, 13.3000],
         [15.3360,  0.0715,  6.2809, 13.6000],
         [16.7260,  0.0684,  6.2809, 13.9000],
         [18.0960, -0.2296,  6.0690, 14.0200],
         [19.4660, -0.5275,  6.0690, 14.0200],
         [20.8359, -0.8254,  6.0690, 14.0200],
         [22.2059, -1.1234,  6.0690, 14.0200],
         [23.5759, -1.4213,  6.0690, 14.0200],
         [24.9459, -1.7193,  6.0690, 14.0200],
         [26.3158, -2.0172,  6.0690, 14.0200],
         [27.6858, -2.3152,  6.0690, 14.0200],
         [29.0558, -2.6131,  6.0690, 14.0200],
         [30.4258, -2.9110,  6.0690, 14.0200],
         [31.7957, -3.2090,  6.0690, 14.0200],
         [33.1657, -3.5069,  6.0690, 14.0200],
         [34.5357, -3.5000,  6.0690, 14.0200],
         [35.9057, -3.5000,  6.0690, 14.0200],
         [37.2757, -3.5000,  6.0690, 14.0200],
         [38.6456, -3.5000,  6.0690, 14.0200],
         [40.0156, -3.5000,  6.0690, 14.0200],
         [41.4176, -3.5000,  6.2809, 14.0200],
         [42.8196, -3.5000,  6.2809, 14.0200],
         [44.2216, -3.5000,  6.2809, 14.0200],
         [45.6236, -3.5000,  6.2809, 14.0200],
         [47.0256, -3.5000,  6.2809, 14.0200],
         [48.4276, -3.5000,  6.2809, 14.0200],
         [49.8296, -3.5000,  6.2809, 14.0200],
         [51.2316, -3.5000,  6.2809, 14.0200],
         [52.6336, -3.5000,  6.2809, 14.0200],
         [54.0356, -3.5000,  6.2809, 14.0200],
         [55.4376, -3.5000,  6.2809, 14.0200],
         [56.8396, -3.5000,  6.2809, 14.0200],
         [58.2416, -3.5000,  6.2809, 14.0200],
         [59.6436, -3.5000,  6.2809, 14.0200],
         [61.0456, -3.5000,  6.2809, 14.0200],
         [62.4475, -3.5000,  6.2809, 14.0200],
         [63.8495, -3.5000,  6.2809, 14.0200],
         [65.2515, -3.5000,  6.2809, 14.0200]]])
    control_variables = torch.stack((torch.tensor(x[:50]), torch.tensor(y[:50])), dim=1).flatten()
    control_variables = control_variables.reshape(1, -1)

    current_state = torch.stack(
        (torch.tensor(acc[:50]), torch.tensor(theta[:50]), torch.tensor(yaw[:50]), torch.tensor(v[:50])),
        dim=1).unsqueeze(0)

    start_tensor = torch.tensor(start).unsqueeze(0)

    optimal_path = Optimal(control_variables, current_state, start_tensor, traj)

Hi @gzspkgypt. I'm sorry, I don't fully understand what your last question is.

@luisenp This optimization code uses the Gauss-Newton optimizer, but it reports the non-positive-definite error. The code above can be run directly; could you please help me see why the cost functions I wrote produce a non-positive-definite system? Thank you very much!

@gzspkgypt Sorry for the delay, these days I've been too busy with other projects. I saw in another thread that you were able to solve this issue?