ceres-solver / ceres-solver

A large scale non-linear optimization library

Home Page: http://ceres-solver.org/


Compare ceres with scipy curve_fit

insar-dev opened this issue · comments

I compared ceres and scipy's curve_fit and found that ceres did not perform as well as curve_fit. The range calculated by ceres is 0, while the range calculated by curve_fit is more reasonable. What could be the possible reasons for this?
Code used for curve_fit:

import numpy as np
from scipy.optimize import curve_fit

def spherical_variogram_model(d, nugget, range_, psill):
    # if d < range, use the spherical equation, else (d >= range) use the sill
    return np.where(d < range_,
                    nugget + (psill - nugget) * (1.5*d/range_ - 0.5*d**3/(range_**3)),
                    psill)

# Input data

lag = np.array([175.65234, 390.07074, 617.2337, 846.20544, 1079.8468, 1312.8428,
1545.525, 1777.5623, 2009.1091, 2239.874, 2472.7234, 2709.9663,
2941.5889, 3174.672, 3406.3713, 3639.4817, 3873.5212, 4107.823,
4341.223, 4571.714, 4805.605, 5041.677, 5271.1084, 5503.067,
5737.291, 5970.111, 6202.1523, 6437.138, 6670.4116, 6892.423])
gamma = np.array([20.815123, 21.075365, 19.551971, 20.397446, 22.835442, 24.360342,
28.06185, 27.736881, 28.389353, 27.340208, 30.65644, 31.15702,
29.316727, 28.009079, 26.655233, 24.42093, 20.318785, 23.195473,
22.581009, 23.794619, 22.329227, 22.553814, 21.483736, 20.519196,
21.407993, 18.173586, 21.122591, 25.138948, 24.998138, 34.060234])

nlag = len(lag)
# Pick a rough initial value for the nugget
Nugget = (gamma[1]*lag[0] - gamma[0]*lag[1]) / (lag[0] - lag[1])
if Nugget < 0: Nugget = gamma[0]
# Pick a rough initial value for the sill
Sill = (gamma[nlag-3] + gamma[nlag-2] + gamma[nlag-1]) / 3.0
# Pick the starting value for the range
Range = lag[int(nlag/2)]

# Array of initial values. Also used in Gold Rule Fit
init_vals = [Nugget, Range, Sill]

# Define the maximum values (upper bounds)
maxlim = [max(gamma), max(lag), max(gamma)]

check = True
[Nugget, Range, Sill], _ = curve_fit(spherical_variogram_model, lag, gamma,
                                     method='trf', check_finite=check,
                                     p0=init_vals, bounds=(0, maxlim))
print("Estimated parameters:")
print("Range:", Range)
print("Sill:", Sill)
print("Nugget:", Nugget)

Code used for ceres:

using namespace ceres;
ceres::Problem problem;

template <typename T>
class SphericalCovariance
{
public:
    SphericalCovariance(T range, T sill, T nugget)
        : _nugget(nugget)
        , _range(range)
        , _sill(sill)
    {
        _psill = _sill - _nugget;
    }

    T compute(double d) const
    {
        if (d <= _range)
        {
            return _psill * ((3.0 * d) / (2.0 * _range) - (d * d * d) / (2.0 * _range * _range * _range)) + _nugget;
        }
        else
        {
            return _psill + _nugget;
        }
    }

private:
    T _sill;
    T _range;
    T _nugget;
    T _psill;
};

struct ModelResidual
{
    ModelResidual(double lag, double gamma)
        : _lag(lag), _gamma(gamma) {}

    template <typename T>
    bool operator()(const T* const range,
                    const T* const nugget,
                    const T* const sill,
                    T* residual) const
    {
        residual[0] = _gamma - SphericalCovariance<T>(
                                   range[0],
                                   sill[0],
                                   nugget[0]).compute(_lag);

        //residual[0] = y_ - exp(m[0] * x_ + c[0]);
        return true;
    }

private:
    // Observations for a sample.
    const double _lag;
    const double _gamma;
};

// Solve
double nugget = guessNugget;
double sill = guessSill;
double range = guessRange;

for (size_t i = 0; i < lag.size(); ++i)
{
    ceres::CostFunction* cost_function =
        new ceres::AutoDiffCostFunction<ModelResidual, 1, 1, 1, 1>(
            new ModelResidual{lag[i], gamma[i]});

    // SoftLOneLoss / CauchyLoss / HuberLoss
    ceres::LossFunction* loss = nullptr;

    problem.AddResidualBlock(cost_function,
                             loss,
                             &range,
                             &nugget,
                             &sill);
}

problem.SetParameterLowerBound(&nugget, 0, 0.0);
problem.SetParameterUpperBound(&nugget, 0, maxNugget);
problem.SetParameterLowerBound(&range, 0, 0.0);
problem.SetParameterUpperBound(&range, 0, maxRange);
problem.SetParameterLowerBound(&sill, 0, 0.0);
problem.SetParameterUpperBound(&sill, 0, maxSill);

ceres::Solver::Options options;
options.linear_solver_type = ceres::DENSE_QR;
options.minimizer_progress_to_stdout = false;

ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);

curve_fit result:
[screenshot of the curve_fit fit]
ceres result:
[screenshot of the ceres fit]

scipy and ceres use different optimization algorithms, especially for how they deal with bounds, so I am not surprised that they give different results. Can you share the log from ceres solver iterations and the output of Solver::Summary::FullReport()?
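
For reference, a minimal sketch of how to get both from ceres, assuming the problem and options objects from the code above:

// Print per-iteration progress of the trust region minimizer to stdout.
options.minimizer_progress_to_stdout = true;

ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);

// Print the detailed report, including the termination reason.
std::cout << summary.FullReport() << std::endl;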

@sandwichmaker

However, I find that curve_fit seems more reasonable in this case.

Solver Summary (v 2.1.0-eigen-(3.4.0)-no_lapack-no_openmp)

                                     Original                  Reduced
Parameter blocks                            3                        3
Parameters                                  3                        3
Residual blocks                            30                       30
Residuals                                  30                       30

Minimizer                        TRUST_REGION

Dense linear algebra library            EIGEN
Trust region strategy     LEVENBERG_MARQUARDT

                                        Given                     Used
Linear solver                        DENSE_QR                 DENSE_QR
Threads                                     1                        1
Linear solver ordering              AUTOMATIC                        3

Cost:
Initial                          3.105900e+02
Final                            2.238625e+02
Change                           8.672749e+01

Minimizer iterations                        3
Successful steps                            3
Unsuccessful steps                          0
Line search steps                           0

Time (in seconds):
Preprocessor                         0.000296

  Residual only evaluation           0.000058 (3)
    Line search cost evaluation      0.000000
  Jacobian & residual evaluation     0.000994 (6)
    Line search gradient evaluation  0.000387
  Linear solver                      0.000500 (3)
  Line search polynomial minimization  0.000000
Minimizer                            0.002030

Postprocessor                        0.000010
Total                                0.002336

Termination:                      CONVERGENCE (Function tolerance reached. |cost_change|/cost: 9.231992e-10 <= 1.000000e-06)
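
As an aside, the report shows the run stopped after only 3 iterations because the default function_tolerance of 1e-6 was reached. To rule out early termination as a cause, the stopping criteria can be tightened; a minimal sketch, reusing the options object from the code above:

// Tighten the stopping criteria; the report above shows function_tolerance
// (default 1e-6) was the active termination criterion.
options.function_tolerance = 1e-12;
options.parameter_tolerance = 1e-12;
options.gradient_tolerance = 1e-14;
options.max_num_iterations = 200;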

Ty. I'd like to run your code myself. I think you have most of the code pasted above but not all of it, including your initial guesses etc. Can you post your entire ceres code please?

@sandwichmaker I have already uploaded the code. Thank you.

//#include "FitTest.h"
#include <glog/logging.h>
#include <ceres/ceres.h>

#include <iostream>
#include <vector>

template <typename T>
class SphericalCovariance
{
public:
    SphericalCovariance(T range, T sill, T nugget)
        : _nugget(nugget)
        , _range(range)
        , _sill(sill)
    {
        _psill = _sill - _nugget;
    }

    T compute(double d) const
    {
        if (d <= _range)
        {
            return _psill * ((3.0 * d) / (2.0 * _range) - (d * d * d) / (2.0 * _range * _range * _range)) + _nugget;
        }
        else
        {
            return _psill + _nugget;
        }
    }

private:
    T _sill;
    T _range;
    T _nugget;
    T _psill;
};

struct ModelResidual
{
    ModelResidual(double lag, double gamma)
        : _lag(lag), _gamma(gamma) {}

    template <typename T>
    bool operator()(const T* const range,
                    const T* const nugget,
                    const T* const sill,
                    T* residual) const
    {
        residual[0] = _gamma - SphericalCovariance<T>(
                                   range[0],
                                   sill[0],
                                   nugget[0]).compute(_lag);

        //residual[0] = y_ - exp(m[0] * x_ + c[0]);
        return true;
    }

private:
    // Observations for a sample.
    const double _lag;
    const double _gamma;
};

void ceres_test()
{
    using namespace ceres;
    ceres::Problem problem;

    std::vector<double> lag = { 175.65234, 390.07074, 617.2337, 846.20544, 1079.8468, 1312.8428,
                                1545.525, 1777.5623, 2009.1091, 2239.874, 2472.7234, 2709.9663,
                                2941.5889, 3174.672, 3406.3713, 3639.4817, 3873.5212, 4107.823,
                                4341.223, 4571.714, 4805.605, 5041.677, 5271.1084, 5503.067,
                                5737.291, 5970.111, 6202.1523, 6437.138, 6670.4116, 6892.423 };
    std::vector<double> gamma = { 20.815123, 21.075365, 19.551971, 20.397446, 22.835442, 24.360342,
                                  28.06185, 27.736881, 28.389353, 27.340208, 30.65644, 31.15702,
                                  29.316727, 28.009079, 26.655233, 24.42093, 20.318785, 23.195473,
                                  22.581009, 23.794619, 22.329227, 22.553814, 21.483736, 20.519196,
                                  21.407993, 18.173586, 21.122591, 25.138948, 24.998138, 34.060234 };

    double nugget = 20.601930259621913;
    double sill = 28.062121873701631;
    double range = 3639.4816871947096;

    double maxRange = 6892.4227592761508;
    double maxNugget = 34.060233524360626;
    double maxSill = 34.060233524360626;

    for (size_t i = 0; i < lag.size(); ++i)
    {
        ceres::CostFunction* cost_function =
            new ceres::AutoDiffCostFunction<ModelResidual, 1, 1, 1, 1>(
                new ModelResidual{lag[i], gamma[i]});

        // SoftLOneLoss / CauchyLoss / HuberLoss
        ceres::LossFunction* loss = nullptr;

        problem.AddResidualBlock(cost_function,
                                 loss,
                                 &range,
                                 &nugget,
                                 &sill);
    }

    problem.SetParameterLowerBound(&nugget, 0, 0.0);
    problem.SetParameterUpperBound(&nugget, 0, maxNugget);
    problem.SetParameterLowerBound(&range,  0, 0.0);
    problem.SetParameterUpperBound(&range,  0, maxRange);
    problem.SetParameterLowerBound(&sill,   0, 0.0);
    problem.SetParameterUpperBound(&sill,   0, maxSill);

    ceres::Solver::Options options;
    options.linear_solver_type = ceres::DENSE_QR;
    options.minimizer_progress_to_stdout = false;

    ceres::Solver::Summary summary;
    ceres::Solve(options, &problem, &summary);

    std::cout << summary.FullReport() << std::endl;
    std::cout << "Fitted parameters:" << std::endl;
    std::cout << "Nugget = " << nugget << std::endl;
    std::cout << "Range = " << range << std::endl;
    std::cout << "Sill = " << sill << std::endl;
}
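
For completeness, a minimal main to run the snippet standalone might look like this (a sketch; InitGoogleLogging is only needed so glog has somewhere to send ceres' log output):

int main(int argc, char** argv)
{
    // Initialize glog before using ceres.
    google::InitGoogleLogging(argv[0]);
    ceres_test();
    return 0;
}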

Ty, I will take a look at this in a bit.

@insar-dev

In the Python version of the code you have:

def spherical_variogram_model(d, nugget, range_, psill):
    # if h < range, use spherical equation, else (h >= range) adopted sill
    return np.where(d < range_, nugget + (psill - nugget) * (1.5*d/range_ - 0.5*d**3/(range_**3)), psill)

which seems a bit different from what you have in the C++ version of the code:

if (d <= _range)
{
    return _psill * ((3.0 * d) / (2.0 * _range) - (d * d * d) / (2.0 * _range * _range * _range)) + _nugget;
}
else
{
    return _psill + _nugget;
}

In particular, in the Python code you are using psill - nugget as the factor multiplying (1.5*d/range_ - 0.5*d**3/(range_**3)), and the else clause only returns psill.

Are the C++ and Python versions of psill the same or different?

That said, I substituted the values found by scipy into the objective function and it is indeed a better minimum than the one found by ceres.
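
One way to do that check, reusing the problem object built above (scipy_range, scipy_nugget and scipy_sill are hypothetical variables holding the values copied from the curve_fit output):

// Overwrite the parameter blocks with the scipy solution and evaluate the cost.
range  = scipy_range;    // hypothetical: values taken from the curve_fit output
nugget = scipy_nugget;
sill   = scipy_sill;

double cost = 0.0;
// cost is 0.5 * sum of squared residuals at the given parameter values.
problem.Evaluate(ceres::Problem::EvaluateOptions(), &cost, nullptr, nullptr, nullptr);
std::cout << "Cost at the scipy solution: " << cost << std::endl;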

@sandwichmaker

There are issues with the variable naming in the code. You can use the following code instead:

def spherical_variogram_model(d, nugget, range_, sill):
    # if h < range, use spherical equation, else (h >= range) adopted sill
    return np.where(d < range_, nugget + (sill - nugget) * (1.5*d/range_ - 0.5*d**3/(range_**3)), sill)

Here sill = nugget + psill, i.e. the parameter named psill in my original Python code was actually the sill.
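
As a sanity check, a minimal sketch with arbitrary example values (reusing the SphericalCovariance class posted above) shows that the C++ model and the corrected Python model agree once sill = nugget + psill:

// Arbitrary example values, not the fitted ones.
double nugget = 20.0, sill = 28.0, range = 3600.0;
double d = 1000.0;  // a lag inside the range

// C++ model: psill * f(d) + nugget with psill = sill - nugget.
double cpp_value = SphericalCovariance<double>(range, sill, nugget).compute(d);
// Corrected Python model: nugget + (sill - nugget) * f(d).
double py_value = nugget + (sill - nugget) * (1.5 * d / range - 0.5 * d * d * d / (range * range * range));

std::cout << cpp_value << " vs " << py_value << std::endl;  // both print the same value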

Under the Gaussian and Exponential models, ceres and curve_fit give similar results.