DanielBok / copulae

Multivariate data modelling with Copulas in Python

Home Page: https://copulae.readthedocs.io/en/latest/

singular matrices in 2 dimensional elliptical copulas

joesgithubacct opened this issue

Similar to a previous user, I am trying to use the Gaussian or Student-t copula fit function, and both crash with a singular matrix error on my data set. The data set is 2-dimensional with 250 entries, and the two columns are not linear combinations of each other, exactly or approximately.

This is a small example I generated using a random 4x2 data frame:

rand_pd

0 | 1
-- | --
0.063849 | 0.939572
0.661251 | 0.998138
0.866290 | 0.826964
0.044704 | 0.241041

from copulae import GaussianCopula
g_cop = GaussianCopula(dim=2)
g_cop.fit(rand_pd)


LinAlgError Traceback (most recent call last)
in
1 from copulae import GaussianCopula
2 g_cop = GaussianCopula(dim=2) # initializing the copula
----> 3 g_cop.fit(rand_pd_2) # fit the copula to the data

~/.local/lib/python3.7/site-packages/copulae/copula/base.py in fit(self, data, x0, method, est_var, verbose, optim_options)
79 raise ValueError('Dimension of data does not match copula')
80
---> 81 CopulaEstimator(self, data, x0=x0, method=method, est_var=est_var, verbose=verbose, optim_options=optim_options)
82
83 return self

~/.local/lib/python3.7/site-packages/copulae/copula/estimator/estimator.py in __init__(self, copula, data, x0, method, est_var, verbose, optim_options)
64 self._verbose = verbose
65
---> 66 self.fit() # fit the copula
67
68 def fit(self):

~/.local/lib/python3.7/site-packages/copulae/copula/estimator/estimator.py in fit(self)
70 if m in {'ml', 'mpl'}:
71 MaxLikelihoodEstimator(self.copula, self.data, self.initial_params, self.optim_options, self._est_var,
---> 72 self._verbose).fit(m)
73 elif m in ('itau', 'irho'):
74 CorrInversionEstimator(self.copula, self.data, self._est_var, self._verbose).fit(m)

~/.local/lib/python3.7/site-packages/copulae/copula/estimator/max_likelihood.py in fit(self, method)
58 """
59
---> 60 res = self._optimize()
61
62 if not res['success']:

~/.local/lib/python3.7/site-packages/copulae/copula/estimator/max_likelihood.py in _optimize(self)
106
107 def _optimize(self) -> OptimizeResult:
--> 108 return minimize(self.copula_log_lik, self.initial_params, **self.optim_options)

~/.local/lib/python3.7/site-packages/scipy/optimize/_minimize.py in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
607 elif meth == 'slsqp':
608 return _minimize_slsqp(fun, x0, args, jac, bounds,
--> 609 constraints, callback=callback, **options)
610 elif meth == 'trust-constr':
611 return _minimize_trustregion_constr(fun, x0, args, jac, hess, hessp,

~/.local/lib/python3.7/site-packages/scipy/optimize/slsqp.py in _minimize_slsqp(func, x0, args, jac, bounds, constraints, maxiter, ftol, iprint, disp, eps, callback, **unknown_options)
397
398 # Compute objective function
--> 399 fx = func(x)
400 try:
401 fx = float(np.asarray(fx))

~/.local/lib/python3.7/site-packages/scipy/optimize/optimize.py in function_wrapper(*wrapper_args)
298 def function_wrapper(*wrapper_args):
299 ncalls[0] += 1
--> 300 return function(*(wrapper_args + args))
301
302 return ncalls, function_wrapper

~/.local/lib/python3.7/site-packages/copulae/copula/estimator/max_likelihood.py in copula_log_lik(self, param)
101 try:
102 self.copula.params = param
--> 103 return -self.copula.log_lik(self.data, to_pobs=False)
104 except ValueError: # error encountered when setting invalid parameters
105 return np.inf

~/.local/lib/python3.7/site-packages/copulae/elliptical/abstract.py in log_lik(self, data, to_pobs)
41 return -np.inf
42
---> 43 return super().log_lik(data, to_pobs=to_pobs)
44
45 @property

~/.local/lib/python3.7/site-packages/copulae/copula/base.py in log_lik(self, data, to_pobs)
198 data = self.pobs(data) if to_pobs else data
199
--> 200 return self.pdf(data, log=True).sum()
201
202 @property

~/.local/lib/python3.7/site-packages/copulae/utility/utils.py in internal(cls, x, *args, **kwargs)
34 x = np.asarray(x)
35
---> 36 res = np.asarray(f(cls, x, *args, **kwargs))
37 return res.item(0) if res.size == 1 else res
38

~/.local/lib/python3.7/site-packages/copulae/elliptical/gaussian.py in pdf(self, u, log)
80 sigma = self.sigma
81 q = norm.ppf(u)
---> 82 d = mvn.logpdf(q, cov=sigma) - norm.logpdf(q).sum(1)
83 return d if log else np.exp(d)
84

~/.local/lib/python3.7/site-packages/scipy/stats/_multivariate.py in logpdf(self, x, mean, cov, allow_singular)
493 dim, mean, cov = self._process_parameters(None, mean, cov)
494 x = self._process_quantiles(x, dim)
--> 495 psd = _PSD(cov, allow_singular=allow_singular)
496 out = self._logpdf(x, mean, psd.U, psd.log_pdet, psd.rank)
497 return _squeeze_output(out)

~/.local/lib/python3.7/site-packages/scipy/stats/_multivariate.py in __init__(self, M, cond, rcond, lower, check_finite, allow_singular)
161 d = s[s > eps]
162 if len(d) < len(s) and not allow_singular:
--> 163 raise np.linalg.LinAlgError('singular matrix')
164 s_pinv = _pinv_1d(s, eps)
165 U = np.multiply(u, np.sqrt(s_pinv))

LinAlgError: singular matrix

TL;DR: upgrade to copulae 0.5.2 and see if that works.

Hello, I ran some tests on my own and found two ways to tackle this. You probably only need the first one.

  1. The first solution doesn't require anything from you. For the elliptical copulas, I slightly increased the upper and lower bounds on the correlation matrix parameters in the optimization constraints. This relaxation should solve the issue for you.
  2. You probably won't need this one. The second solution deals with data treatment. Perhaps the data at hand isn't suitable for the Gaussian copula, in which case there would always be convergence issues. There are a host of ways to test this, but it's usually really laborious and mostly not the right solution, so I wouldn't recommend going down this rabbit hole.
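
For some intuition on why the bounds matter: a 2-dimensional correlation matrix [[1, rho], [rho, 1]] has eigenvalues 1 - rho and 1 + rho, so one of them vanishes when rho reaches ±1. The plain-Python sketch below mimics the `s > eps` check visible in the traceback. The function names and the tolerance are assumptions made for illustration, not copulae or scipy internals.

```python
def corr_eigenvalues(rho):
    """Eigenvalues of the 2x2 correlation matrix [[1, rho], [rho, 1]]."""
    return (1.0 - rho, 1.0 + rho)


def looks_singular(rho, rel_tol=1e-10):
    """Flag the matrix as singular if any eigenvalue is not above eps.

    rel_tol is an assumed tolerance chosen for illustration only.
    """
    eigenvalues = corr_eigenvalues(rho)
    eps = rel_tol * max(abs(v) for v in eigenvalues)
    return any(v <= eps for v in eigenvalues)


# Interior values are fine; the boundary values rho = 1 and rho = -1 are not.
for rho in (0.5, 0.999, 1.0, -1.0):
    print(rho, corr_eigenvalues(rho), looks_singular(rho))
```

If the optimizer is allowed to evaluate the likelihood at such a boundary value, a check of this kind would reject the matrix, which is consistent with the `LinAlgError` above.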

In any case, I published a new version with the fix. Hope things work for you.
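
If maximum likelihood still gives trouble, the traceback shows that `fit` also dispatches `method` values `'itau'` and `'irho'` to `CorrInversionEstimator`, which does not go through the optimizer. As a rough sketch of what Kendall's tau inversion computes in the bivariate elliptical case, using the standard relation rho = sin(pi * tau / 2): the helper names below are hypothetical and this is not copulae's own code.

```python
import math
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau for paired samples (simple O(n^2) version, assumes no ties)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def rho_from_tau(tau):
    """Invert Kendall's tau to a correlation for an elliptical copula."""
    return math.sin(math.pi * tau / 2.0)


# The 4x2 example from the issue, one list per column.
col0 = [0.063849, 0.661251, 0.866290, 0.044704]
col1 = [0.939572, 0.998138, 0.826964, 0.241041]

tau = kendall_tau(col0, col1)
rho = rho_from_tau(tau)
print(tau, rho)
```

For the four rows posted in the issue this gives tau = 1/3 and rho of about 0.5, well away from ±1, so the sample itself does not look degenerate.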