DanielBok / copulae

Multivariate data modelling with Copulas in Python

Home Page: https://copulae.readthedocs.io/en/latest/

Error after update

njalex22 opened this issue

Hi Daniel,

I updated copulae to the newest version 0.7.3 and the notebook I previously used to test copulae no longer works. Here is a snapshot of the first part of the code:

import copulae
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as st

from statsmodels.distributions.empirical_distribution import ECDF
from copulae import ClaytonCopula, FrankCopula, GumbelCopula

np.random.seed(100)

v1 = np.sort(st.norm.rvs(loc=0,scale=1.5,size=10000))
v2 = st.norm.rvs(loc=0,scale=1,size=10000)

v3 = np.sort(st.norm.rvs(loc=0,scale=1.5,size=10000))
v4 = st.norm.rvs(loc=0,scale=1,size=10000)

total_sims = pd.DataFrame([0.1*(v1+v2),0.1*(v3+v4)],index=['V1','V2']).T

plt.scatter(total_sims['V1'], total_sims['V2']);

clay_cop = ClaytonCopula(theta=1,dim=2)
clay_cop.fit(data=total_sims, x0=None)

This code yields the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-c5dff29e8c64> in <module>
     21 
     22 clay_cop = ClaytonCopula(theta=1,dim=2)
---> 23 clay_cop.fit(data=total_sims, x0=None)

~\Miniconda3\lib\site-packages\copulae\copula\base.py in fit(self, data, x0, method, optim_options, ties, verbose, to_pobs, **kwargs)
    144 
    145         x0 = np.asarray(x0) if x0 is not None and not isinstance(x0, np.ndarray) and isinstance(x0, Collection) else x0
--> 146         self._fit_smry = fit_copula(self, data, x0, method, verbose, optim_options, kwargs.get('scale', 1))
    147 
    148         if isinstance(data, pd.DataFrame):

~\Miniconda3\lib\site-packages\copulae\copula\estimator\estimator.py in fit_copula(copula, data, x0, method, verbose, optim_options, scale)
     92     m = method.lower()
     93     if m in {'ml'}:
---> 94         x0 = initial_params(copula, data, x0)
     95         return estimate_max_likelihood_params(copula, data, x0, options, verbose, scale)
     96     elif m in ('itau', 'irho'):

~\Miniconda3\lib\site-packages\copulae\copula\estimator\estimator.py in initial_params(copula, data, x0)
    117 
    118     try:
--> 119         start = estimate_corr_inverse_params(copula, data, 'itau').params
    120         ll = copula.log_lik(data)
    121 

~\Miniconda3\lib\site-packages\copulae\copula\estimator\corr_inversion.py in estimate_corr_inverse_params(copula, data, type_)
     32         raise ValueError("Correlation Inversion must be either 'itau' or 'irho'")
     33 
---> 34     icor = fit_cor(copula, data, type_)
     35 
     36     if is_elliptical(copula):

~\Miniconda3\lib\site-packages\copulae\copula\estimator\corr_inversion.py in fit_cor(copula, data, typ)
     69     indices = tri_indices(copula.dim, 1, 'lower')
     70     if typ == 'itau':
---> 71         tau = kendall_tau(data)[indices]
     72         theta = copula.itau(tau)
     73     elif typ == 'irho':

~\Miniconda3\lib\site-packages\copulae\stats\correlation.py in kendall_tau(x, y, use)
    162     %s
    163     """
--> 164     return corr(x, y, 'kendall', use)
    165 
    166 

~\Miniconda3\lib\site-packages\copulae\stats\correlation.py in corr(x, y, method, use)
    101             raise ValueError('x must be a matrix with dimension 2')
    102         c = np.identity(x.shape[1])
--> 103         for (i, j), (c1, c2) in _yield_vectors(x, use):
    104             c[i, j] = c[j, i] = compute_corr(c1, c2)
    105 

~\Miniconda3\lib\site-packages\copulae\stats\correlation.py in _yield_vectors(x, use)
    256                 yield (i, j), (x[v, i], x[v, j])
    257             else:
--> 258                 yield (i, j), (x[:, i], x[:, j])
    259 
    260 

~\Miniconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2900             if self.columns.nlevels > 1:
   2901                 return self._getitem_multilevel(key)
-> 2902             indexer = self.columns.get_loc(key)
   2903             if is_integer(indexer):
   2904                 indexer = [indexer]

~\Miniconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2893             casted_key = self._maybe_cast_indexer(key)
   2894             try:
-> 2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:
   2897                 raise KeyError(key) from err

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '(slice(None, None, None), 0)' is an invalid key

Given that this error did not occur with the previous version, is it possible there is a bug in the update? I tried varying the inputs to the fit function, with no success. Thanks!
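For readers following the traceback: the last copulae frame evaluates x[:, i] while x is still a pandas DataFrame, and DataFrame.__getitem__ does not accept NumPy-style positional keys. The sketch below reproduces just that indexing step using only pandas and NumPy (no copulae). The small frame is a stand-in for total_sims, and the exact exception type is assumed to vary across pandas versions, so it is caught broadly.

```python
import numpy as np
import pandas as pd

# Small stand-in for total_sims: two named columns.
df = pd.DataFrame({"V1": [0.1, 0.2, 0.3], "V2": [0.3, 0.1, 0.2]})

# NumPy-style positional indexing goes through DataFrame.__getitem__,
# which treats the tuple as a column label and fails. The exception
# type depends on the pandas version, so it is caught broadly here.
try:
    df[:, 0]
    failed = False
except Exception as exc:
    failed = True
    print(type(exc).__name__, exc)

# Either of these selects the first column by position instead.
first_by_iloc = df.iloc[:, 0].to_numpy()
first_by_array = df.to_numpy()[:, 0]
```

This suggests, though it is not confirmed in the thread, that passing total_sims.to_numpy() instead of the DataFrame might sidestep the error in the affected version.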

Aye, replicated it too. The issue arose because I wanted to be more permissive of pandas DataFrames. I didn't catch an edge case, so I've added tests for it.

The fix should be pushed up by tomorrow.

Should be fixed now.
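As an illustration of the kind of edge case described in this thread, here is a minimal sketch of coercing the input to a NumPy array before any positional indexing, with a check that a DataFrame and a plain array give the same column pairs. The helper name pairwise_columns is hypothetical and this is not copulae's actual fix; it only shows one way such a regression test could be shaped.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def pairwise_columns(x):
    """Hypothetical helper: yield ((i, j), (col_i, col_j)) for a 2-D input."""
    # np.asarray turns both a DataFrame and an ndarray into an ndarray,
    # so the positional indexing below is always valid.
    arr = np.asarray(x, dtype=float)
    if arr.ndim != 2:
        raise ValueError("x must be a matrix with dimension 2")
    for i, j in combinations(range(arr.shape[1]), 2):
        yield (i, j), (arr[:, i], arr[:, j])


raw = np.array([[0.1, 0.3], [0.2, 0.1], [0.3, 0.2]])
frame = pd.DataFrame(raw, columns=["V1", "V2"])

# The same pairs should come out regardless of the container type.
from_array = list(pairwise_columns(raw))
from_frame = list(pairwise_columns(frame))
```

Coercing once at the boundary keeps the downstream code free of container-specific branches, at the cost of dropping column labels, which would have to be carried separately if they are needed later.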