johannfaouzi / pyts

A Python package for time series classification

Home Page: https://pyts.readthedocs.io

Could we keep the default setting of ensure_min_samples=1, ensure_min_features=1 in the check_array for cost_mat

KimUyen opened this issue · comments

cost_mat = check_array(cost_mat, ensure_min_samples=2,

Hi team,
I'm using metrics.dtw from pyts, it's really awesome.

I'm having an issue when trying to find the optimal path between a 1-element array and an n-element array: the call fails with an error raised by check_array for cost_mat.
Example:

from pyts.metrics import dtw
x = [0]
y = [2, 0, 1]

dtw(x, y)  # raises an error at check_array for cost_mat

I'd like to keep the default settings of ensure_min_samples=1, ensure_min_features=1 in the check_array call for cost_mat.
Can I do that, and are there any risks? Looking forward to your response.

Thanks.

Thanks for pointing this out.

Ensuring that both time series have at least 2 elements is indeed not necessary, and the code works as intended when setting ensure_min_samples=1, ensure_min_features=1.
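To illustrate the point about the validation step, here is a minimal sketch that calls scikit-learn's check_array directly on a 1 x 3 cost matrix. The matrix values are an assumption (squared differences between x = [0] and y = [2, 0, 1]), and the snippet does not reproduce the pyts source; it only shows how the two settings behave.

```python
import numpy as np
from sklearn.utils import check_array

# Assumed 1 x 3 cost matrix: squared differences between x = [0] and y = [2, 0, 1].
cost_mat = np.array([[4.0, 0.0, 1.0]])

# scikit-learn's defaults (ensure_min_samples=1, ensure_min_features=1) accept it.
checked = check_array(cost_mat, ensure_min_samples=1, ensure_min_features=1)

# Requiring at least 2 rows and 2 columns rejects it with a ValueError.
try:
    check_array(cost_mat, ensure_min_samples=2, ensure_min_features=2)
    rejected = False
except ValueError:
    rejected = True
```

With the stricter setting, the single-row matrix fails validation before any path computation starts, which matches the error reported above.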

However, when an array has only 1 element, the optimal path is trivial because there is only one possible path: the list of index pairs (0, i) (or (i, 0) if the second array is the one with 1 element) for i in {0, ..., max(len(x), len(y)) - 1}.

If you don't want to modify the source code of the package, you can use the following code to handle this special case:

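import numpy as np
from pyts.metrics import dtw


x = np.array([0])
y = np.array([2, 0, 1])

if (x.size > 1) and (y.size > 1):
    # General case: both series have at least 2 elements.
    dtw_score, path = dtw(x, y, return_path=True)
else:
    # Special case: the single element is matched with every element of the
    # other series (x - y broadcasts the 1-element array).
    dtw_score = np.sqrt(np.sum((x - y) ** 2))
    if x.size == 1:
        # Index pairs (0, i) for i in 0, ..., y.size - 1
        path = np.vstack([np.zeros(y.size, dtype='int64'), np.arange(y.size)])
    else:
        # Index pairs (i, 0) for i in 0, ..., x.size - 1
        path = np.vstack([np.arange(x.size), np.zeros(x.size, dtype='int64')])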
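As a sanity check on the shortcut score, the sketch below compares it with a plain dynamic-programming DTW written in NumPy. The helper dtw_reference is hypothetical and is not the pyts implementation; it assumes a squared-difference cost and a square root of the accumulated cost, which is what the shortcut formula implies.

```python
import numpy as np


def dtw_reference(x, y):
    """Textbook DTW (hypothetical helper): squared-difference cost, sqrt of total."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cost = (x[:, None] - y[None, :]) ** 2  # pairwise squared differences
    acc = np.full((x.size + 1, y.size + 1), np.inf)  # padded accumulated cost
    acc[0, 0] = 0.0
    for i in range(1, x.size + 1):
        for j in range(1, y.size + 1):
            # Cheapest way to reach cell (i, j): from above, left, or diagonal.
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return np.sqrt(acc[-1, -1])


x = np.array([0])
y = np.array([2, 0, 1])

shortcut = np.sqrt(np.sum((x - y) ** 2))  # closed form for the 1-element case
reference = dtw_reference(x, y)           # full dynamic program
```

Under these assumptions both values agree, because with a single row the only admissible path visits every column exactly once.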

Let me know if this answers your question.

Hi johannfaouzi,

Thanks for your solution and explanation. It works for me.