Add standard deviation

Question

vnmabus opened this issue a year ago · comments

Carlos Ramos Carreño commented a year ago

Add a method for computing the standard deviation of functional data, for both discretized and basis-expansion representations.

Pablo Cuesta Sierra · Answer 1 · Thu Jul 06 2023 00:30:46 GMT+0800 (China Standard Time)

One design question for the std function needs to be settled first: which normalization coefficient to apply, and whether the user should be able to choose it.

The definition of std provided in Kokoszka and Reimherr (2017) is:

$$(std_X(t) ) ^2=\frac{1}{N} \sum_{n=1}^{N} (X_n(t) - \overline{X}(t))^2.$$

This normalization by $N$ is the default used in numpy.var and numpy.std, the latter being the most natural function to use in the implementation of FDataGrid.std:

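```python
import numpy as np
from skfda import FDataGrid  # scikit-fda

def std(X: FDataGrid) -> FDataGrid:
    # Pointwise std across samples (axis 0); np.std divides by N (ddof=0).
    return X.copy(
        data_matrix=np.array([np.std(X.data_matrix, axis=0)]),
        sample_names=("standard deviation",),
    )
```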

However, the easiest implementation of FDataBasis.std uses the FDataBasis.cov method. FDataBasis.cov calculates the covariance using the formula:

$$K_X(t, s)=\frac{1}{N-1} \sum_{n=1}^{N} (X_n(t) - \overline{X}(t))(X_n(s) - \overline{X}(s)),$$

because $(N-1)$ is the default normalization used by numpy.cov.
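
To make the mismatch concrete, here is a small NumPy-only check. It is a sketch on made-up toy data and does not touch FDataGrid or FDataBasis; it only shows that the diagonal of numpy.cov and the output of numpy.var differ by a factor of N/(N-1) under their defaults.

```python
import numpy as np

# Toy discretized sample (made-up values): N = 4 curves on 3 grid points.
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 9.0],
    [6.0, 8.0, 10.0],
])
N = X.shape[0]

# numpy.var / numpy.std default to ddof=0, i.e. they divide by N.
var_default = np.var(X, axis=0)

# numpy.cov defaults to dividing by N - 1; rowvar=False treats rows as samples.
cov_default = np.cov(X, rowvar=False)

# The pointwise variance read off the covariance diagonal is larger by N / (N - 1).
mismatch_factor = np.diag(cov_default) / var_default

# Passing ddof=0 to numpy.cov makes the two normalizations agree.
cov_ddof0 = np.cov(X, rowvar=False, ddof=0)
```

So a std computed as the square root of the diagonal of a covariance with the (N-1) divisor would not match the result of numpy.std with its default settings.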

A natural solution to this issue would be to make the signature of std similar to that of numpy.std, where there is a parameter:

ddof: int, optional
Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.

But including this ddof parameter in std would require adding a similar one to the cov function.
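
A minimal sketch of what sharing a ddof parameter between the two functions could look like, written on plain NumPy arrays rather than on the library's classes. The names cov_grid and std_grid are hypothetical, and the default ddof=1 is only a placeholder, since the default is exactly what is under discussion here.

```python
import numpy as np


def cov_grid(data: np.ndarray, ddof: int = 1) -> np.ndarray:
    """Covariance matrix of discretized curves (rows are samples).

    Hypothetical helper: the divisor is N - ddof, as in numpy.std.
    """
    data = np.asarray(data, dtype=float)
    n_samples = data.shape[0]
    if n_samples - ddof <= 0:
        raise ValueError("ddof must be smaller than the number of samples")
    centered = data - data.mean(axis=0)
    return centered.T @ centered / (n_samples - ddof)


def std_grid(data: np.ndarray, ddof: int = 1) -> np.ndarray:
    """Pointwise std, taken from the covariance diagonal with the same ddof."""
    return np.sqrt(np.diag(cov_grid(data, ddof=ddof)))


# Toy data (made up): 4 curves on 3 grid points.
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 9.0],
    [6.0, 8.0, 10.0],
])
std_n = std_grid(X, ddof=0)          # divisor N
std_n_minus_1 = std_grid(X, ddof=1)  # divisor N - 1
```

Because std_grid forwards ddof to cov_grid, both functions always use the same divisor, whichever default is eventually chosen.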

Pablo Cuesta Sierra · Answer 2 · Thu Jul 06 2023 00:32:57 GMT+0800 (China Standard Time)

I closed this issue by accident. I'm sorry.