ghammad / pyActigraphy

Python-based open source package for actigraphy data analysis

Home Page:https://ghammad.github.io/pyActigraphy

Mismatch between documentation and code for the computation of non-parametric variables IS and IV

achey2016 opened this issue · comments

The documentation for IS and IV uses formulas with the uncorrected variance,
but the code uses pandas.Series.var without specifying ddof=0, so it applies the bias correction by default (ddof=1).

For long recordings the results should be almost the same, but for shorter recordings this could lead to slight differences from other tools.
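
To make the difference concrete, here is a minimal sketch on a hypothetical toy series (not real actigraphy data). It only relies on the `ddof` parameter of `pandas.Series.var`, whose default is 1:

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, chosen only to keep the arithmetic easy to follow.
x = pd.Series([1.0, 2.0, 3.0, 4.0])
n = len(x)

var_unbiased = x.var()            # pandas default: ddof=1, divides by n - 1
var_population = x.var(ddof=0)    # divides by n, as in the documented formula

# The two estimates differ by the factor n / (n - 1).
assert np.isclose(var_unbiased, var_population * n / (n - 1))
```

The factor n / (n - 1) approaches 1 as the number of samples grows, which is why the choice of `ddof` matters most for short series.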

In the documentation for IS:

This variable is defined in [1]:

$$IS = \frac{d^{24h}}{d^{1h}}$$

with:

$$d^{1h} = \sum_{i}^{n}\frac{\left(x_{i}-\bar{x}\right)^{2}}{n}$$

where $x_{i}$ is the number of active (counts higher than a
predefined threshold) minutes during the $i^{th}$ period,
$\bar{x}$ is the mean of all data and $n$ is the number of
periods covered by the actigraphy data and with:

$$d^{24h} = \sum_{i}^{p} \frac{ \left( \bar{x}_{h,i} - \bar{x} \right)^{2} }{p}$$

What the current implementation does:

$$IS = \frac{d^{24h}}{d^{1h}}$$

with:

$$d^{1h} = \sum_{i}^{n}\frac{\left(x_{i}-\bar{x}\right)^{2}}{n-1} = \mathrm{data.var()}$$

and:

$$d^{24h} = \sum_{i}^{p} \frac{ \left( \bar{x}_{h,i} - \bar{x} \right)^{2} }{p-1} = \mathrm{data.groupby([ data.index.hour, data.index.minute, data.index.second] ).mean().var()}$$
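
The two conventions can be compared side by side with a small sketch. It assumes synthetic hourly data and a hypothetical helper `interdaily_stability`, which is not part of pyActigraphy and only mirrors the two expressions quoted above with an explicit `ddof`:

```python
import numpy as np
import pandas as pd


def interdaily_stability(data: pd.Series, ddof: int) -> float:
    """Hypothetical helper: IS = d24h / d1h with an explicit ddof."""
    # Variance of all samples.
    d_1h = data.var(ddof=ddof)
    # Variance of the average daily profile (one mean per time of day).
    d_24h = data.groupby(
        [data.index.hour, data.index.minute, data.index.second]
    ).mean().var(ddof=ddof)
    return d_24h / d_1h


# Synthetic data (an assumption for illustration): 3 days of hourly values
# with a daily rhythm plus noise.
rng = np.random.default_rng(0)
index = pd.date_range("2020-01-01", periods=3 * 24, freq=pd.Timedelta(hours=1))
hours = index.hour.to_numpy()
values = 100 + 50 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 10, len(index))
data = pd.Series(values, index=index)

is_doc = interdaily_stability(data, ddof=0)   # formula in the documentation
is_code = interdaily_stability(data, ddof=1)  # what data.var() computes

# With n samples and p time-of-day bins, the two values differ by
# p * (n - 1) / ((p - 1) * n).
n, p = len(data), 24
factor = p * (n - 1) / ((p - 1) * n)
assert np.isclose(is_code, is_doc * factor)
```

For this toy setup (n = 72, p = 24) the factor is roughly 1.03, so the `ddof=1` version reports a slightly larger IS than the documented formula.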