Lagged auto-correlations of random_normal not null
yruprich opened this issue · comments
Description of the bug
Hi, I believe there is a shortcoming with the function random_normal.
When I generate vectors of T elements with this function, I find that, on average, the lagged auto-correlation of those vectors is not 0 at nonzero lags. The auto-correlation value tends to -1/(T-1).
Example:
N = toint(10^7)
T = 100
invT = -1./(T-1.)
sd = 1
av = 0
mxlag = 10
random_setallseed(1,1) ; (36484749, 9494848)
X = random_normal(av,sd,(/N,T/))
acf = esacr(X,mxlag)
print("mean auto-correlation of random_normal vector of length T="+T+": "+dim_avg_n_Wrap(acf,0))
print("to be compared with -1/(T-1) = "+invT)
Computing environment
I have this problem in all three environments I tried:
- Linux, Ubuntu 20.04.4 LTS, NCL 6.6.2, installed with
apt install ncl-ncarg
- Linux, OpenSUSE Leap 42.3, NCL 6.3.0, installed with pre-compiled binaries "version-CentOS7.6_64bit_nodap_gnu485.tar.gz"
- Linux, Red Hat Enterprise Linux 8.4, NCL 6.6.2, built from sources
Additional context
The problem I am referring to might seem tiny. However, it leads to larger biases when those vectors are used as seeds to generate auto-regressive time series. It is also problematic when one uses this function to build bootstrap statistical tests.
Cheers,
Yohan
Actually, I am facing the same problem with Python (v2.7.9 and v3.7.4):
import numpy as np
N = 10000000
T = 100
invT = -1./(T-1.)
sd = 1
av = 0
mxlag = 10
X = np.random.normal(av, sd, size=(N, T))
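# allocate a separate array: a slice of X would be a view, so writing to it would overwrite X
acf = np.empty((N, mxlag+1))
for i in range(N):
    # lag 0 is 1 by definition; other lags use the Pearson correlation of the lagged segments
    acf[i,:] = [1. if l==0 else np.corrcoef(X[i,l:],X[i,:-l])[0][1] for l in range(mxlag+1)]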
acf_mean=np.average(acf, axis=0)
print('mean auto-correlation of random_normal vector of length T=',T,' : ',acf_mean)
print('to be compared with -1/(T-1) = ',invT)
Link to the related Python issue I opened: https://stackoverflow.com/questions/73392044/lagged-auto-correlations-of-numpy-random-normal-not-nul
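For anyone who wants to reproduce this without the per-row loop and the roughly 8 GB array, here is a minimal vectorized sketch. It assumes that N = 100000 rows are enough to see the effect (the original uses 10^7), and `mean_lagged_acf` is a hypothetical helper name, not a NumPy function. It computes the same Pearson correlation between lagged segments as `np.corrcoef` does in the script above.

```python
import numpy as np

def mean_lagged_acf(X, mxlag):
    """Row-averaged Pearson correlation between each row and its lagged copy."""
    out = np.ones(mxlag + 1)  # lag 0 is 1 by definition
    for lag in range(1, mxlag + 1):
        a = X[:, lag:]
        b = X[:, :-lag]
        # centre each segment on its own sample mean, as np.corrcoef does
        a = a - a.mean(axis=1, keepdims=True)
        b = b - b.mean(axis=1, keepdims=True)
        r = (a * b).sum(axis=1) / np.sqrt((a * a).sum(axis=1) * (b * b).sum(axis=1))
        out[lag] = r.mean()
    return out

rng = np.random.default_rng(1)
N, T, mxlag = 100_000, 100, 10  # N reduced from 10**7 (assumption: still enough rows)
X = rng.normal(0.0, 1.0, size=(N, T))
acf_mean = mean_lagged_acf(X, mxlag)
print("mean auto-correlation of vectors of length T =", T, ":", acf_mean)
print("to be compared with -1/(T-1) =", -1.0 / (T - 1.0))
```

With this smaller N the averages are noisier than in the full run, so they should only be expected to land near -1/(T-1), not match it to many digits.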
Actually, this is not a shortcoming of the NCL function. The problem comes from the bias in the estimator of the auto-correlation, which was already documented back in 1954.
Reference: Marriott, F. H. C., and J. A. Pope. "Bias in the estimation of autocorrelations." Biometrika 41.3/4 (1954): 390-402 (https://www.jstor.org/stable/2332719)
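A short illustration of where the bias comes from, as I understand it. For T independent values with variance σ², the deviations from the sample mean are themselves correlated: Cov(x_i - x̄, x_j - x̄) = -σ²/T for i ≠ j and Var(x_i - x̄) = σ²(1 - 1/T), so their correlation is exactly -1/(T-1). The sketch below assumes the true mean (av = 0) is known and compares a lag-1 estimate centred on the sample mean with one centred on the true mean. `lag1_autocorr` is a hypothetical helper and uses the common sum-of-products estimator, not the exact formula of `esacr` or `np.corrcoef`.

```python
import numpy as np

def lag1_autocorr(X, mean):
    """Lag-1 autocorrelation of each row, centred on the given mean."""
    d = X - mean
    num = (d[:, 1:] * d[:, :-1]).sum(axis=1)
    den = (d * d).sum(axis=1)
    return num / den

rng = np.random.default_rng(1)
N, T = 100_000, 100  # assumption: fewer rows than the original 10**7
X = rng.normal(0.0, 1.0, size=(N, T))

# centred on each row's own sample mean: negative bias of order 1/T
r_sample = lag1_autocorr(X, X.mean(axis=1, keepdims=True)).mean()
# centred on the known true mean (0): the bias disappears
r_known = lag1_autocorr(X, 0.0).mean()

print("sample-mean centring:", r_sample)
print("true-mean centring  :", r_known)
print("-1/T =", -1.0 / T, "  -1/(T-1) =", -1.0 / (T - 1.0))
```

So the random numbers themselves are fine; the negative value appears as soon as the mean has to be estimated from the same short series.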