fitReciprocalErrorModel IP data

Question

fitReciprocalErrorModel IP data

HenryWHR opened this issue 2 months ago · comments

Hi,

I am wondering if there is a way to fit normal/reciprocal error model for IP data. It works fine with DC data by
err = ert.fitReciprocalErrorModel(data, nBins=20, show=True, rel=False)

but have no idea how to do this with IP data in pygimli.

Thanks in advance for your kind reply!

Best regards,
Henry

Thomas Günther · Answer 1 · Fri May 24 2024 21:40:31 GMT+0800 (China Standard Time)

Very good point! It should of course also work for IP using

iperr=ert.fitReciprocalErrorModel(data, ip=True, ...)

using the approach of Flores Orozco et al. (2012), and

ert.reciprocalProcessing(data)

should do it automatically if IP data are present.

I personally don't have any IP dataset with reciprocals. Mostly I am measuring with multi-gradient array.

HenryWHR · Answer 2 · Fri May 24 2024 22:08:42 GMT+0800 (China Standard Time)

Hi Thomas,

Thank you very much for your quick reply! I attached one dataset where some N/R measurements are included. Unfortunately, neither option works for IP data on my laptop. The error says [fitReciprocalErrorModel() got an unexpected keyword argument 'ip']. With ert.reciprocalProcessing(data), the returned iperr are all zero although IP data are present.

Details

OS : Windows
CPU(s) : 14
Machine : AMD64
Architecture : 64bit
RAM : 31.7 GiB
Environment : IPython

Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:27:10) [MSC
v.1938 64 bit (AMD64)]

       pygimli : 1.5.1
        pgcore : 1.5.0
         numpy : 1.26.4
    matplotlib : 3.8.4
         scipy : 1.13.0
          tqdm : 4.66.4
       IPython : 8.20.0
       pyvista : 0.43.8

testdata.zip

Thomas Günther · Answer 3 · Fri May 31 2024 14:57:31 GMT+0800 (China Standard Time)

Dear Henry, it was nice to meet you on the IP workshop and discuss the matter. Unlike apparent resistivity, IP data may not fully fulfil reciprocity as electromagnetic effects like capacitive (cable) coupling or inductive coupling through the subsurface depend on the measuring layout which can be different for normal and reverse measurement. So one might need to distinguish systematic errors from Gaussian noise. We can extend this discussion also in the IP community

HenryWHR · Answer 4 · Sat Jun 01 2024 23:04:13 GMT+0800 (China Standard Time)

Hi Thomas,

It was a great pleasure talking to you during the IP workshop. I agree that reciprocal error can be tricky for IP data, and we should therefore be careful with it when processing. As we discussed, I will look into the source code to implement this part (error model estimation) for IP data. I believe that it is useful in some cases and not too complicated to add to pygimli.

I suggest that we close this issue in pygimli for now.

HenryWHR · Answer 5 · Tue Jul 09 2024 06:59:01 GMT+0800 (China Standard Time)

Hi Thomas,

I have now implemented the normal/reciprocal error model estimation for IP data. I believe it might be useful sometimes. The modified code and a synthetic dataset for testing are attached below. It works fine on my laptop.

def fitReciprocalErrorModel(data, nBins=None, show=False, rel=False, ip=False):
    """Fit an error by statistical normal-reciprocal analysis.

    Identify normal reciprocal pairs and fit error model to it

    Parameters
    ----------
    data : pg.DataContainerERT
        input data
    nBins : int
        number of bins to subdivide data (4 < data.size()//30 < 30)
    rel : bool [False]
        fit relative instead of absolute errors
    show : bool [False]
        generate plot of reciprocals, mean/std bins and error model

    Returns
    -------
    ab : [float, float]
        indices of a+b*R (rel=False) or a+b/R (rec=True)
    """
    R = getResistance(data)
    
    iF, iB = reciprocalIndices(data, True)
    n30 = len(iF) // 30
    nBins = nBins or np.maximum(np.minimum(n30, 30), 4)
    RR = np.abs(R[iF] + R[iB]) / 2
    sInd = np.argsort(RR)
    RR = RR[sInd]
    dR = (R[iF] - R[iB])[sInd]
    inds = np.linspace(0, len(RR), nBins+1, dtype=int)
    stdR = np.zeros(nBins)
    meanR = np.zeros(nBins)
    for b in range(nBins):
        ii = range(inds[b], inds[b+1])
        stdR[b] = np.std(dR[ii])
        meanR[b] = np.mean(RR[ii])

    G = np.ones([len(meanR), 2]) # a*x+b
    w = np.reshape(np.isfinite(meanR), [-1, 1])
    meanR[np.isnan(meanR)] = 1e-16
    stdR[np.isnan(stdR)] = 0
    if rel:
        G[:, 1] = 1 / meanR
        ab, *_ = np.linalg.lstsq(w*G, stdR/meanR, rcond=None)
    else:
        G[:, 1] = meanR
        ab, *_ = np.linalg.lstsq(w*G, stdR, rcond=None)
        
        
    if ip:
        m = data['ip']
        dm = (m[iF] - m[iB])[sInd]
        stdm = np.zeros(nBins)
        for b in range(nBins):
            ii = range(inds[b], inds[b+1])
            stdm[b] = np.std(dm[ii])
    
        # fitting the inverse power law model (Flores Orozco et al., 2012) in log-space (i.e. linear fitting)
        logstdm = np.log(stdm)
        logmR = np.log(meanR)
        G_ip = np.ones([len(logmR), 2]) # a*x+b
        w_ip = np.reshape(np.isfinite(logmR), [-1, 1])
        # logmR[np.isnan(logmR)] = 1e-16
        # logstdm[np.isnan(logstdm)] = 0
        G_ip[:, 1] = logmR
        abip, *_ = np.linalg.lstsq(w_ip*G_ip, logstdm, rcond=None)
        abip[0] = np.exp(abip[0]) # convert to the parameter in power law model

    if show:
        x = np.logspace(np.log10(min(meanR)), np.log10(max(meanR)), 30)
        _, ax = pg.plt.subplots()
        if rel:
            ax.semilogx(RR, dR/RR, '.', label='data')  # /RR
            ii = meanR > 1e-12
            x = x[x > 1e-12]
            eModel = ab[0] + ab[1] / x
            ax.plot(meanR[ii], stdR[ii]/meanR[ii], '*', label='std dev') # /meanR
            ax.set_title(r'$\delta$R/R={:.6f}+{:.6f}/|R|'.format(*ab))
            ax.set_ylabel(r'$\delta$R/R')
        else:
            eModel = ab[0] + ab[1] * x
            ax.semilogx(RR, dR, '.', label='data')  # /RR
            ax.plot(meanR, stdR, '*', label='std dev') # /meanR
            ax.set_title(r'$\delta$R={:.6f}+{:.6f}*|R|'.format(*ab))
            ax.set_ylabel(r'$\delta$R ($\Omega$)')

        ax.plot(x, eModel, '-', label='error model') # /x
        ax.grid(which='both')
        ax.set_xlabel(r'R ($\Omega$)')
        ax.legend()
        if ip:
            _, ax = pg.plt.subplots()
            eModel_ip = abip[0] * (x**abip[1])
            ax.semilogx(RR, dm, '.', label='data')  
            ax.plot(meanR, stdm, '*', label='std dev') 
            ax.set_title(r'$\delta\phi$={:.6f}*|R|^{:.6f}'.format(*abip))
            ax.set_ylabel(r'$\delta\phi$ (mrad)')
            ax.plot(x, eModel_ip, '-', label='error model')
            ax.grid(which='both')
            ax.set_xlabel(r'R ($\Omega$)')
            ax.legend()
            return ab, abip, ax
        else:
            return ab, ax
    else:
        return ab, abip

testdata.zip

Thomas Günther · Answer 6 · Tue Jul 09 2024 20:22:10 GMT+0800 (China Standard Time)

Many thanks for this effort!

I also checked the code with your testfile. With rel=False I got a resistance error of -3milliOhms plus +7% but with rel=True I got 3.7 milliOhms plus 7% which really makes sense. For the IP error I got $d\Phi=1.4|R|^(-0.5)$ which is very similar to the values in the upper right plot of Fig. 7 in Adrians paper ($1.92 |R|^(-0.61)$).

The error model is between 0.3 and 2.5 mrad (note the symlog scale) which makes sense to me.
We need to check with other files how robust this procedure is. Anyone with data is welcome to test and report.

Thomas Günther · Answer 7 · Tue Jul 09 2024 20:31:07 GMT+0800 (China Standard Time)

I just don't like the four possible different output arguments (ab only, ab/ax, ab/abip, ab/abip/ax) and that the first axes is lost in case of IP. I would rather redirect ip=True (or ip=anyarray) to do IP reciprocal analysis only so that it can also be used to do it spectrally and the output is the same.