brodeau / aerobulk

AeroBulk is a modern-FORTRAN-based package/library that gathers state-of-the-art aerodynamic bulk formulae algorithms used to compute turbulent air-sea fluxes of momentum, heat and freshwater.

Refactoring stable/unstable conditions into IF statements

samhatfield opened this issue

Hi Laurent,

I'm developing a single-precision version of NEMO at ECMWF at the moment, which means that I've come across your code, in OCE/SBC/sbcblk_algo_ecmwf.F90, if you're familiar with the NEMO structure. I've found that statements like

zzeta = MIN( pzeta(ji,jj) , 5._wp ) !! Very stable conditions (L positif and big!):
!
! Unstable (Paulson 1970):
!   eq.3.20, Chap.3, p.33, IFS doc - Cy31r1
zx = SQRT(ABS(1._wp - 16._wp*zzeta))
ztmp = 1._wp + SQRT(zx)
ztmp = ztmp*ztmp
psi_unst = LOG( 0.125_wp*ztmp*(1._wp + zx) )   &
    &       -2._wp*ATAN( SQRT(zx) ) + 0.5_wp*rpi
!
! Stable:
! eq.3.22, Chap.3, p.33, IFS doc - Cy31r1
psi_stab = -2._wp/3._wp*(zzeta - 5._wp/0.35_wp)*EXP(-0.35_wp*zzeta) &
    &       - zzeta - 2._wp/3._wp*5._wp/0.35_wp
!
! Combining:
stab = 0.5_wp + SIGN(0.5_wp, zzeta) ! zzeta > 0 => stab = 1
!
psi_m_ecmwf(ji,jj) = (1._wp - stab) * psi_unst & ! (zzeta < 0) Unstable
    &                +      stab  * psi_stab      ! (zzeta > 0) Stable

can cause problems in single precision, because the value computed in the branch that is not selected (and then multiplied by zero at the end) can be too large to fit in a single-precision variable (e.g. over 10^39). Instead I've decided to refactor this at ECMWF as

zzeta = MIN( pzeta(ji,jj) , 5._wp ) !! Very stable conditions (L positif and big!):
!
zx = SQRT(ABS(1._wp - 16._wp*zzeta))
ztmp = 1._wp + SQRT(zx)
ztmp = ztmp*ztmp
!
IF (zzeta >= 0.0_wp) THEN
    ! Stable
    ! eq.3.22, Chap.3, p.33, IFS doc - Cy31r1
    psi_m_ecmwf(ji,jj) = -2._wp/3._wp*(zzeta - 5._wp/0.35_wp)*EXP(-0.35_wp*zzeta) &
        &                 - zzeta - 2._wp/3._wp*5._wp/0.35_wp
ELSE
    ! Unstable
    ! eq.3.20, Chap.3, p.33, IFS doc - Cy31r1
    psi_m_ecmwf(ji,jj) = LOG( 0.125_wp*ztmp*(1._wp + zx) )   &
        &                 -2._wp*ATAN( SQRT(zx) ) + 0.5_wp*rpi
ENDIF

which avoids potential overflow-causing intermediate values, and in any case uses fewer FLOPs.

This occurs here, here and here, as far as I can see.
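To make the overflow concern concrete, here is a minimal sketch that puts both forms side by side in single precision. The module name `psi_m_sketch`, the function names `psi_m_blend` and `psi_m_branch`, and the scalar `elemental` interface are assumptions for illustration; they are not AeroBulk or NEMO routines. The formulas are copied from the snippets above, with the `zx`/`ztmp` setup moved into the unstable branch of the IF version.

```fortran
module psi_m_sketch
   ! Hypothetical module for illustration only (not part of AeroBulk or NEMO).
   use, intrinsic :: iso_fortran_env, only: real32
   implicit none
   private
   public :: wp, psi_m_blend, psi_m_branch

   integer,  parameter :: wp  = real32                 ! working precision: single
   real(wp), parameter :: rpi = 3.141592653589793_wp

contains

   ! Original style: evaluate both branches, then blend with a 0/1 weight.
   elemental function psi_m_blend(pzeta) result(psi)
      real(wp), intent(in) :: pzeta
      real(wp) :: psi, zzeta, zx, ztmp, psi_unst, psi_stab, stab
      zzeta = min(pzeta, 5._wp)
      zx    = sqrt(abs(1._wp - 16._wp*zzeta))
      ztmp  = 1._wp + sqrt(zx)
      ztmp  = ztmp*ztmp
      psi_unst = log(0.125_wp*ztmp*(1._wp + zx)) - 2._wp*atan(sqrt(zx)) + 0.5_wp*rpi
      ! exp(-0.35*zzeta) exceeds the real32 range for strongly negative zzeta
      psi_stab = -2._wp/3._wp*(zzeta - 5._wp/0.35_wp)*exp(-0.35_wp*zzeta) &
         &       - zzeta - 2._wp/3._wp*5._wp/0.35_wp
      stab = 0.5_wp + sign(0.5_wp, zzeta)              ! zzeta > 0 => stab = 1
      psi  = (1._wp - stab)*psi_unst + stab*psi_stab
   end function psi_m_blend

   ! Refactored style: only the branch that is needed gets evaluated.
   elemental function psi_m_branch(pzeta) result(psi)
      real(wp), intent(in) :: pzeta
      real(wp) :: psi, zzeta, zx, ztmp
      zzeta = min(pzeta, 5._wp)
      if (zzeta >= 0._wp) then
         ! Stable
         psi = -2._wp/3._wp*(zzeta - 5._wp/0.35_wp)*exp(-0.35_wp*zzeta) &
            &  - zzeta - 2._wp/3._wp*5._wp/0.35_wp
      else
         ! Unstable
         zx   = sqrt(abs(1._wp - 16._wp*zzeta))
         ztmp = 1._wp + sqrt(zx)
         ztmp = ztmp*ztmp
         psi  = log(0.125_wp*ztmp*(1._wp + zx)) - 2._wp*atan(sqrt(zx)) + 0.5_wp*rpi
      end if
   end function psi_m_branch

end module psi_m_sketch
```

With an illustrative input such as `pzeta = -1000._wp`, the unused stable term in `psi_m_blend` overflows, and under IEEE arithmetic without trapping the final `0 * Inf` turns the result into a NaN, whereas `psi_m_branch` stays finite. For moderate values of `pzeta` the two functions should agree to rounding.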

Since I'm not familiar with this code, could you tell me if there's a specific reason for the original style? Do I need to be careful in refactoring these as IF statements?

Thanks!

Dear Sam,

It is definitely okay to go for the "IF" approach!
I tend to favor the current approach (avoiding IF and WHERE statements) based on (maybe outdated) vectorization considerations...

But let me drift to something more general. I'm just curious: are you aware that there is currently a substantial commitment from the NEMO system team and developers to make a "lossless" single-precision version of NEMO? This is led by guys at BSC in Barcelona, who have developed an automated approach to detect the routines for which using SP is transparent, and those for which it is not acceptable!
I think Nils W. and Kristian M. are aware of this ongoing development effort. Maybe you guys have chosen a different approach, or are you taking part in it as well?

Cheers, /laurent

Great, I'll stick with the IF approach for now and see if it causes any problems! You're right about vectorisation - I did have to introduce another DO loop for one of the examples I gave above.
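If the branch-free form matters for vectorisation, one possible middle ground is to clamp the argument seen by each branch so that both are always safe to evaluate, and then select the result with `MERGE`. This is only a sketch under assumptions not made in this thread: the module name `psi_m_clamp_sketch`, the function name `psi_m_clamped` and the helper variables `zs` and `zu` are hypothetical, and whether it actually vectorises better than the IF version depends on the compiler and should be measured.

```fortran
module psi_m_clamp_sketch
   ! Hypothetical module for illustration only (not part of AeroBulk or NEMO).
   use, intrinsic :: iso_fortran_env, only: real32
   implicit none
   private
   public :: wp, psi_m_clamped

   integer,  parameter :: wp  = real32                 ! working precision: single
   real(wp), parameter :: rpi = 3.141592653589793_wp

contains

   elemental function psi_m_clamped(pzeta) result(psi)
      real(wp), intent(in) :: pzeta
      real(wp) :: psi, zzeta, zs, zu, zx, ztmp, psi_unst, psi_stab
      zzeta = min(pzeta, 5._wp)
      zs = max(zzeta, 0._wp)   ! argument for the stable formula, in [0, 5]
      zu = min(zzeta, 0._wp)   ! argument for the unstable formula, <= 0
      ! Unstable formula: 1 - 16*zu >= 1, so the ABS of the original is not needed
      zx   = sqrt(1._wp - 16._wp*zu)
      ztmp = 1._wp + sqrt(zx)
      ztmp = ztmp*ztmp
      psi_unst = log(0.125_wp*ztmp*(1._wp + zx)) - 2._wp*atan(sqrt(zx)) + 0.5_wp*rpi
      ! Stable formula: the EXP argument is <= 0, so EXP cannot overflow
      psi_stab = -2._wp/3._wp*(zs - 5._wp/0.35_wp)*exp(-0.35_wp*zs) &
         &       - zs - 2._wp/3._wp*5._wp/0.35_wp
      ! Both candidates are finite, so selecting one is safe
      psi = merge(psi_stab, psi_unst, zzeta >= 0._wp)
   end function psi_m_clamped

end module psi_m_clamp_sketch
```

Because the function is `elemental`, it can be applied to a whole 2D field in one statement, e.g. `psi(:,:) = psi_m_clamped(pzeta(:,:))`. The cost is that both formulas are still evaluated at every point, so the FLOP saving of the IF version is lost.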

Rest assured that we're working very closely with the BSC guys :) In fact I'm testing a version of NEMO 4.0.1 that Oriol Tinto gave me. I'm not sure if you're involved with the NEMO HPC working group, but ECMWF and BSC will give a joint telco presentation on this tomorrow.