Unidata / gempak

Analysis and product generation for meteorological data.

Occasional hang on gpend

dkokron opened this issue

We are experiencing occasional hangs in gpend using both gempak.v7.14.1 and gempak.v7.15.1.

Here is a stack trace for a hung gpend:

#0  0x000014d10857d8c6 in msgrcv () from /lib64/libc.so.6
#1  0x000000000040d5a9 in crecv_ (itype=0x73b654 <adbuff_+20>, iwait=0x4bed14 <__NLITPACK_0>, ichan=0x73b64c <adbuff_+12>, idata=0x73b65c <adbuff_+28>, iret=0x7fff91259a18) at /intel/19.1.3.304/gempak.v7.15.1/nawips/gempak/source/syslib/crecv.c:64
#2  0x000000000040c761 in grecv (itype=2, iwait=0, ichan=12058710, idata=0, iret=0) at /intel/19.1.3.304/gempak.v7.15.1/nawips/gempak/source/syslib/grecv.f:29
#3  0x000000000040cbf6 in gget (idata=..., nw=2, iret=0) at /intel/19.1.3.304/gempak.v7.15.1/nawips/gempak/source/syslib/gget.f:44
#4  0x0000000000403182 in gpend () at /intel/19.1.3.304/gempak.v7.15.1/nawips/gempak/source/programs/gp/gpend/gpend.f:38

I understand there should be a corresponding gplt process, but none is present on the system.

We are running under SLES15sp3 on AMD chips. The code was compiled with Intel-19.1.3.304.

Any ideas?

My two theories are that:

  1. The Intel compiler has a bug (or some compilation flag isn't optimal), or
  2. Some windows have been closed prior to executing gpend

What is the output of ipcs -qs when this happens?

I will add "ipcs -qs" to the things I look at the next time this happens. Any commands I should run?
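One way that check could be scripted is sketched below. It is only a sketch: it assumes GNU coreutils `timeout` is installed and that the command being wrapped is on the PATH. The function name `run_with_ipc_diagnostics` and the 60-second limit in the usage line are hypothetical and not part of GEMPAK. The helper runs a command under a time limit and, if the limit is hit, prints the `ipcs -qs` output and any gplt processes so the state can be inspected later.

```shell
#!/bin/sh
# Hypothetical helper (not part of GEMPAK): run a command under a time
# limit and dump IPC state if it appears to hang.
# Assumes GNU coreutils `timeout`, which exits with status 124 on expiry.
run_with_ipc_diagnostics() {
    limit=$1
    shift
    rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "command timed out after ${limit}s: $*" >&2
        # Message queues and semaphores, as asked for in this thread.
        if command -v ipcs >/dev/null 2>&1; then
            ipcs -qs >&2 || true
        fi
        # Is there a gplt process for gpend to talk to?
        if command -v pgrep >/dev/null 2>&1; then
            pgrep -l gplt >&2 || echo "no gplt process found" >&2
        fi
    fi
    return "$rc"
}
```

It could be called as `run_with_ipc_diagnostics 60 gpend`. The exit status of the wrapped command is passed through, so a calling job script can still tell a timeout (124) from a normal exit.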

I just checked one of the nodes where these hangs have been occurring and found a bunch of orphan IPCs. I'll see about adding an ipcrm to our node cleanup.
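A node-cleanup step along those lines might look like the sketch below. It assumes the util-linux `ipcs -q` column layout (key, msqid, owner, ...) and handles message queues only. The function names `list_user_queues` and `remove_user_queues` are hypothetical. Removing queues that a live GEMPAK session still uses would break that session, so this is only appropriate once nothing for that user is running on the node.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -q` output on stdin and print the msqid
# of each message queue owned by the given user.
# Assumes the util-linux column layout:
#   key  msqid  owner  perms  used-bytes  messages
list_user_queues() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}

# Remove every message queue owned by the given user (default: current user).
# Only safe when no GEMPAK session for that user is still running on the node.
remove_user_queues() {
    user=${1:-$(id -un)}
    ipcs -q | list_user_queues "$user" | while read -r id; do
        ipcrm -q "$id"
    done
}
```

Running `ipcs -q | list_user_queues "$(id -un)"` first shows which queue IDs `remove_user_queues` would delete. Orphaned semaphores would presumably need the same treatment using `ipcs -s` and `ipcrm -s`.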

Any commands I should run?

cleanup -c (which in turn runs ipcrm) is another useful command to have in your arsenal.