adaptivecomputing / torque

Torque Repository

Permissions of the torque root cgroup are wrong for interactive jobs

itkovian opened this issue

Hi,

In case of an interactive job, we see

drwx------ 3 root root 0 Jun 9 16:53 torque

in the memory cgroup if for some reason the torque cgroup has disappeared. Some issues regarding this were fixed in #367, but I did not check this for interactive jobs.

Kind regards,
-- Andy

Andy,

I am running from 6.0-dev and do not see the same permissions you have
listed. Can you give more information about the way the job was called?

Ken

Hi Ken,

The job was submitted with:

vsc40075@gligar01 (banette) ~> qsub -l walltime=10:00 -l nodes=1:ppn=1 -I -l vmem=1g
qsub: waiting for job 306.master23.banette.gent.vsc to start
qsub: job 306.master23.banette.gent.vsc ready

Note that the flag is -I (capital i, for interactive); the font makes that hard to tell as I type this.

Situation on the node prior to submitting the job:

[root@node2801 memory]# pwd
/sys/fs/cgroup/memory
[root@node2801 memory]# ls -l torque
ls: cannot access torque: No such file or directory
[root@node2801 memory]# systemctl status pbs_mom
● pbs_mom.service - TORQUE pbs_mom daemon
   Loaded: loaded (/usr/lib/systemd/system/pbs_mom.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/pbs_mom.service.d
           └─quattor.conf
   Active: active (running) since Thu 2016-06-09 21:18:14 CEST; 12h ago
 Main PID: 29057 (pbs_mom)
   CGroup: /system.slice/pbs_mom.service
           └─29057 /usr/sbin/pbs_mom -d /var/spool/pbs -H node2801.banette.gent.vsc

Jun 09 21:18:14 node2801.banette.os systemd[1]: Starting TORQUE pbs_mom daemon...
Jun 09 21:18:14 node2801.banette.os systemd[1]: Started TORQUE pbs_mom daemon.

Situation after job started:

[root@node2801 memory]# ps faux
<snip>
root     26071  0.0  1.0 131704 42312 pts/1    Ss   09:56   0:00  \_ /usr/sbin/pbs_mom -d /var/spool/pbs -H node2801.b
root     26072  0.0  1.0 131704 41324 pts/1    S    09:56   0:00      \_ /usr/sbin/pbs_mom -d /var/spool/pbs -H node28
vsc40075 26279  0.1  0.0  24756  2388 pts/1    S+   09:56   0:00      \_ -bash
[root@node2801 memory]# ls -ld torque
drwx------ 3 root root 0 Jun 10 09:56 torque
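
For anyone who wants to check a node for the same condition, here is a minimal sketch that reports the mode of the torque directory under several cgroup controllers. It assumes a cgroup v1 hierarchy mounted under /sys/fs/cgroup (as in the session above) and GNU coreutils `stat`. The helper name `dir_mode`, the `CGROUP_ROOT` variable, and the controller list are illustrative choices, not something taken from the TORQUE sources.

```shell
#!/bin/sh
# Hypothetical helper: print the octal permission bits of a directory,
# or the word "missing" when the directory does not exist.
# Assumes GNU coreutils stat, which supports -c '%a'.
dir_mode() {
    if [ -d "$1" ]; then
        stat -c '%a' "$1"
    else
        echo "missing"
    fi
}

# Assumed cgroup v1 mount point; override CGROUP_ROOT if yours differs.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup}"

# Illustrative controller list; adjust to the controllers used on your nodes.
for ctrl in memory cpuset cpuacct devices; do
    printf '%s/torque: %s\n' "$ctrl" "$(dir_mode "$CGROUP_ROOT/$ctrl/torque")"
done
```

Running it before and after submitting an interactive job should show whether the directory went from "missing" to a root-only mode such as 700, which is the state shown by `ls -ld` above.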

In Torque 6.0-dev, commit 24827b8, I cannot reproduce this behaviour. However, I could not reproduce it with the versions I used above either, so I am not sure what changed. Should it pop up again, I'll let you know.

Andy,

Thanks for the help. Glad to know that, one way or another, the issue is resolved.

Ken