Permissions of the torque root cgroup are wrong for interactive jobs
itkovian opened this issue
Hi,
In case of an interactive job, we see
drwx------ 3 root root 0 Jun 9 16:53 torque
in the memory cgroup if for some reason the torque cgroup has disappeared. Some issues regarding this were fixed in #367, but I did not check this for interactive jobs.
Kind regards,
-- Andy
Andy,
I am running from 6.0-dev and do not see the same permissions you have
listed. Can you give more information about the way the job was called?
Ken
Ken Nielson Sr. Software Engineer
Hi Ken,
Job was submitted with
vsc40075@gligar01 (banette) ~> qsub -l walltime=10:00 -l nodes=1:ppn=1 -I -l vmem=1g
qsub: waiting for job 306.master23.banette.gent.vsc to start
qsub: job 306.master23.banette.gent.vsc ready
So that's with -I (interactive, not clear from the font when I'm typing this)
Situation on the node prior to submitting a job.
[root@node2801 memory]# pwd
/sys/fs/cgroup/memory
[root@node2801 memory]# ls -l torque
ls: cannot access torque: No such file or directory
[root@node2801 memory]# systemctl status pbs_mom
● pbs_mom.service - TORQUE pbs_mom daemon
Loaded: loaded (/usr/lib/systemd/system/pbs_mom.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/pbs_mom.service.d
└─quattor.conf
Active: active (running) since Thu 2016-06-09 21:18:14 CEST; 12h ago
Main PID: 29057 (pbs_mom)
CGroup: /system.slice/pbs_mom.service
└─29057 /usr/sbin/pbs_mom -d /var/spool/pbs -H node2801.banette.gent.vsc
Jun 09 21:18:14 node2801.banette.os systemd[1]: Starting TORQUE pbs_mom daemon...
Jun 09 21:18:14 node2801.banette.os systemd[1]: Started TORQUE pbs_mom daemon.
Situation after job started:
[root@node2801 memory]# ps faux
<snip>
root 26071 0.0 1.0 131704 42312 pts/1 Ss 09:56 0:00 \_ /usr/sbin/pbs_mom -d /var/spool/pbs -H node2801.b
root 26072 0.0 1.0 131704 41324 pts/1 S 09:56 0:00 \_ /usr/sbin/pbs_mom -d /var/spool/pbs -H node28
vsc40075 26279 0.1 0.0 24756 2388 pts/1 S+ 09:56 0:00 \_ -bash
[root@node2801 memory]# ls -ld torque
drwx------ 3 root root 0 Jun 10 09:56 torque
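To repeat this check on other nodes, here is a minimal sketch of a helper that prints the mode and owner of a cgroup directory and flags the `drwx------` case shown above. The function name `check_cgroup_dir` and its return codes are made up for illustration and are not part of torque. It assumes GNU `stat` and treats any mode ending in `00` as "restricted". What the correct mode should be is not stated in this thread, so the helper only reports what it finds.

```shell
#!/bin/sh
# Hypothetical helper (not part of torque): report the mode and owner of a
# cgroup directory and flag it when group and others have no access at all.
# Return codes (illustrative): 0 = accessible, 1 = restricted, 2 = missing.
check_cgroup_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    # GNU stat: %a is the octal mode, %U:%G the owning user and group.
    mode=$(stat -c '%a' "$dir")
    owner=$(stat -c '%U:%G' "$dir")
    echo "$dir mode=$mode owner=$owner"
    case "$mode" in
        # Octal mode ending in 00, e.g. 700 (drwx------).
        *00)
            echo "restricted: group and others have no access"
            return 1
            ;;
    esac
    return 0
}
```

On the node above this would be called as `check_cgroup_dir /sys/fs/cgroup/memory/torque`, once before submitting the interactive job and once after it has started.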
With torque 6.0-dev, commit 24827b8, I cannot reproduce this behaviour. However, I could not reproduce it with the versions I used above either, so I am not sure what changed. Should it pop up again, I'll let you know.
Andy,
Thanks for the help. Glad to know that, one way or another, the issue is resolved.
Ken