every job stderr returns "permission denied"
wghilliard opened this issue · comments
Hello, I have installed Torque from source, both with and without the PAM module, and every time a non-root user submits a job, the stderr output file contains some variant of the following:
$USER@mybox: qsub myscript.sh
-bash: line 1: /var/spool/torque/mom_priv/jobs/0.mybox.SC: Permission denied
$USER@mybox: cat myscript.sh
echo 'hello world'
$USER@mybox: stat myscript.sh
File: 'myscript.sh'
Size: 25 Blocks: 1 IO Block: 512 regular file
Device: 2bh/43d Inode: 12718771 Links: 1
Access: (0777/-rwxrwxrwx) Uid: ( 1000/$USER) Gid: ( 1000/$USER)
Access: 2016-11-07 13:58:41.096290480 -0600
Modify: 2016-11-07 11:50:04.890717155 -0600
Change: 2016-11-07 11:50:04.994712220 -0600
Birth: -
Torque Version:
root@mybox:~# pbs_server --version
Version: 6.0.2
Commit: d9a34839a0f975d5c487bbfcf5dcb10b6a8f1e79
./configure output:
Building components: server=yes mom=yes clients=yes gui=no drmaa=no pam=yes
PBS Machine type : linux
Remote copy : /usr/bin/scp -rpB
PBS home : /var/spool/torque
Default server : mybox
Unix Domain sockets :
Linux cpusets : no
Tcl : disabled
Tk : disabled
Authentication : classic (pbs_iff)
OS info
root@mybox:~# uname -a
Linux mybox 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Something must be wrong with the permissions on the machine. What are the permissions set to on /var/spool/torque/mom_priv/jobs/ on that machine? Can you run interactive jobs?
root@mybox:/var/spool/torque# stat ./mom_priv/jobs/
File: './mom_priv/jobs/'
Size: 2 Blocks: 1 IO Block: 131072 directory
Device: 2eh/46d Inode: 60 Links: 2
Access: (0751/drwxr-x--x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2016-11-11 16:14:31.270695877 -0600
Modify: 2016-11-11 16:14:47.781986313 -0600
Change: 2016-11-11 16:14:47.781986313 -0600
Birth: -
Yes! I can run interactively:
$USER@mybox:~$ qsub -I
qsub: waiting for job 6.mybox to start
qsub: job 6.mybox ready
$USER@mybox:~$
It seems that the permissions issue is just around the job script on the mom. Did you check the permissions on that directory and any files in it?
There are currently no files in the $TORQUE_HOME/mom_priv/jobs directory, but when I place a file there with my user as the owner, I cannot execute the script directly; it only runs when I invoke sh or bash and pass the file as an argument:
$USER@mybox:~$ /var/spool/torque/mom_priv/jobs/myscript.sh
bash: /var/spool/torque/mom_priv/jobs/myscript.sh: Permission denied
$USER@mybox:~$ sh /var/spool/torque/mom_priv/jobs/myscript.sh
hello friend
$USER@mybox:~$ ls -l /var/spool/torque/mom_priv/jobs/
ls: cannot open directory '/var/spool/torque/mom_priv/jobs/': Permission denied
$USER@mybox:~$ sudo !!
sudo ls -l /var/spool/torque/mom_priv/jobs/
total 1
-rwxr-xr-x 1 $USER root 37 Nov 11 16:38 myscript.sh
$USER@mybox:~$ ls -l /var/spool/torque/mom_priv/
ls: cannot open directory '/var/spool/torque/mom_priv/': Permission denied
$USER@mybox:~$ sudo !!
sudo ls -l /var/spool/torque/mom_priv/
total 2
-rw-r--r-- 1 root root 25 Nov 11 16:14 config
-rw-r--r-- 1 root root 23 Nov 11 16:13 config~
drwxr-x--x 2 root root 3 Nov 11 16:45 jobs
-rw-r--r-- 1 root root 7 Nov 11 16:14 mom.lock
$USER@mybox:~$
Am I making some obvious Linux mistake??
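For anyone checking the same thing, here is a minimal sketch that prints the mode and owner of every directory on the way to a job script, since a missing search (x) bit on any parent directory also yields "Permission denied". The function name walk_perms is made up for illustration, and it assumes bash and GNU coreutils stat (the -c format option), as found on the Ubuntu box above.

```shell
#!/usr/bin/env bash
# walk_perms is a hypothetical helper name, not part of Torque.
# Print mode, owner:group and name for each component of an absolute path.
# Assumes GNU coreutils stat (-c FORMAT).
walk_perms() {
    local path=$1 cur="" part
    stat -c '%A %U:%G %n' /
    # Split the path on "/" only; empty fields (leading slash) are skipped.
    local IFS=/
    for part in $path; do
        [ -n "$part" ] || continue
        cur="$cur/$part"
        stat -c '%A %U:%G %n' "$cur"
    done
}
```

Running `walk_perms /var/spool/torque/mom_priv/jobs` as the submitting user shows whether each level can be traversed. It only needs search permission on the parent directories, so it works even where `ls` on the directory itself is denied.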
It looks like your permissions are okay for the jobs directory. When I run a job, the permissions on the job script are:
-rwx------ 1
What are the permissions on your job script when you get the error?
Hey, so this is a stat of the temp file Torque creates when the job is submitted. Is that the job script you are referring to?
root@mybox:~# stat /var/spool/torque/mom_priv/jobs/11.mybox.SC
File: '/var/spool/torque/mom_priv/jobs/11.mybox.SC'
Size: 25 Blocks: 1 IO Block: 512 regular file
Device: 2eh/46d Inode: 516 Links: 1
Access: (0700/-rwx------) Uid: ( 1000/$USER) Gid: ( 1000/$USER)
Access: 2016-11-14 12:55:05.699956187 -0600
Modify: 2016-11-14 12:55:05.699956187 -0600
Change: 2016-11-14 12:55:05.703956019 -0600
Birth: -
It's the file with the .SC extension that you need to look at.
I messed up the filename consistency while anonymizing the data; I've updated my previous message to reflect the correct filename.
Ok, those permissions look correct. I'm really not sure what can be happening on your system, but it has to be something around the permissions the job needs. I cannot reproduce this bug.
I would check /proc/mounts and make sure the mount flags affecting /var/spool/torque don't include "noexec" or similar that would inhibit direct execution (e.g., /path/to/foo) but permit directory traversal and file reading (e.g., sh /path/to/foo).
Michael