sequenceiq / docker-spark


Get permission denied when launching examples

gigq opened this issue

I get the following error when launching a clean Docker container and trying any of the examples:

bash-4.1# spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --driver-memory 1g --executor-memory 1g --executor-cores 1 ./lib/spark-examples-1.1.0-hadoop2.4.0.jar
...
14/11/06 18:16:22 INFO yarn.Client: Got cluster metric info from ResourceManager, number of NodeManagers: 1
14/11/06 18:16:22 INFO yarn.Client: Max mem capabililty of a single resource in this cluster 8192
14/11/06 18:16:22 INFO yarn.Client: Preparing Local resources
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=hdfs, access=WRITE, inode="/user":root:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
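
The root cause is visible in the exception itself: the job is submitted as user hdfs, but /user in HDFS is owned by root:supergroup with mode drwxr-xr-x, so hdfs cannot create its .sparkStaging directory under it. This can be confirmed from inside the container (assuming the hadoop client is on the PATH, as it is in the sandbox image):

bash-4.1# # show the owner and permissions of the top-level HDFS directories, including /user
bash-4.1# hadoop fs -ls /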

I just hit this myself and got past it by issuing the following after running the image:

bash-4.1# groupadd supergroup
bash-4.1# adduser hdfs -g supergroup

After that I was able to execute Spark jobs on the YARN cluster. I got the hint from this post.
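
If you apply the same workaround, a quick sanity check before resubmitting the job (a sketch; supergroup is the default value of HDFS's dfs.permissions.superusergroup setting):

bash-4.1# # confirm the hdfs user now resolves to the supergroup group
bash-4.1# id hdfs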

Can you pull the image again now, please? The ENV HADOOP_USER_NAME hdfs line is not needed in the sandbox Spark container, since everything runs as root. On a real cluster, where the NameNode is formatted as the hdfs user, it is needed, though.
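
For context, a minimal sketch of the two situations being contrasted (the exact Dockerfile contents are assumed here, not quoted from the repo):

# sandbox image: everything runs as root, so this Dockerfile line can be dropped
# ENV HADOOP_USER_NAME hdfs

# real cluster: make the Hadoop client act as the hdfs user for one shell session
bash-4.1# export HADOOP_USER_NAME=hdfs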

Closing: the fix has been applied, pushed to Docker.io for rebuild, and tested. Re-pulling or rebuilding the image should fix the issue.
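
For anyone following along, picking up the rebuilt image is a single pull on the host (the sequenceiq/spark:1.1.0 tag is an assumption based on the Spark version in the example jar; use whichever tag you run):

$ docker pull sequenceiq/spark:1.1.0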