RevolutionAnalytics / RHadoop

Home Page: https://github.com/RevolutionAnalytics/RHadoop/wiki

Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, : hadoop streaming failed with error code 1

RimaSahl opened this issue

Hi there,
I have installed Hadoop 2.7.2 on Ubuntu 16.04, and I have also installed RStudio and RHadoop (rmr2, rhdfs, rhbase) on a single-node cluster. The RHadoop packages are installed in this directory: "/home/hduser/R/x86_64-pc-linux-gnu-library/3.2/". However, when I run a simple example, Hadoop streaming fails with an error.
Can anyone please help me out? More detail is below:

out<-mapreduce(input = small.ints, map=function(k,v) keyval(v,v^2))
packageJobJar: [/tmp/hadoop-unjar3635253007512617329/] [] /tmp/streamjob2173897990252478106.jar tmpDir=null
16/06/18 17:13:03 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
16/06/18 17:13:04 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
16/06/18 17:13:05 INFO mapred.FileInputFormat: Total input paths to process : 1
16/06/18 17:13:05 INFO mapreduce.JobSubmitter: number of splits:2
16/06/18 17:13:05 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/06/18 17:13:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1466241060737_0001
16/06/18 17:13:06 INFO impl.YarnClientImpl: Submitted application application_1466241060737_0001
16/06/18 17:13:07 INFO mapreduce.Job: The url to track the job: http://amir-Inspiron-3521:8088/proxy/application_1466241060737_0001/
16/06/18 17:13:07 INFO mapreduce.Job: Running job: job_1466241060737_0001
16/06/18 17:13:12 INFO mapreduce.Job: Job job_1466241060737_0001 running in uber mode : false
16/06/18 17:13:12 INFO mapreduce.Job: map 0% reduce 0%
16/06/18 17:13:12 INFO mapreduce.Job: Job job_1466241060737_0001 failed with state FAILED due to: Application application_1466241060737_0001 failed 2 times due to AM Container for appattempt_1466241060737_0001_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://amir-Inspiron-3521:8088/cluster/app/application_1466241060737_0001Then, click on links to logs of each attempt.
Diagnostics: File file:/usr/local/hadoop/"/usr/local/hadoop_tmp"/nm-local-dir/usercache/hduser/appcache/application_1466241060737_0001/"/usr/local/hadoop_tmp"/nm-local-dir/usercache/hduser does not exist
Failing this attempt. Failing the application.
16/06/18 17:13:12 INFO mapreduce.Job: Counters: 0
16/06/18 17:13:12 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
hadoop streaming failed with error code 1

I also get this warning message whenever I load the "rmr2" package:

library("rmr2", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.2")
Please review your hadoop settings. See help(hadoop.settings)
Warning message:
S3 methods ‘gorder.default’, ‘gorder.factor’, ‘gorder.data.frame’, ‘gorder.matrix’, ‘gorder.raw’ were declared in NAMESPACE but not found

And here are all the environment variables:

Sys.getenv()
DISPLAY :0
EDITOR vi
GIT_ASKPASS rpostback-askpass
HADOOP_CMD /usr/local/hadoop/bin/hadoop
HADOOP_HOME /usr/local/hadoop
HADOOP_STREAMING /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.2.jar
HOME /home/hduser
LANG en_US.UTF-8
LD_LIBRARY_PATH /usr/lib/R/lib::/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/jre/lib/amd64/server:@JAVA_LD@
LN_S ln -s
LOGNAME hduser
MAKE make
PAGER /usr/bin/pager
PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
R_BROWSER xdg-open
R_BZIPCMD /bin/bzip2
R_DOC_DIR /usr/share/R/doc
R_GZIPCMD /bin/gzip -n
R_HOME /usr/lib/R
R_INCLUDE_DIR /usr/share/R/include
R_LIBS_SITE /usr/local/lib/R/site-library:/usr/lib/R/site-library:/usr/lib/R/library
R_LIBS_USER ~/R/x86_64-pc-linux-gnu-library/3.2
RMARKDOWN_MATHJAX_PATH /usr/lib/rstudio-server/resources/mathjax-23
R_PAPERSIZE letter
R_PAPERSIZE_USER a4
R_PDFVIEWER /usr/bin/xdg-open
R_PLATFORM x86_64-pc-linux-gnu
R_PRINTCMD /usr/bin/lpr
R_RD4PDF times,inconsolata,hyper
R_SESSION_TMPDIR /tmp/RtmpJ5Mpjt
R_SHARE_DIR /usr/share/R/share
RS_RPOSTBACK_PATH /usr/lib/rstudio-server/bin/rpostback
RSTUDIO 1
RSTUDIO_HTTP_REFERER http://127.0.0.1:8787/
RSTUDIO_PANDOC /usr/lib/rstudio-server/bin/pandoc
RSTUDIO_SESSION_STREAM hduser-d
RSTUDIO_USER_IDENTITY hduser
R_SYSTEM_ABI linux,gcc,gxx,gfortran,?
R_TEXI2DVICMD /usr/bin/texi2dvi
R_UNZIPCMD /usr/bin/unzip
R_ZIPCMD /usr/bin/zip
SED /bin/sed
SSH_ASKPASS rpostback-askpass
TAR /bin/tar
USER hduser