iconara / rubydoop

Write Hadoop jobs in JRuby

Rubydoop fails with Hadoop 2.0.4

deivinsontejeda opened this issue · comments

Hi @iconara

First of all, amazing job with this gem. Kudos, man.

I'm here to report that Hadoop 2.0.4 changed the MapReduce API, and Rubydoop fails when running on it.

Here is the trace:

java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:399)
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at rubydoop.InstanceContainer.setup(InstanceContainer.java:40)
    at rubydoop.MapperProxy.setup(MapperProxy.java:25)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at rubydoop.MapperProxy.run(MapperProxy.java:17)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:231)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)

Basically.

As you may know, Hadoop 2.0.x comes with great advanced features (like YARN), so support for it would be great.

commented

Thanks for reporting this. With that change I guess it can be difficult to support both 1.0.3 and 2.0.0. If you want to make a patch I would welcome it, otherwise you'll have to wait for Amazon to support Hadoop 2.0 in EMR -- running on EMR is basically my reason for creating and maintaining Rubydoop, but I'd appreciate any help I can get supporting other versions (I haven't even tested with CDH).

I implemented a simple workaround, which moves some code from InstanceContainer to MapperProxy and ReducerProxy, since Mapper.Context and Reducer.Context still seem to be classes in 2.0.0. With that, I was able to run the specs on 2.0.4 (and 1.0.3, obviously).

Feel free to try out the hadoop_204 branch (unfortunately, you'll have to build it yourself). https://github.com/iconara/rubydoop/commits/hadoop_204

@grddev Thank you, I'll test it and let you know how it goes.

Hi @grddev,

Are there steps I should follow to build rubydoop.jar?

Ignore my previous comment. I've just seen that there is a rake task in the project. Sorry for the noise.

commented

Check out the code, then run rake build, followed by gem build rubydoop.gemspec. You'll then have a rubydoop-vX.Y.Z.gem, which you can install with Rubygems: just put it in the working directory and run gem install rubydoop (Rubygems picks .gem files in the working directory over gems on rubygems.org).

You probably want to change lib/rubydoop/version.rb to not mix up the version with an official release.

@iconara Thanks a lot.

@iconara Do you have an example of the .classpath file that you are using?

commented

I do this:

hadoop classpath > .classpath
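
Putting the steps from this thread together, the build could be sketched roughly as below. This is only a recap under assumptions, not an official procedure: the function name build_rubydoop_branch is made up for illustration, the clone URL is inferred from the branch link above, and running hadoop classpath before rake build (on the guess that the build reads .classpath) is an assumption the thread does not confirm. It also assumes git, JRuby (with rake and gem) and hadoop are on the PATH.

```shell
# Hypothetical helper: nothing runs until the function is called.
# The body is a subshell, so the caller's working directory is not changed.
build_rubydoop_branch() (
  # Branch name taken from the link earlier in the thread.
  branch="${1:-hadoop_204}"

  git clone https://github.com/iconara/rubydoop.git &&
  cd rubydoop &&
  git checkout "$branch" &&
  # Assumption: the build picks up the Hadoop jars from this file.
  hadoop classpath > .classpath &&
  # Build the gem from source.
  rake build &&
  gem build rubydoop.gemspec &&
  # Rubygems prefers a .gem file in the working directory over rubygems.org.
  gem install rubydoop
)
```

Calling build_rubydoop_branch with no argument builds the hadoop_204 branch. As suggested above, editing lib/rubydoop/version.rb before building keeps the result from being confused with an official release.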

Finally, I was able to build Rubydoop with support for Hadoop 2.0.x, and it works fine. Kudos, guys 👯

I'll publish a post with the recipe, in case someone else needs to do this.