Cannot convert Spark event log to SparkLens report (error is null)
jrj-d opened this issue
Hi, thanks for the work on Sparklens!
I'm trying to convert a historical Spark event log file into a SparkLens report with the following command:
spark-submit --packages qubole:sparklens:0.2.1-s_2.11 --class com.qubole.sparklens.app.EventHistoryToSparklensJson dummy-arg spark-event-log-file report.json
where spark-event-log-file is an uncompressed Spark event log file.
But I get the following error:
Converting Event History files to Sparklens Json files
src: /Users/jrjd/spark-event-log-file destination: /Users/jrjd/report.json
Failed to process file: /Users/jrjd/spark-event-log-file error: null
19/06/18 11:24:18 INFO ShutdownHookManager: Shutdown hook called
19/06/18 11:24:18 INFO ShutdownHookManager: Deleting directory /private/var/folders/8z/fvg7d2fd7td98rzbvs7mjvzr0000gn/T/spark-0db82d35-d70a-40ce-ad50-3cfb03395168
Here is my version of Spark:
Welcome to Spark version 2.4.0
Using Scala version 2.11.12, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_25
By any chance, do you have any idea what is happening? I have a hard time debugging because the error is null. Thanks!
@jrj-d Thanks for reporting. Would you be able to share the event history file to debug this?
Also, have you tried:
./bin/spark-submit --packages qubole:sparklens:0.3.0-s_2.11 --class com.qubole.sparklens.app.ReporterApp qubole-dummy-arg source=history
This will also create the sparklens.json file in /tmp/sparklens.
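For readers following along: the command above takes the event log file as an argument before source=history (the path below is a placeholder, not one from this thread). A sketch of a full invocation, assuming a local Spark installation:

```shell
# Replay a historical event log through Sparklens (0.3.0, Scala 2.11 build).
# /path/to/spark-event-log-file is a placeholder for your own log file.
./bin/spark-submit \
  --packages qubole:sparklens:0.3.0-s_2.11 \
  --class com.qubole.sparklens.app.ReporterApp \
  qubole-dummy-arg /path/to/spark-event-log-file source=history
```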
Thanks for your quick answers!
@iamrohit when I try the new command, I get:
Exception in thread "main" java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse(Lorg/json4s/JsonInput;Z)Lorg/json4s/JsonAST$JValue;
at com.qubole.sparklens.app.EventHistoryReporter.com$qubole$sparklens$app$EventHistoryReporter$$getFilter(EventHistoryReporter.scala:71)
at com.qubole.sparklens.app.EventHistoryReporter$$anonfun$2.apply(EventHistoryReporter.scala:44)
at com.qubole.sparklens.app.EventHistoryReporter$$anonfun$2.apply(EventHistoryReporter.scala:44)
at org.apache.spark.scheduler.ReplayListenerBus$$anonfun$1.apply(ReplayListenerBus.scala:78)
at org.apache.spark.scheduler.ReplayListenerBus$$anonfun$1.apply(ReplayListenerBus.scala:78)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:464)
at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:80)
at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.qubole.sparklens.app.EventHistoryReporter.<init>(EventHistoryReporter.scala:43)
at com.qubole.sparklens.app.ReporterApp$.parseInput(ReporterApp.scala:54)
at com.qubole.sparklens.app.ReporterApp$.delayedEndpoint$com$qubole$sparklens$app$ReporterApp$1(ReporterApp.scala:27)
at com.qubole.sparklens.app.ReporterApp$delayedInit$body.apply(ReporterApp.scala:20)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at com.qubole.sparklens.app.ReporterApp$.main(ReporterApp.scala:20)
at com.qubole.sparklens.app.ReporterApp.main(ReporterApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
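As an aside, a NoSuchMethodError on org.json4s.jackson.JsonMethods.parse usually indicates a binary incompatibility: the json4s version bundled with the Spark distribution differs from the one Sparklens was compiled against. One quick way to see which json4s jars Spark ships (assuming $SPARK_HOME points at the unpacked spark-2.4.0-bin-hadoop2.7 directory) is:

```shell
# List the json4s jars bundled with this Spark distribution
ls "$SPARK_HOME/jars" | grep json4s
```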
The Spark distribution I use is the one at https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz
@mayurdb the event history file I attached reproduces the issue.
application_1555318950368_0001.txt
I realized that the Spark version that produced the event history file is 2.3.2. Might this be the cause of the issue?
@jrj-d This is a known issue with Spark 2.4.0 when running against an event history file. We will fix it in the next release. To unblock yourself, can you use an older Spark version, such as 2.3.2 or 2.3.0? It should work with those.
Thanks for the help! I'll close the issue for now.