qubole / sparklens

Qubole Sparklens tool for performance tuning Apache Spark

Home Page: http://sparklens.qubole.com

Cannot convert Spark event log to SparkLens report (error is null)

jrj-d opened this issue

Hi, thanks for the work on Sparklens!
I'm trying to convert a historical Spark event log file into a SparkLens report with the following command:

spark-submit --packages qubole:sparklens:0.2.1-s_2.11 --class com.qubole.sparklens.app.EventHistoryToSparklensJson dummy-arg spark-event-log-file report.json

where spark-event-log-file is an uncompressed Spark event log file.
But I get the following error:

Converting Event History files to Sparklens Json files
src: /Users/jrjd/spark-event-log-file destination: /Users/jrjd/report.json
Failed to process file: /Users/jrjd/spark-event-log-file error: null
19/06/18 11:24:18 INFO ShutdownHookManager: Shutdown hook called
19/06/18 11:24:18 INFO ShutdownHookManager: Deleting directory /private/var/folders/8z/fvg7d2fd7td98rzbvs7mjvzr0000gn/T/spark-0db82d35-d70a-40ce-ad50-3cfb03395168

Here is my version of Spark:

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.0
      /_/

Using Scala version 2.11.12, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_25

By any chance, do you have any idea what is happening? I have a hard time debugging because the error message is just null. Thanks!
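
A quick sanity check on the input can help when the only diagnostic is error: null. An uncompressed Spark event log is newline-delimited JSON and should begin with a SparkListenerLogStart event, so inspecting the first line (using the path from the output above) tells you whether the file is readable and in the expected format:

head -n 1 /Users/jrjd/spark-event-log-file

If that line is not a JSON object containing "Event":"SparkListenerLogStart", the file is probably compressed, truncated, or not an event log at all.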

@jrj-d Thanks for reporting. Would you be able to share the event history file to debug this?

Also, have you tried:
./bin/spark-submit --packages qubole:sparklens:0.3.0-s_2.11 --class com.qubole.sparklens.app.ReporterApp qubole-dummy-arg source=history

This will also create the sparklens.json file in /tmp/sparklens.
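
The command above omits the event log path. Per the Sparklens README, this mode takes the file path as the first argument after qubole-dummy-arg, followed by source=history, so a full invocation should look roughly like the following (the path reuses the one from the original report; the argument order follows the README and may differ by version):

./bin/spark-submit --packages qubole:sparklens:0.3.0-s_2.11 --class com.qubole.sparklens.app.ReporterApp qubole-dummy-arg /Users/jrjd/spark-event-log-file source=history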

Thanks for your quick answers!

@iamrohit when I try the new command, I get:

Exception in thread "main" java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse(Lorg/json4s/JsonInput;Z)Lorg/json4s/JsonAST$JValue;
	at com.qubole.sparklens.app.EventHistoryReporter.com$qubole$sparklens$app$EventHistoryReporter$$getFilter(EventHistoryReporter.scala:71)
	at com.qubole.sparklens.app.EventHistoryReporter$$anonfun$2.apply(EventHistoryReporter.scala:44)
	at com.qubole.sparklens.app.EventHistoryReporter$$anonfun$2.apply(EventHistoryReporter.scala:44)
	at org.apache.spark.scheduler.ReplayListenerBus$$anonfun$1.apply(ReplayListenerBus.scala:78)
	at org.apache.spark.scheduler.ReplayListenerBus$$anonfun$1.apply(ReplayListenerBus.scala:78)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:464)
	at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:80)
	at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at com.qubole.sparklens.app.EventHistoryReporter.<init>(EventHistoryReporter.scala:43)
	at com.qubole.sparklens.app.ReporterApp$.parseInput(ReporterApp.scala:54)
	at com.qubole.sparklens.app.ReporterApp$.delayedEndpoint$com$qubole$sparklens$app$ReporterApp$1(ReporterApp.scala:27)
	at com.qubole.sparklens.app.ReporterApp$delayedInit$body.apply(ReporterApp.scala:20)
	at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
	at scala.App$$anonfun$main$1.apply(App.scala:76)
	at scala.App$$anonfun$main$1.apply(App.scala:76)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
	at scala.App$class.main(App.scala:76)
	at com.qubole.sparklens.app.ReporterApp$.main(ReporterApp.scala:20)
	at com.qubole.sparklens.app.ReporterApp.main(ReporterApp.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The Spark distribution I use is the one at https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz
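
The NoSuchMethodError above usually points to a json4s binary mismatch: the two-argument JsonMethods.parse that Sparklens was compiled against exists in json4s 3.2.x (bundled with Spark 2.3.x) but not in the 3.5.x line that ships with Spark 2.4.0. One way to see which json4s a given distribution bundles (assuming SPARK_HOME points at the unpacked directory) is:

ls "$SPARK_HOME/jars" | grep json4s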

@mayurdb the event history file I attached reproduces the issue.
application_1555318950368_0001.txt

I realized that the Spark version that produced the event history file is 2.3.2. Might this be the cause of the issue?

@jrj-d This is a known issue with Spark 2.4.0 when running with the event history file. We will fix this in the next release. To unblock yourself, can you use an older Spark version such as 2.3.2 or 2.3.0? It should work with those!
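
Following that suggestion, one way to retry the original conversion is to run it with a Spark 2.3.x spark-submit. A sketch, assuming Spark 2.3.2 has been downloaded from https://archive.apache.org/dist/spark/spark-2.3.2/ and unpacked under the home directory (the local path below is an assumption):

~/spark-2.3.2-bin-hadoop2.7/bin/spark-submit --packages qubole:sparklens:0.2.1-s_2.11 --class com.qubole.sparklens.app.EventHistoryToSparklensJson dummy-arg spark-event-log-file report.json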

Thanks for the help! I'll close the issue for now.