jupyter-incubator / sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

[BUG] Spark Job Progress widget not working after %%configure -f

rankol opened this issue

Describe the bug

I am using a PySpark kernel inside a Jupyter notebook attached to an EMR cluster. When I first start my PySpark session and run some code to calculate Pi I am able to see a widget that indicates the progress of the submitted code on the EMR cluster.
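
For concreteness, the Pi job is the usual Monte Carlo estimate; a minimal sketch (any job that emits progress updates should do equally well):

    import random

    NUM_SAMPLES = 1000000

    def inside(_):
        # Sample a point in the unit square and test whether it lands
        # inside the quarter circle of radius 1.
        x, y = random.random(), random.random()
        return x * x + y * y < 1

    # sc is the SparkContext injected by the sparkmagic PySpark kernel.
    count = sc.parallelize(range(NUM_SAMPLES), 10).filter(inside).count()
    print("Pi is roughly", 4.0 * count / NUM_SAMPLES)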

When I attempt to re-configure the Spark session with the command below (as I understand it, the -f flag forces the session to be dropped and recreated), I am no longer able to see the widget properly.

%%configure -f
{"name": "remotesparkmagics",
 "executorMemory": "34490M",
 "driverMemory": "34490M",
 "conf": {"spark.pyspark.python": "python3",
          "spark.pyspark.virtualenv.enabled": "true",
          "spark.pyspark.virtualenv.type": "native",
          "spark.pyspark.virtualenv.bin.path": "/usr/bin/virtualenv"}}

The error message is:

There was an error connecting to your cluster: JSONDecodeError('Expecting value: line 1 column 1 (char 0)')
This is most often caused by an attempt to monitor a session that is no longer active.
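
If it helps with triage: the session state can be confirmed outside sparkmagic by querying Livy's REST API (GET /sessions). A minimal sketch, assuming Livy is reachable on the EMR master node at its default port 8998 (the hostname below is a placeholder):

    import requests

    # Placeholder endpoint: Livy's default port on the EMR master node.
    LIVY_URL = "http://<emr-master-node>:8998"

    resp = requests.get(LIVY_URL + "/sessions")
    resp.raise_for_status()
    for session in resp.json()["sessions"]:
        # "idle" or "busy" means the session is still monitorable;
        # "dead" would be consistent with the error above.
        print(session["id"], session["state"])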

I have been able to work around this by running %%cleanup -f and then something such as sc.list_packages() to start a new session (see the cells sketched below). After that I can run my Pi code again and the Spark Job Progress widget works properly.
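
Concretely, the workaround is two cells. The first force-deletes the current Livy session:

    %%cleanup -f

The second starts a fresh session; on EMR, sc.list_packages() is a cheap call for this, though presumably any statement sent to the cluster would do:

    sc.list_packages()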

To Reproduce

  1. Start a Spark session
  2. Run code to calculate Pi using Spark (the progress widget appears)
  3. Re-configure the Spark session with %%configure -f
  4. Run the Pi code again (the widget fails with the error above)
  5. Run %%cleanup -f
  6. Run sc.list_packages() to start a new Spark session
  7. Run the Pi code again (the widget works)

Expected behavior
I would like the widget to keep working after %%configure -f without needing to clean up my old session first, if that is possible.

Screenshots
I currently can't upload my screenshots.

Versions:

  • SparkMagic (unsure)
  • Livy 0.7.1
  • Spark 3.1.2