vericast / spylon-kernel

Jupyter kernel for Scala and Spark


[BUG]: Spark submit fails: No such file or directory: '/opt/spark/python/pyspark/./bin/spark-submit'

x1linwang opened this issue

Describe the bug

I'm trying to run Spark in a Jupyter notebook using spylon-kernel, but when I try to run any code it just gets stuck at "Initializing scala interpreter...". The error from the Ubuntu terminal (I'm a Windows user running Spark in WSL, Ubuntu 18.04) is attached below.

To Reproduce

Steps to reproduce the behavior:

  1. Install Anaconda3-2021.11-Linux-x86_64, Java 8 (openjdk-8-jdk), Spark 3.2.0, and spylon-kernel following the steps in the attached file: spark installation instructions for Windows users.pdf
  2. Open a Jupyter notebook and run sc.version (or any other code)
  3. Observe that it is stuck at "Initializing scala interpreter..."
  4. Go to the Ubuntu 18.04 terminal and see the error; the same failure can also be reproduced outside Jupyter (see the sketch after this list)
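
For reference, the failure can be reproduced outside Jupyter with the same Python that backs the kernel. This is a minimal sketch, assuming pyspark 3.2.0 from /opt/spark/python is importable (e.g. via PYTHONPATH); creating a SparkContext directly goes through the same launch path that spylon-kernel uses:

# Minimal reproduction sketch (assumption: pyspark from /opt/spark/python is on
# sys.path, e.g. via PYTHONPATH). Creating a SparkContext triggers the same
# spark-submit launch that spylon-kernel performs while initializing.
import pyspark

sc = pyspark.SparkContext(appName="spylon-kernel-repro")
print(sc.version)  # should print 3.2.0 once the environment is set up correctly
sc.stop()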

I used the following to set up the Spark environment variables (Spark is installed in /opt/spark and my Python path is the Anaconda3 Python path):

echo "export SPARK_HOME=/opt/spark" >> ~/.profile
echo "export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin" >> ~/.profile
echo "export PYSPARK_PYTHON=/home/lai/anaconda3/bin/python" >> ~/.profile 
source ~/.profile
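
As a sanity check (my own sketch; the expected values are assumptions based on the setup above), the following can be run with the same Anaconda Python that backs the notebook server, to confirm what the Jupyter/kernel process actually inherits:

# Sanity-check sketch: print the environment the Jupyter server / kernel inherits.
# Expected values (assumptions based on the setup above) are noted in the comments.
import os
import shutil

print("SPARK_HOME     =", os.environ.get("SPARK_HOME"))      # expected: /opt/spark
print("PYSPARK_PYTHON =", os.environ.get("PYSPARK_PYTHON"))  # expected: the Anaconda3 python
print("spark-submit   =", shutil.which("spark-submit"))      # expected: /opt/spark/bin/spark-submit

If SPARK_HOME comes back empty here, pyspark falls back to locating Spark relative to the pyspark package directory, which matches the path in the error below.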

Expected behavior
I expect the Scala interpreter to initialize without problems, and sc.version should output 3.2.0.

Screenshots
A screenshot from the Jupyter notebook (cell stuck at "Initializing scala interpreter...")

The error from the Ubuntu terminal is in the Additional context section.

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: Chrome (for jupyter notebook)
  • Version: Java 8, Python 3.9, Spark 3.2.0, Hadoop 3.2

Additional context
The full error output is as follows:

[MetaKernelApp] ERROR | Exception in message handler:
Traceback (most recent call last):
  File "/home/lai/anaconda3/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 353, in dispatch_shell
    await result
  File "/home/lai/anaconda3/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 643, in execute_request
    reply_content = self.do_execute(
  File "/home/lai/anaconda3/lib/python3.9/site-packages/metakernel/_metakernel.py", line 397, in do_execute
    retval = self.do_execute_direct(code)
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_kernel.py", line 141, in do_execute_direct
    res = self._scalamagic.eval(code.strip(), raw=False)
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_magic.py", line 157, in eval
    intp = self._get_scala_interpreter()
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_magic.py", line 46, in _get_scala_interpreter
    self._interp = get_scala_interpreter()
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_interpreter.py", line 568, in get_scala_interpreter
    scala_intp = initialize_scala_interpreter()
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_interpreter.py", line 163, in initialize_scala_interpreter
    spark_session, spark_jvm_helpers, spark_jvm_proc = init_spark()
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_interpreter.py", line 99, in init_spark
    spark_context = conf.spark_context(application_name)
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon/spark/launcher.py", line 521, in spark_context
    return pyspark.SparkContext(appName=application_name, conf=spark_conf)
  File "/opt/spark/python/pyspark/context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/opt/spark/python/pyspark/context.py", line 339, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/opt/spark/python/pyspark/java_gateway.py", line 98, in launch_gateway
    proc = Popen(command, **popen_kwargs)
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_interpreter.py", line 94, in Popen
    spark_jvm_proc = subprocess.Popen(*args, **kwargs)
  File "/home/lai/anaconda3/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/lai/anaconda3/lib/python3.9/subprocess.py", line 1821, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/opt/spark/python/pyspark/./bin/spark-submit'
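
The path in the last line suggests that SPARK_HOME was not visible to the kernel process: pyspark's launch_gateway joins the resolved Spark home with the relative script name "./bin/spark-submit", and when the SPARK_HOME environment variable is unset it falls back to the directory of the installed pyspark package. Here is a small sketch of that resolution (it uses pyspark's private _find_spark_home helper, so this is illustrative rather than a supported API):

# Illustrative sketch of how pyspark 3.2.0 builds the spark-submit path in
# launch_gateway: the Spark home is resolved first, then joined with the
# relative script name "./bin/spark-submit".
import os
from pyspark.find_spark_home import _find_spark_home  # private helper, shown for illustration

spark_home = _find_spark_home()  # SPARK_HOME from the environment if set, else the pyspark package dir
print(os.path.join(spark_home, "./bin/spark-submit"))
# Without SPARK_HOME in the environment this prints
# /opt/spark/python/pyspark/./bin/spark-submit, matching the error above.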

Thanks for your help!