Koalas on JDK 11 raises `java.lang.UnsupportedOperationException`
ashwin153 opened this issue · comments
Ashwin Madavan commented
- PyArrow + PySpark on JDK 11 raises `java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available`.
- According to https://stackoverflow.com/a/62625252 this can be resolved by setting `spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"` and `spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"` in the SparkContext. I reproduced the problem and verified this solution on Spark / PySpark `3.0.2`, Koalas `1.5.0`, and openjdk `11.0.10`.
- Because I would imagine this to be a relatively common configuration (Java 11 is the `default-jdk` on Ubuntu 20.04 LTS, and Spark 3 is the latest version), I propose adding this configuration to the `default_session`. If there is a way to detect the JDK version from Python, then this additional configuration could be applied conditionally, depending on the affected `LooseVersion(pyarrow.__version__)`, `LooseVersion(pyspark.__version__)`, and JDK versions.
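For reference, the workaround above can be applied when building the session. A minimal sketch (assuming a local PySpark install; not runnable without Spark on the machine):

```python
# Sketch: start a SparkSession with the Netty reflection workaround applied,
# so PyArrow-backed operations work on JDK 9+.
from pyspark.sql import SparkSession

netty_flag = "-Dio.netty.tryReflectionSetAccessible=true"

spark = (
    SparkSession.builder
    .config("spark.driver.extraJavaOptions", netty_flag)
    .config("spark.executor.extraJavaOptions", netty_flag)
    .getOrCreate()
)
```

Note that `extraJavaOptions` must be set before the JVM starts, so this only takes effect on a fresh session (or via `spark-defaults.conf` / `spark-submit --conf`), not on an already-running one.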
Hyukjin Kwon commented
Sure, that makes sense.
Ashwin Madavan commented
Do you know how to get the JDK version? If so, I'd be happy to put up a PR for this change.
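One possible approach, sketched below: shell out to `java -version` and parse the major version from the banner (which is printed to stderr). The helper names here are hypothetical, not existing Koalas APIs, and pre-JDK 9 releases report themselves as `1.x`, which the parser has to account for:

```python
import re
import subprocess

def parse_java_major(banner):
    """Extract the major JDK version from a `java -version` banner, or None."""
    m = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if not m:
        return None
    major = int(m.group(1))
    # Pre-JDK 9 versions report as 1.x, e.g. 'java version "1.8.0_292"'.
    if major == 1 and m.group(2):
        major = int(m.group(2))
    return major

def jdk_major_version():
    """Best-effort JDK detection; returns None if `java` is not on PATH."""
    try:
        proc = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    return parse_java_major(proc.stderr)  # java -version writes to stderr
```

Inside an already-started session one could instead query the driver JVM directly with `spark._jvm.java.lang.System.getProperty("java.version")`, though that relies on a private attribute and starts the JVM before the config can be applied, so the subprocess route seems more useful here.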