to_datetime() error
franperezlopez opened this issue · comments
franperezlopez commented
I'm using the method koalas.to_datetime() to cast a string column to datetime. This is the code (with databricks.koalas imported as ks):
df = ks.DataFrame({'timestamp': ['2020-04-06', '2020-04-06']})
df.timestamp = ks.to_datetime(df.timestamp)
df.to_pandas()
Executing the second line (the ks.to_datetime() call), you get this warning:
/home/fran/anaconda3/envs/----/lib/python3.7/site-packages/pyspark/sql/pandas/functions.py:386: UserWarning: In Python 3.6+ and Spark 3.0+, it is preferred to specify type hints for pandas UDF instead of specifying pandas UDF type which will be deprecated in the future releases. See SPARK-28264 for more details.
"in the future releases. See SPARK-28264 for more details.", UserWarning)
Executing the third line (to_pandas()), an exception is thrown. Is this a bug, or should I change the way I use to_datetime()?
Py4JJavaError: An error occurred while calling o153.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 1 times, most recent failure: Lost task 3.0 in stage 0.0 (TID 3, 172.28.1.237, executor driver): java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
Hyukjin Kwon commented
The error:
sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
is likely caused by the JVM version and an Arrow issue. You will have to add -Dio.netty.tryReflectionSetAccessible=true to the JVM options. See also https://spark.apache.org/docs/3.0.0/index.html#downloading
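One way to apply the flag from Python is sketched below. This is a minimal sketch, not something taken from the thread: it assumes the session is created through PySpark's SparkSession.builder before any other Spark session exists in the process, and it uses the standard spark.driver.extraJavaOptions and spark.executor.extraJavaOptions settings. The helper names arrow_jvm_conf and build_session and the app name are hypothetical. The linked Spark 3.0.0 page lists this flag as an extra requirement for Arrow on Java 11.

```python
# Flag quoted in the comment above.
NETTY_FLAG = "-Dio.netty.tryReflectionSetAccessible=true"


def arrow_jvm_conf(flag=NETTY_FLAG):
    """Return Spark settings that pass the flag to the driver and executor JVMs."""
    return {
        "spark.driver.extraJavaOptions": flag,
        "spark.executor.extraJavaOptions": flag,
    }


def build_session(app_name="koalas-to-datetime"):
    """Create a SparkSession with the flag applied (hypothetical helper).

    Assumption: no SparkSession/SparkContext exists yet in this process,
    since JVM options cannot be changed after the JVM has started.
    """
    from pyspark.sql import SparkSession  # lazy import; requires pyspark

    builder = SparkSession.builder.appName(app_name)
    for key, value in arrow_jvm_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling build_session() before the first Koalas operation should let Koalas pick up that session. If the session is launched some other way (for example through spark-submit or a notebook kernel), the same two settings would need to be supplied there instead.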
Jared Zhao commented
Any updates here? I am seeing this issue as well.