capeprivacy / cape-dataframes

Privacy transformations on Spark and Pandas dataframes backed by a simple policy language.

Home Page: https://docs.capeprivacy.com

Spark UserWarning for Deprecated UDF Feature

kjam opened this issue

Describe the bug
When using Cape Python with PySpark on Spark 3.0.0, a warning is emitted about type hints, noting that we are using a deprecated UDF feature:

/usr/local/spark/python/pyspark/sql/pandas/functions.py:386: UserWarning: In Python 3.6+ and Spark 3.0+, it is preferred to specify type hints for pandas UDF instead of specifying pandas UDF type which will be deprecated in the future releases. See SPARK-28264 for more details.
  "in the future releases. See SPARK-28264 for more details.", UserWarning)

To Reproduce
Follow along with the IoT example notebook on a Spark 3.0.0 installation.
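
For a smaller reproduction outside the notebook, something like the sketch below should raise the same warning on Spark 3.0.0. It assumes the warning comes from a `pandas_udf` call that passes an explicit `PandasUDFType`. `add_one` and `collect_udf_warnings` are hypothetical names used only for illustration and are not Cape Python's actual code.

```python
import warnings

import pandas as pd


def add_one(values: pd.Series) -> pd.Series:
    # Hypothetical stand-in for a Cape transformation; not Cape's actual code.
    return values + 1.0


def collect_udf_warnings():
    """Build a pandas UDF the pre-3.0 way and return any warning messages raised.

    The pyspark imports are local so this module loads without pyspark installed.
    """
    from pyspark.sql.functions import PandasUDFType, pandas_udf
    from pyspark.sql.types import DoubleType

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Passing an explicit PandasUDFType is the usage the warning refers to.
        pandas_udf(add_one, DoubleType(), PandasUDFType.SCALAR)
    return [str(w.message) for w in caught]
```

Calling `collect_udf_warnings()` in an environment with PySpark 3.0.0 and PyArrow installed should return a list containing the SPARK-28264 message.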

Expected behavior
No Warning message - we should try to match Spark's latest API if that is the recommended version to use with Cape Python.
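
As a rough sketch of what matching the newer API could look like, the Spark 3.0 style drops the explicit `PandasUDFType` and lets Spark infer the UDF kind from the Python type hints. `add_one_hinted` and `make_type_hinted_udf` are hypothetical names, and this only assumes Cape's transformations can be expressed as Series-to-Series functions; it is not a patch against the real code.

```python
import pandas as pd


def add_one_hinted(values: pd.Series) -> pd.Series:
    # Hypothetical Series-to-Series transformation; the pd.Series hints are
    # what Spark 3.0 inspects to infer a scalar pandas UDF.
    return values + 1.0


def make_type_hinted_udf():
    """Wrap the function as a pandas UDF without passing PandasUDFType.

    The pyspark imports are local so this module loads without pyspark installed.
    """
    from pyspark.sql.functions import pandas_udf
    from pyspark.sql.types import DoubleType

    # No functionType argument: the UDF kind comes from the type hints above.
    return pandas_udf(add_one_hinted, returnType=DoubleType())
```

The result of `make_type_hinted_udf()` would then be applied to a column in the usual way, for example `df.select(udf("reading"))`, where `reading` is a placeholder column name.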

Desktop (please complete the following information):

  • OS: Ubuntu
  • OS Version: 18.04.4
  • Python Version: 3.6.9
  • Installed pip packages: cape!

Additional context
I am running the latest Spark 3.0.0 package for Linux.