dotnet / spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Home Page: https://dot.net/spark

Error trying to load JSON file from Spark with Delta enabled

relcodedev opened this issue

I am trying to get .NET for Apache Spark to write a Delta table. Is this even possible?

I have completed the tutorial, which works great for writing to Parquet format. I tried researching how to write a Delta table but am unable to get it working.

I tried running:

spark-submit --jars ./bin/Release/delta-app/delta-core_2.13-2.1.0.jar --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog --class org.apache.spark.deploy.dotnet.DotnetRunner --master local ./bin/Release/delta-app/microsoft-spark-3-2_2.12-2.1.1.jar dotnet delta-app.dll

ERROR DotnetBackendHandler: Failed to execute 'read' on 'org.apache.spark.sql.SparkSession' with args=()
[2022-09-27T14:57:59.6545202Z] [pop] [Error] [JvmBridge] JVM method execution failed: Nonstatic method 'read' failed for class '6' when called with no arguments
[2022-09-27T14:57:59.6545754Z] [pop] [Error] [JvmBridge] java.lang.NoClassDefFoundError: scala/collection/SeqOps

It seems to fail on spark.Read().

Code in Main function:

    // Requires: using Microsoft.Spark.Sql;

    // Build (or reuse) the Spark session.
    SparkSession spark =
        SparkSession
            .Builder()
            .AppName("JsonToDelta")
            .GetOrCreate();

    var path = "./bin/Release/delta-app/docs.jsonl";

    // Read the JSON Lines file into a DataFrame.
    var docDF = spark.Read().Json(path);

    // Write the DataFrame out in Delta format.
    docDF.Write()
        .Format("delta")
        .Mode("overwrite")
        .Save("doc");

Closing this ticket; the issue was with the version of the delta.io package I included. The delta-core_2.13 jar targets Scala 2.13, while the microsoft-spark-3-2_2.12 jar (and the Spark build it runs on) targets Scala 2.12, which is why the JVM fails with NoClassDefFoundError: scala/collection/SeqOps.
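
For anyone hitting the same error, the fix is to use a Delta build whose Scala version matches the Spark and .NET worker jars. A hedged example, assuming a local Spark 3.2 / Scala 2.12 setup and a Delta release built for that combination (the jar path and exact version below are assumptions, e.g. delta-core_2.12 2.0.x for Spark 3.2):

spark-submit --jars ./bin/Release/delta-app/delta-core_2.12-2.0.0.jar --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog --class org.apache.spark.deploy.dotnet.DotnetRunner --master local ./bin/Release/delta-app/microsoft-spark-3-2_2.12-2.1.1.jar dotnet delta-app.dll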