dotnet / spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Home Page: https://dot.net/spark

Question: Can I use the Spark extension to read a Delta table?

funkysandman opened this issue

Looking for help.
I have a Delta table generated in Databricks and stored in ADLS Gen2 storage.

Outside of Databricks, I'd like to read the Delta table into a .NET app directly from ADLS Gen2.
Is there an example of how I might accomplish this with this extension?

Thanks for any help!

To read a Delta table from .NET, you need Spark in the middle. A local Spark installation cannot read Delta tables out of the box: the Delta.io package has to be added to Spark first, using a command similar to:

spark-shell --packages io.delta:delta-core_2.xx:2.x.x --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"

where 2.xx is the Scala version of your Spark build and 2.x.x is a version of the Delta.io package that is compatible with your Spark version.
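As a rough sketch of how the placeholders in the command fit together, the shell helpers below assemble the Delta arguments from a Scala version and a Delta version. The function names delta_package and delta_spark_args are hypothetical (they are not part of Spark or Delta), and the versions in the usage line are example values only; check the Delta release notes for the version that matches your Spark.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Spark or Delta.

# Print the Maven coordinate of the Delta package.
#   $1 = Scala version of the Spark build (assumed example: 2.12)
#   $2 = Delta version compatible with that Spark (assumed example: 2.0.0)
delta_package() {
  printf 'io.delta:delta-core_%s:%s\n' "$1" "$2"
}

# Print, one per line, the arguments that enable Delta for
# spark-shell or spark-submit.
delta_spark_args() {
  printf '%s\n' \
    "--packages" \
    "$(delta_package "$1" "$2")" \
    "--conf" \
    "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
    "--conf" \
    "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
}

# Example usage (assumed versions; uncomment to run):
# spark-shell $(delta_spark_args 2.12 2.0.0)
```

The same arguments should also be accepted by spark-submit, so they can be added to the command that launches a .NET for Spark application. None of the values contain spaces, which is why the unquoted command substitution splits them into separate arguments correctly.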

Spark 3.2.1 is currently the highest version supported by ".NET for Spark", and it has a bug that makes installing the Delta.io package on Windows fail with an exception. The bug is fixed in Spark 3.2.2, but ".NET for Spark" doesn't support that version yet.

@Niharikadutta
@AFFogarty
Could one of you help release .NET for Spark support for Spark 3.2.2, including Delta.io package support?

Azure Synapse Runtime for Apache Spark 3.3 is now in Public Preview

Thx