sparklyr / sparklyr

R interface for Apache Spark

Home Page: https://spark.rstudio.com/

Copy files to Databricks Managed Volumes

Zurina opened this issue · comments

Hi,

I'm wondering whether sparklyr will make it possible to copy local files to Databricks Managed Volumes, as Databricks Connect already allows in Python: https://pypi.org/project/databricks-connect/

In Databricks Connect for Python, once the Spark session object has been created, you can call a method named "copyFromLocalToFs", which copies a local file to a folder location in Databricks, such as a managed volume.

Example of code doing so in Python:

from databricks.connect import DatabricksSession

# Placeholders for the target workspace and cluster
workspace_url = "<workspace-url>"
cluster_id = "<cluster-id>"
token = "<oauth-token-or-pat>"

# Build a remote Spark session against the Databricks cluster
session = (
    DatabricksSession.builder
    .remote(f"sc://{workspace_url};token={token};x-databricks-cluster-id={cluster_id}")
    .getOrCreate()
)

# Copy a local file into a managed volume
session.copyFromLocalToFs("./google.png", "/Volumes/main/my_schema/my-volume/google.png")

Hi @Zurina, thank you for the request. I'm going to move this issue to the pysparklyr repo so I can keep track of it: mlverse/pysparklyr#54