devlace / azure-databricks-storage

Different ways to connect to storage in Azure Databricks

The following is a summary of the various ways to connect to Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2 from Azure Databricks.

To download all sample notebooks, here is the DBC archive you can import to your workspace.

| How to connect | Scope of connection | Authentication | Authorization Requirements | Code Sample | Docs/Supported Storage |
| --- | --- | --- | --- | --- | --- |
| Direct connect | Typically SparkSession* | Storage Key | All rights | Python, SQL | Blob |
| | | OAuth via Service Principal (SP) | **SP has correct RBAC role assigned OR ACL permissions to files/folders in ADLS Gen2 | Python, SQL | ADLS Gen2 |
| | | AD Passthrough | **User has correct RBAC role assigned OR ACL permissions to files/folders in ADLS Gen2 | Python, SQL | ADLS Gen2 |
| Mount on DBFS | Databricks Workspace | Storage Key | All rights | Python | Blob, ADLS Gen2 |
| | | OAuth via Service Principal (SP) | **SP has correct RBAC role assigned OR ACL permissions to files/folders in ADLS Gen2 | Python | ADLS Gen2 |

*This will depend on where Spark Configuration is set. This is typically set on the SparkSession of the running notebook and therefore scoped to only that SparkSession.
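As a rough illustration of the "Direct connect" rows, the sketch below builds the Spark configuration entries for the storage key and Service Principal options. The helper names and all argument values are hypothetical and are not taken from the sample notebooks; the configuration keys are the standard Hadoop Azure settings. It assumes it runs in a Databricks notebook where `spark` is already defined when the settings are applied.

```python
def storage_key_conf(account: str, key: str) -> dict:
    """Spark setting for direct access to Blob Storage with a storage account key."""
    return {f"fs.azure.account.key.{account}.blob.core.windows.net": key}


def service_principal_conf(account: str, client_id: str,
                           client_secret: str, tenant_id: str) -> dict:
    """Spark settings for direct access to ADLS Gen2 via OAuth with a Service Principal."""
    host = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


# In a notebook (placeholders in angle brackets are yours to fill in):
# for name, value in service_principal_conf("<account>", "<app-id>",
#                                           "<secret>", "<tenant-id>").items():
#     spark.conf.set(name, value)  # scoped to this notebook's SparkSession
# df = spark.read.csv("abfss://<container>@<account>.dfs.core.windows.net/<path>")
```

Because the values are set with `spark.conf.set`, they apply only to the SparkSession of the notebook that runs them, which is the scoping the footnote describes.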

**IMPORTANT NOTE on Authorization requirements

You must specifically assign one of the following RBAC roles to the Service Principal or User. See here for more information.

  • Storage Blob Data Owner
  • Storage Blob Data Contributor
  • Storage Blob Data Reader

NOTE: The general Owner/Contributor role is insufficient on its own.

For more granular access control, you can use ACLs on folders/files in the ADLS Gen2 Filesystem.
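Once the Service Principal has one of the roles or ACLs described above, the "Mount on DBFS" option with OAuth might look roughly like the following. This is a minimal sketch: `mount_args` is a hypothetical helper, the placeholder values are assumptions, and the actual mount call only works inside a Databricks notebook where `dbutils` is available.

```python
def mount_args(container: str, account: str, mount_point: str,
               client_id: str, client_secret: str, tenant_id: str) -> dict:
    """Keyword arguments for dbutils.fs.mount using OAuth with a Service Principal."""
    return {
        "source": f"abfss://{container}@{account}.dfs.core.windows.net/",
        "mount_point": mount_point,
        "extra_configs": {
            "fs.azure.account.auth.type": "OAuth",
            "fs.azure.account.oauth.provider.type":
                "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
            "fs.azure.account.oauth2.client.id": client_id,
            "fs.azure.account.oauth2.client.secret": client_secret,
            "fs.azure.account.oauth2.client.endpoint":
                f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
        },
    }


# In a notebook (placeholders in angle brackets are yours to fill in):
# dbutils.fs.mount(**mount_args("<container>", "<account>", "/mnt/<name>",
#                               "<app-id>", "<secret>", "<tenant-id>"))
# display(dbutils.fs.ls("/mnt/<name>"))
```

Unlike the direct-connect settings, a mount is visible across the Databricks workspace, so anyone who can read the mount point acts with the Service Principal's permissions.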

Azure Databricks Secrets

For simplicity, none of the examples make use of Azure Databricks secrets.

Azure Databricks Secrets is the recommended way to store sensitive information in Azure Databricks. Essentially, you create Secret Scopes in which you store secrets. Permissions are managed at the Secret Scope level: users with the correct permission on a particular scope can retrieve the secrets within it.
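To illustrate, a storage key could be read from a Secret Scope instead of being pasted into the notebook. The sketch below is an assumption about how that might be wired up: `configure_blob_access` is a hypothetical helper, and the scope and key names in the usage comment are made up. `spark` and `dbutils` are passed in explicitly so that the function does not depend on notebook globals.

```python
def configure_blob_access(spark, dbutils, account: str,
                          scope: str, key_name: str) -> None:
    """Read a storage account key from a Secret Scope and set it on the SparkSession."""
    # dbutils.secrets.get returns the secret value; notebook output redacts it.
    storage_key = dbutils.secrets.get(scope=scope, key=key_name)
    spark.conf.set(
        f"fs.azure.account.key.{account}.blob.core.windows.net",
        storage_key,
    )


# In a notebook (hypothetical scope and key names):
# configure_blob_access(spark, dbutils, "<account>", "my-scope", "storage-key")
# df = spark.read.csv("wasbs://<container>@<account>.blob.core.windows.net/<path>")
```

Only users with read permission on the scope can run this successfully, which keeps the key itself out of the notebook source.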

There are two types of Secret Scopes:
