databrickslabs / dbx

🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.

Home Page: https://dbx.readthedocs.io

Token which works for databricks cli does not work for dbx commands

kenanzh opened this issue

Expected Behavior

dbx commands should not raise invalid access token errors when they are valid with databricks cli.

Current Behavior

 % dbx deploy -e dev --assets-only                                                         
[dbx][2023-08-30 11:07:53.117] 🔎 Deployment file is not provided, searching in the conf directory
[dbx][2023-08-30 11:07:53.118] 💡 Auto-discovery found deployment file conf/deployment.yml
[dbx][2023-08-30 11:07:53.119] 🆗 Deployment file conf/deployment.yml exists and will be used for deployment
[dbx][2023-08-30 11:07:53.119] Starting new deployment for environment dev
[dbx][2023-08-30 11:07:53.120] Using profile provided from the project file
[dbx][2023-08-30 11:07:53.120] Found auth config from provider DbxEnvironmentVariableConfigProvider, verifying it
[dbx][2023-08-30 11:07:53.121] Found auth config from provider DbxEnvironmentVariableConfigProvider, verification successful
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /.../site-packages/databricks_cli/sdk/api_client.py:166 in perform_query                       │
│                                                                                                  │
│   163 │   │   │   │   │   resp = self.session.request(method, self.get_url(path, version=versi   │
│   164 │   │   │   │   │   │   │   │   │   │   │   │   verify = self.verify, headers = headers)   │
│   165 │   │   try:                                                                               │
│ ❱ 166 │   │   │   resp.raise_for_status()                                                        │
│   167 │   │   except requests.exceptions.HTTPError as e:                                         │
│   168 │   │   │   message = e.args[0]                                                            │
│   169 │   │   │   try:                                                                           │
│                                                                                                  │
│ /Users/kenan/Library/Caches/pypoetry/virtualenvs/datascience-databricks-WE8lb4aI-py3.10/lib/pyth │
│ on3.10/site-packages/requests/models.py:1021 in raise_for_status                                 │
│                                                                                                  │
│   1018 │   │   │   )                                                                             │
│   1019 │   │                                                                                     │
│   1020 │   │   if http_error_msg:                                                                │
│ ❱ 1021 │   │   │   raise HTTPError(http_error_msg, response=self)                                │
│   1022 │                                                                                         │
│   1023 │   def close(self):                                                                      │
│   1024 │   │   """Releases the connection back to the pool. Once this method has been            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
HTTPError: 403 Client Error: Forbidden for url: https://dbc-dafad129-c9ae.cloud.databricks.com/api/2.0/workspace/mkdirs

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /.../site-packages/dbx/commands/deploy.py:104 in deploy                                        │
│                                                                                                  │
│   101 │   else:                                                                                  │
│   102 │   │   headers = None                                                                     │
│   103 │                                                                                          │
│ ❱ 104 │   api_client = prepare_environment(environment_name, headers)                            │
│   105 │   additional_tags = parse_multiple(tags)                                                 │
│   106 │   no_rebuild = no_rebuild or no_package                                                  │
│   107                                                                                            │
│                                                                                                  │
│ /.../site-packages/dbx/utils/common.py:31 in prepare_environment                               │
│                                                                                                  │
│   28 def prepare_environment(env_name: str, headers: Optional[Dict[str, str]] = None) -> ApiC    │
│   29 │   info = ProjectConfigurationManager().get(env_name)                                      │
│   30 │   transfer_profile_name(info)                                                             │
│ ❱ 31 │   MlflowStorageConfigurationManager.prepare(info)                                         │
│   32 │   return DatabricksClientProvider.get_v2_client(headers)                                  │
│   33                                                                                             │
│   34                                                                                             │
│                                                                                                  │
│ /.../site-packages/dbx/api/storage/mlflow_based.py:31 in prepare                               │
│                                                                                                  │
│   28 │   @classmethod                                                                            │
│   29 │   def prepare(cls, properties: EnvironmentInfo):                                          │
│   30 │   │   cls._setup_tracking_uri()                                                           │
│ ❱ 31 │   │   cls._prepare_workspace_dir(properties)                                              │
│   32 │   │   cls._setup_experiment(properties)                                                   │
│   33 │                                                                                           │
│   34 │   @classmethod                                                                            │
│                                                                                                  │
│ /.../site-packages/dbx/api/storage/mlflow_based.py:39 in _prepare_workspace_dir                │
│                                                                                                  │
│   36 │   │   api_client = DatabricksClientProvider().get_v2_client()                             │
│   37 │   │   p = str(PurePosixPath(env.properties.workspace_directory).parent)                   │
│   38 │   │   service = WorkspaceService(api_client)                                              │
│ ❱ 39 │   │   service.mkdirs(p)                                                                   │
│   40 │                                                                                           │
│   41 │   @staticmethod                                                                           │
│   42 │   def _get_experiment_safe(name: str) -> Optional[Experiment]:                            │
│                                                                                                  │
│ /.../site-packages/databricks_cli/sdk/service.py:1007 in mkdirs                                │
│                                                                                                  │
│   1004 │   │   _data = {}                                                                        │
│   1005 │   │   if path is not None:                                                              │
│   1006 │   │   │   _data['path'] = path                                                          │
│ ❱ 1007 │   │   return self.client.perform_query('POST', '/workspace/mkdirs', data=_data, header  │
│   1008 │                                                                                         │
│   1009 │   def list(self, path, headers=None):                                                   │
│   1010 │   │   _data = {}                                                                        │
│                                                                                                  │
│ /.../site-packages/databricks_cli/sdk/api_client.py:174 in perform_query                       │
│                                                                                                  │
│   171 │   │   │   │   message += '\n Response from server: \n {}'.format(reason)                 │
│   172 │   │   │   except ValueError:                                                             │
│   173 │   │   │   │   pass                                                                       │
│ ❱ 174 │   │   │   raise requests.exceptions.HTTPError(message, response=e.response)              │
│   175 │   │   return resp.json()                                                                 │
│   176                                                                                            │
│   177                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
HTTPError: 403 Client Error: Forbidden for url: <my-host-redacted>/api/2.0/workspace/mkdirs
 Response from server: 
 {'error_code': '403', 'message': 'Invalid access token.'}

But with the databricks CLI I can, for example, list the available cluster Spark versions:

% databricks clusters spark-versions --profile dev
{
  "versions": [
    {
      "key": "12.2.x-scala2.12",
      "name": "12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12)"
    },
    {
      "key": "11.3.x-photon-scala2.12",
      "name": "11.3 LTS Photon (includes Apache Spark 3.3.0, Scala 2.12)"
    },
...

Steps to Reproduce (for bugs)

1. Generate a new token using the Databricks UI.
2. Initialize a new project using dbx init.
3. Configure a profile using the token.
4. Execute a dbx command.

Context

I am trying to use dbx commands to deploy a job using --from-assets as part of our development workflow. This worked for me until my PAT expired. I generated a new one following the Databricks documentation. I've replaced the token in my .databrickscfg profile, and confirmed I can run databricks commands like listing Spark versions and the workspace filesystem. However, when trying to run dbx commands such as deploy or repo sync, I get {'error_code': '403', 'message': 'Invalid access token.'}.

I don't think I've missed anything, but I hope it's user error!
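
One thing the log above hints at: dbx reports that it took its auth config from DbxEnvironmentVariableConfigProvider, not from the profile file. It is only an assumption (nothing in this thread confirms it), but a DATABRICKS_TOKEN variable in the shell that still holds the expired PAT would explain why the databricks CLI with --profile dev works while dbx is rejected. A minimal Python sketch for comparing the two sources follows; the helper names `read_profile_token` and `diagnose` are hypothetical and not part of dbx or databricks-cli.

```python
import configparser
from typing import Mapping, Optional


def read_profile_token(cfg_text: str, profile: str) -> Optional[str]:
    """Return the token stored for `profile` in .databrickscfg-style INI text."""
    # Interpolation is disabled so a '%' inside a token is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if profile not in parser:
        return None
    return parser[profile].get("token")


def diagnose(env: Mapping[str, str], cfg_text: str, profile: str) -> str:
    """Describe how the environment token relates to the profile token."""
    env_token = env.get("DATABRICKS_TOKEN")
    profile_token = read_profile_token(cfg_text, profile)
    if env_token is None:
        return "no DATABRICKS_TOKEN in environment"
    if profile_token is None:
        return "DATABRICKS_TOKEN is set; profile has no token"
    if env_token == profile_token:
        return "environment and profile tokens match"
    return "environment token differs from profile token"
```

To run it against a real setup, pass `os.environ` and the text of `~/.databrickscfg` (for example `Path.home().joinpath(".databrickscfg").read_text()`) together with the profile name, here `dev`. If the tokens differ, unsetting the variable or updating it to the new PAT would be the next thing to try.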

Your Environment

  • dbx version used: 0.8.15
  • Databricks Runtime version: 13.2

Hello @kenanzh,
Have you tried to reconfigure your dbx? It might be caused by a new update of the databricks-cli.
Please give it a try and let us know.

commented

I have a similar, but slightly different, problem:

For one of my workspaces I get the following error message when I call dbx deploy workflowname -e PROD:

HTTPError: 401 Client Error: Unauthorized for url: https://adb-xxxxx.10.azuredatabricks.net/api/2.0/workspace/mkdirs

This puzzles me as I can use the same token if I directly call

https://adb-xxxxx.10.azuredatabricks.net/api/2.0/workspace/mkdirs

via Postman as a POST request. And it seems to work via dbx for another workspace. How is that possible? Could it be caused by some workspace settings? Or do I have to grant some permissions somewhere to the service principals that I use to run my workflows?

Thanks for any hints.
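
For anyone who wants to repeat the Postman check from a script, the same call can be sketched in Python with the standard library. The traceback earlier in this thread shows that dbx sends a POST to /workspace/mkdirs with a `path` field, so the request below mirrors that; the function name `build_mkdirs_request` and the example host, token, and path are placeholders, not values from this issue.

```python
import json
import urllib.request


def build_mkdirs_request(host: str, token: str, path: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the workspace mkdirs endpoint."""
    url = host.rstrip("/") + "/api/2.0/workspace/mkdirs"
    body = json.dumps({"path": path}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Personal access tokens are sent as a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it with `urllib.request.urlopen(request)` raises `urllib.error.HTTPError` on a 401 or 403, so running it with exactly the host and token that the failing pipeline sees shows whether the credentials or dbx itself is at fault.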

commented

OK, my problem was a protected variable in GitLab, which can only be used by pipelines on protected branches or tags. That makes sense, as this precaution or versioning does not take place in the test environment.
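
A pre-flight check at the start of the pipeline job can make this kind of failure obvious before dbx is called. The sketch below is only an illustration: the helper name `missing_variables` is hypothetical, and it assumes the job passes credentials through DATABRICKS_HOST and DATABRICKS_TOKEN, which this thread does not confirm for the GitLab setup.

```python
from typing import List, Mapping, Sequence


def missing_variables(env: Mapping[str, str], required: Sequence[str]) -> List[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]
```

In a CI script this could be called as `missing_variables(os.environ, ["DATABRICKS_HOST", "DATABRICKS_TOKEN"])`, exiting with a clear message when the list is not empty. A protected variable that is not exposed on an unprotected branch would then surface as a missing variable, not as a 401 from the workspace API.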

@AnastasiaProkaieva Sorry for the slow response here.

I have tried going through the dbx configure process again, but still the same issue. It's weird, as I can successfully run dbx sync repo ... commands but I cannot run dbx deploy .... I can share my project.json file with some redactions if it helps!