mlflow / mlflow-export-import

Bulk import fails with SQLAlchemy store

pimdh opened this issue

Hi, it appears there's a bug that prevents me from importing into the SQLAlchemy store backed by a PostgreSQL database.

I get the following error:

pim@project:~/project$ import-experiment   --experiment-name EXPERIMENT-NAME  --input-dir /tmp/export
2023/09/14 02:08:28 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2023/09/14 02:08:28 INFO mlflow.store.db.utils: Updating database tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 451aebb31d03, add metric step
INFO  [alembic.runtime.migration] Running upgrade 451aebb31d03 -> 90e64c465722, migrate user column to tags
INFO  [alembic.runtime.migration] Running upgrade 90e64c465722 -> 181f10493468, allow nulls for metric values
INFO  [alembic.runtime.migration] Running upgrade 181f10493468 -> df50e92ffc5e, Add Experiment Tags Table
INFO  [alembic.runtime.migration] Running upgrade df50e92ffc5e -> 7ac759974ad8, Update run tags with larger limit
INFO  [alembic.runtime.migration] Running upgrade 7ac759974ad8 -> 89d4b8295536, create latest metrics table
INFO  [89d4b8295536_create_latest_metrics_table_py] Migration complete!
INFO  [alembic.runtime.migration] Running upgrade 89d4b8295536 -> 2b4d017a5e9b, add model registry tables to db
INFO  [2b4d017a5e9b_add_model_registry_tables_to_db_py] Adding registered_models and model_versions tables to database.
INFO  [2b4d017a5e9b_add_model_registry_tables_to_db_py] Migration complete!
INFO  [alembic.runtime.migration] Running upgrade 2b4d017a5e9b -> cfd24bdc0731, Update run status constraint with killed
INFO  [alembic.runtime.migration] Running upgrade cfd24bdc0731 -> 0a8213491aaa, drop_duplicate_killed_constraint
INFO  [alembic.runtime.migration] Running upgrade 0a8213491aaa -> 728d730b5ebd, add registered model tags table
INFO  [alembic.runtime.migration] Running upgrade 728d730b5ebd -> 27a6a02d2cf1, add model version tags table
INFO  [alembic.runtime.migration] Running upgrade 27a6a02d2cf1 -> 84291f40a231, add run_link to model_version
INFO  [alembic.runtime.migration] Running upgrade 84291f40a231 -> a8c4a736bde6, allow nulls for run_id
INFO  [alembic.runtime.migration] Running upgrade a8c4a736bde6 -> 39d1c3be5f05, add_is_nan_constraint_for_metrics_tables_if_necessary
INFO  [alembic.runtime.migration] Running upgrade 39d1c3be5f05 -> c48cb773bb87, reset_default_value_for_is_nan_in_metrics_table_for_mysql
INFO  [alembic.runtime.migration] Running upgrade c48cb773bb87 -> bd07f7e963c5, create index on run_uuid
INFO  [alembic.runtime.migration] Running upgrade bd07f7e963c5 -> 0c779009ac13, add deleted_time field to runs table
INFO  [alembic.runtime.migration] Running upgrade 0c779009ac13 -> cc1f77228345, change param value length to 500
INFO  [alembic.runtime.migration] Running upgrade cc1f77228345 -> 97727af70f4d, Add creation_time and last_update_time to experiments table
INFO  [alembic.runtime.migration] Running upgrade 97727af70f4d -> 3500859a5d39, Add Model Aliases table
INFO  [alembic.runtime.migration] Running upgrade 3500859a5d39 -> 7f2a7d5fae7d, add datasets inputs input_tags tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
MLflow Tracking URI: postgresql+psycopg2://postgres:POSTGRESPW@/mlflow
Options:
  input_dir: /tmp/export
  experiment_name: EXPERIMENT-NAME
  import_source_tags: False
  use_src_user_id: False
  dst_notebook_dir: None
in_databricks: False
importing_into_databricks: False
MLflowClient: postgresql+psycopg2://postgres:POSTGRESPW@/mlflow
Created experiment 'EXPERIMENT-NAME' with location '/usr2/pim/equi-geo-algebra/mlruns/1'
Importing 60 runs into experiment 'EXPERIMENT-NAME' from '/tmp/export'
Importing run from '/tmp/export/0486209ec8b24748a3bfed37572cda7f'
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "experiments_name_key"
DETAIL:  Key (name)=(EXPERIMENT-NAME) already exists.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/mlflow/store/tracking/sqlalchemy_store.py", line 266, in create_experiment
    eid = session.query(SqlExperiment).filter_by(name=name).first().experiment_id
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 2824, in first
    return self.limit(1)._iter().first()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 2916, in _iter
    result = self.session.execute(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 1662, in execute
    ) = compile_state_cls.orm_pre_session_exec(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/context.py", line 312, in orm_pre_session_exec
    session._autoflush()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2259, in _autoflush
    util.raise_(e, with_traceback=sys.exc_info()[2])
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2248, in _autoflush
    self.flush()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3444, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3584, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3544, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py", line 630, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
    _emit_insert_statements(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py", line 1238, in _emit_insert_statements
    result = connection._execute_20(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1705, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1572, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 2124, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "experiments_name_key"
DETAIL:  Key (name)=(EXPERIMENT-NAME) already exists.

[SQL: INSERT INTO experiments (name, artifact_location, lifecycle_stage, creation_time, last_update_time) VALUES (%(name)s, %(artifact_location)s, %(lifecycle_stage)s, %(creation_time)s, %(last_update_time)s) RETURNING experiments.experiment_id]
[parameters: {'name': 'EXPERIMENT-NAME', 'artifact_location': None, 'lifecycle_stage': 'active', 'creation_time': 1694682509062, 'last_update_time': 1694682509062}]
(Background on this error at: https://sqlalche.me/e/14/gkpj)

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/bin/import-experiment", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mlflow_export_import/experiment/import_experiment.py", line 94, in main
    importer.import_experiment(experiment_name, input_dir, dst_notebook_dir)
  File "/usr/local/lib/python3.8/dist-packages/mlflow_export_import/experiment/import_experiment.py", line 67, in import_experiment
    dst_run, src_parent_run_id = self.run_importer.import_run(exp_name, os.path.join(input_dir, src_run_id), dst_notebook_dir)
  File "/usr/local/lib/python3.8/dist-packages/mlflow_export_import/run/import_run.py", line 69, in import_run
    res = self._import_run(exp_name, input_dir, dst_notebook_dir)
  File "/usr/local/lib/python3.8/dist-packages/mlflow_export_import/run/import_run.py", line 75, in _import_run
    exp_id = mlflow_utils.set_experiment(self.mlflow_client, self.dbx_client, dst_exp_name)
  File "/usr/local/lib/python3.8/dist-packages/mlflow_export_import/common/mlflow_utils.py", line 50, in set_experiment
    exp_id = mlflow_client.create_experiment(exp_name, tags=tags)
  File "/usr/local/lib/python3.8/dist-packages/mlflow/tracking/client.py", line 557, in create_experiment
    return self._tracking_client.create_experiment(name, artifact_location, tags)
  File "/usr/local/lib/python3.8/dist-packages/mlflow/tracking/_tracking_service/client.py", line 236, in create_experiment
    return self.store.create_experiment(
  File "/usr/local/lib/python3.8/dist-packages/mlflow/store/tracking/sqlalchemy_store.py", line 269, in create_experiment
    raise MlflowException(
mlflow.exceptions.MlflowException: Experiment(name=EXPERIMENT-NAME) already exists. Error: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "experiments_name_key"
DETAIL:  Key (name)=(EXPERIMENT-NAME) already exists.

[SQL: INSERT INTO experiments (name, artifact_location, lifecycle_stage, creation_time, last_update_time) VALUES (%(name)s, %(artifact_location)s, %(lifecycle_stage)s, %(creation_time)s, %(last_update_time)s) RETURNING experiments.experiment_id]
[parameters: {'name': 'EXPERIMENT-NAME', 'artifact_location': None, 'lifecycle_stage': 'active', 'creation_time': 1694682509062, 'last_update_time': 1694682509062}]
(Background on this error at: https://sqlalche.me/e/14/gkpj)

This issue appears to come from the fact that the exception handler here only handles a RestException raised by the REST store, not the MlflowException raised by the SQLAlchemy store when the experiment already exists.
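
For what it's worth, here is a minimal sketch of how the helper could cover both stores. This is not the project's actual code: the simplified signature (it drops the Databricks client argument that the real mlflow_utils.set_experiment takes) and the error-code check are assumptions based on the traceback above.

from mlflow.exceptions import MlflowException
from mlflow.protos.databricks_pb2 import RESOURCE_ALREADY_EXISTS, ErrorCode

def set_experiment(mlflow_client, exp_name, tags=None):
    # Hypothetical sketch: create the experiment, or reuse it if it already exists.
    # Catching MlflowException (the base class) covers both the REST store, whose
    # RestException subclasses it, and the SQLAlchemy store, which raises a plain
    # MlflowException for a duplicate experiment name.
    try:
        return mlflow_client.create_experiment(exp_name, tags=tags)
    except MlflowException as e:
        if e.error_code == ErrorCode.Name(RESOURCE_ALREADY_EXISTS):
            return mlflow_client.get_experiment_by_name(exp_name).experiment_id
        raise

With something along those lines, re-running import-experiment against an experiment name that already exists in the PostgreSQL-backed store would fall back to reusing the existing experiment instead of crashing.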