allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

Home Page: https://clear.ml/docs

Dataset uploading error

dmg-ai opened this issue

Describe the bug

I am trying to upload a new version of a dataset (the first version has already been downloaded), but I get the following error, which has never happened before:

2024-05-06 14:21:37,529 - clearml.Task - ERROR - Action failed <400/801: projects.create/v1.0 (Value combination already exists (unique field already contains this value): name=new_project/ocr/.datasets/aocr_dataset_v2, company=***)> (name=/new_project/ocr/.datasets/aocr_dataset_v2, description=)
2024-05-06 14:21:37,622 - clearml.Task - ERROR - Action failed <400/801: projects.create/v1.0 (Value combination already exists (unique field already contains this value): name=new_project/ocr/.datasets/aocr_dataset_v2, company=***)> (name=/new_project/ocr/.datasets/aocr_dataset_v2, description=)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/clearml/backend_interface/util.py", line 89, in get_or_create_project
    return _get_or_create_project(session, project_name, description=description, system_tags=system_tags, project_id=project_id)
  File "/usr/local/lib/python3.8/site-packages/clearml/backend_interface/util.py", line 130, in _get_or_create_project
    res = session.send(
  File "/usr/local/lib/python3.8/site-packages/clearml/backend_interface/base.py", line 113, in send
    return self._send(session=self.session, req=req, ignore_errors=ignore_errors, raise_on_errors=raise_on_errors,
  File "/usr/local/lib/python3.8/site-packages/clearml/backend_interface/base.py", line 107, in _send
    raise SendError(res, error_msg)
clearml.backend_interface.session.SendError: Action failed <400/801: projects.create/v1.0 (Value combination already exists (unique field already contains this value): name=new_project/ocr/.datasets/aocr_dataset_v2, company=***)> (name=/new_project/ocr/.datasets/aocr_dataset_v2, description=)

To reproduce

from clearml import Dataset

if __name__ == "__main__":
    dataset = Dataset.create(
        dataset_name="aocr_dataset_v2",
        dataset_project="/new_project/ocr",
        description="The 3/5 parts of the source aocr_dataset_v2 dataset.",
    )
    dataset.add_files("/project/ocr/ocr_ready_dataset")
    dataset.upload()
    dataset.finalize()

Expected behaviour

The new dataset version is uploaded without errors.

Environment

  • Server type: self-hosted
  • ClearML SDK Version: clearml==1.12.1 (also tried clearml==1.15.1 - getting the same error)
  • ClearML Server Version: WebApp: 1.10.0-357 • Server: 1.10.0-357 • API: 2.24
  • Python Version: 3.8.18
  • OS: Ubuntu 20.04
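
A possible workaround, offered as a sketch rather than a confirmed fix: in the log above, the request is sent with name=/new_project/ocr/.datasets/aocr_dataset_v2 (leading slash), while the server reports the existing project as name=new_project/ocr/.datasets/aocr_dataset_v2 (no leading slash). Assuming that mismatch is what makes the project lookup miss and the subsequent create call collide, stripping empty path segments from dataset_project before calling Dataset.create may avoid the error. The helper names normalize_project_name and create_dataset_version are hypothetical and not part of ClearML.

```python
def normalize_project_name(name: str) -> str:
    """Drop empty path segments, so leading, trailing, and doubled slashes vanish.

    Hypothetical helper, not part of the ClearML SDK.
    """
    return "/".join(part for part in name.split("/") if part)


def create_dataset_version(dataset_name, dataset_project, files_path, description=""):
    """Create, upload, and finalize a dataset under a normalized project name.

    Assumption: the leading slash in the project name is what triggers the
    'Value combination already exists' error; this has not been confirmed.
    """
    # Imported lazily so normalize_project_name works without clearml installed.
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_name=dataset_name,
        dataset_project=normalize_project_name(dataset_project),
        description=description,
    )
    dataset.add_files(files_path)
    dataset.upload()
    dataset.finalize()
    return dataset
```

With the values from the report, create_dataset_version("aocr_dataset_v2", "/new_project/ocr", "/project/ocr/ocr_ready_dataset") would pass "new_project/ocr" as the project. If the error persists with the normalized name, the leading slash is not the cause.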