📫 Feedback on usage issues: contact.viesc@gmail.com / Professional consulting for server equipment: tensor.station@gmail.com
English | 简体中文
If you wish to refer to YMIR in your work, please use the following BibTeX entry.
@inproceedings{huang2021ymir,
title={YMIR: A Rapid Data-centric Development Platform for Vision Applications},
author={Phoenix X. Huang and Wenze Hu and William Brendel and Manmohan Chandraker and Li-Jia Li and Xiaoyu Wang},
booktitle={Proceedings of the Data-Centric AI Workshop at NeurIPS},
year={2021},
}
Version 2.0.0 updated on 11/08/2022
YMIR platform
- A new model performance diagnosis module.
- A new function for visual evaluation of model inference results.
- Adding a public algorithm library with a variety of built-in high-precision algorithms.
- One-click deployment function, supporting the deployment of algorithms to prerequisite certified devices.
- New operating instruction.
- Refactory code structure.
Docker
- Support yolov5
- Support mmdetection
- Support yolov7
- Support detectron2
- Support nanodet
- Support vidt: An Extendable, Efficient and Effective Transformer-based Object Detector
- Support ymir image testing tool library
- Support demo sample image creation documentation
- Support ymir image development extension library
View more ymir-executor-fork
Within the public dockerimage
- Update yolov5 training image: youdaoyzbx/ymir-executor:ymir2.0.0-yolov5-cu111-tmi
- Update mmdetection training image: youdaoyzbx/ymir-executor:ymir2.0.0-mmdet-cu111-tmi
- Update yolov5 image with rv1126 chip deployment support: youdaoyzbx/ymir-executor:ymir2.0.0-yolov5-cu111-tmid
More code updates ymir-dev.
YMIR supports deploying the trained model and public algorithm model directly to the certified device, for more hardware specs, please check the details.
Catalog
- Citations
- What's New
- Depolyment Prerequisite (optional)
- 1. Introduction to AI SUITE-YMIR
- 2. Installation
- 3. Use YMIR-GUI: typical model production process
- 4. For advanced users: YMIR-CMD (command line) user's guide
- 5. Get the code
- 6. Design concept
- 7. MISC
As a streamlined model development product, YMIR(You Mine In Recursion) focuses on the dataset versioning and model iteration in the AI SUITE open-source series.
AI commercialization is currently reaching a stage of maturity in terms of computing hardwares, algorithms, etc. The adoption of AI often encounter challenges such as a lack of skilled developers, high development costs and long iteration cycles.
As a platform, YMIR provides an end-to-end AI development system. This platform reduces costs for companies using artificial intelligence and accelerates the adoption of artificial intelligence. YMIR provides ML developers with one-stop services for data processing, model training, and other steps required in the AI development cycle.
The YMIR platform provides effective model development capabilities with a data-centric approach. The platform integrates concepts such as active learning methods, data and model version control, and workspace. Parallel rapid iteration of data sets and projects for multiple specific tasks are realized by YMIR. The platform uses an open API design, so third-party tools can also be integrated into the system.
A typical model development process can usually be summarized in a few steps: defining the task, preparing the data, training the model, evaluating the model, and deploying the model.
-
Define the target: Before starting an AI development project, it is important to be clear about what is to be analyzed. This will help developers correctly convert the problem into several typical ML modeling tasks, such as image classification, object detection, etc. Different tasks have different data requirements.
-
Prepare data: Data preparation is the foundation of a successful AI project. The most important task in this step is to ensure the quality of data and its annotations. Collect all the required data at the beginning is the optimal situation for many projects. Therefore, the project developer may find that some data is missing in subsequent stages. Additional data could be necessary upon the project needs.
-
Train model: This operation is commonly referred to "modeling". This step refers to the exploration and analysis of prepared data to discover the internal patterns and any links between the input and the expected prediction target. The result of this step is usually one or more machine learning models. These models can be applied to new data to obtain predictions. Developers train their own models using mainstream model training frameworks, such as pytorch, tensorflow, darknet, etc.
-
Evaluate model: The entire development process has not yet been completed after training the model. Models need to be evaluated and checked before being put into production. Normally, get a production-quality model all at once is not so easy. You need to adjust parameters, and iterate the model continuously. Some common metrics can help you evaluate models quantitatively and pick a satisfactory model. Some common metrics can help you to evaluate models quantitatively.
-
Deploy model: Models are developed and trained based on previously available data (possibly test data). After a satisfactory model is obtained, it will be applied to real data to make predictions at scale.
YMIR platform mainly meets the needs of users to produce models at scale, provides users with a good and easy-to-use display interface, and facilitates the management and viewing of data and models. The platform contains main functional modules such as project management, tag management, model deployment, system configuration, dockerimage management, etc. It supports the realization of the following main functions.
Function Module | Primary Function | Secondary Function | Function Description |
---|---|---|---|
Project Management | Project Management | Project Editing | Supports adding, deleting, and editing projects and project information |
Project Management | Iteration Management | Iteration Preparation | Supports setting up the dataset and model information needed for iteration |
Project Management | Iteration Management | Iteration Steps | Support to populate the data from the previous round to the next step corresponding to the task parameters |
Project Management | Iteration Management | Iteration Charts | Support to display the datasets and models generated during the iterative process in the interface as a graphical comparison |
Project Management | Dataset Management | Import datasets | Support users to import prepared datasets by copying public datasets, url addresses, paths, and local imports |
Project Management | Data Set Management | View Data Sets | Supports visualization of image data and annotations, and viewing of historical information |
Project Management | Data Set Management | Edit Data Set | Support editing and deleting data sets |
Project Management | Dataset Management | Dataset Versions | Support creating new dataset versions on the source dataset, with the version number incremented by time |
Project Management | Data Set Management | Data Preprocessing | Support image data fusion, filtering, sampling operations |
Project Management | Data Set Management | Data Mining | Supports finding the most beneficial data for model optimization in a large number of data sets |
Project Management | Data Set Management | Data Annotation | Support for adding annotations to image data |
Project Management | Data Set Management | Data Inference | Supports adding annotations to a data set by specifying a model |
Project Management | Model Management | Model Import | Support local import of model files to the platform |
Project Management | Model Management | Training Models | Support to select datasets, labels, and adjust training parameters to train models according to requirements, and view the corresponding model results after completion |
Project Management | Model Management | Model Validation | Support uploading a single image to check the performance of the model in real images through visualization to verify the accuracy of the model |
Tag management | Tag management | Add tags | Support adding primary names and aliases of training tags |
Model Deployment | Algorithm Management | Public Algorithm | Support algorithm customization, view public algorithms and try them out, support adding to my algorithms |
Model Deployment | Algorithm Management | Public Algorithm | Support publishing my algorithms to public algorithms |
Model Deployment | Algorithm Management | My Algorithms | Support for viewing and editing my published algorithms and added algorithms |
Model Deployment | Algorithm Management | Deploy Algorithms | Support deploying my algorithms to devices and viewing deployment history |
Model Deployment | Device Management | View Devices | Support viewing device information and deployment history |
Model Deployment | Device Management | Edit Device | Support adding, deploying, and deleting devices |
Model Deployment | Device Management | Support Devices | Support viewing and purchasing of supported devices |
System Configuration | Mirror Management | My Mirrors | Support for adding custom mirrors to the system (available to administrators only) |
System Configuration | Mirror Management | Public Mirror | Support for viewing public mirrors uploaded by others and copying them to your own system |
System Configuration | Permissions Configuration | Permissions Management | Support for configuring user permissions (available only to administrators) |
We provide an online trial version for your convenience. If you need, please fill out the Apply for YMIR Trial , and we will send the trial information to your email address.
How do users choose to install GUI or CMD?
-
The GUI verision with the supports of model training and model iteration is more suitable for ordinary users.
-
If you need to modify the default configuration of the system, it is recommended to install CMD;
-
If you have already deployed the existing version of ymir, please refer to the Upgrade Instructions.
This chapter contains the installation instructions for YMIR-GUI. If you need to use CMD, please refer to the Ymir-CMD user guide.
1.NVIDIA drivers shall be properly installed before installing YMIR. For detailed instructions, see https://www.nvidia.cn/geforce/drivers/.
- Docker and Docker Compose installation:
-
docker compose >= 1.29.2, docker >= 20.10
-
Installation of Docker and Docker Compose https://docs.docker.com/get-docker/
-
Installation of
nvidia-docker
nvidia-docker install-guide
## check the maximum CUDA version supported by the host
nvidia-smi
## for Host support cuda 11+, check nvidia-docker
sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
## for Host support cuda 10+, check nvidia-docker
sudo docker run --rm --gpus all nvidia/cuda:10.2-base-ubuntu18.04 nvidia-smi
## those commands should result in a console output shown below:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla P4 Off | 00000000:0B:00.0 Off | 0 |
| N/A 62C P0 55W / 75W | 4351MiB / 7680MiB | 94% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 8132 C 4349MiB |
+-----------------------------------------------------------------------------+
- Hardware Suggestions
-
NVIDIA GeForce RTX 2080 Ti or higher is recommended.
-
The maximum CUDA version supported by the host >= 11.2
The user must ensure that all the conditions in Cuda environment dependencies have been met, otherwise the installation may fail.
The YMIR-GUI project package is on DockerHub and the steps to install and deploy YMIR are as follows:
- Clone the deployment project YMIR to the local server:
git clone https://github.com/IndustryEssentials/ymir.git
- If there is no available GPU and you need to install CPU mode, please change it to CPU boot mode by modifying the .env file to change the SERVER_RUNTIME parameter to runc:
# nvidia for gpu, runc for cpu.
SERVER_RUNTIME=runc
- If you do not need to use the label free labeling platform, you can directly execute the start command with the default configuration:
bash ymir.sh start
.It is recommended not to use thesudo
command, otherwise it may cause insufficient privileges.
- When the service starts, it asks the user if they want to send usage reports to the YMIR development team, the default is yes if you do not enter it.
- If the user needs to use the label free labeling platform, you need to change the ip and port information in the .env configuration file to the address and port number of the labeling tool currently deployed by the user.
# Note format: LABEL_TOOL_HOST_IP=http(s)://(ip)
LABEL_TOOL_HOST_IP=set_your_label_tool_HOST_IP
LABEL_TOOL_HOST_PORT=set_your_label_tool_HOST_PORT
- The default port number for YMIR's Model Deployment module is 18801. If there is a conflict that needs to be modified, you need to go to the YMIR directory and modify the .env file to configure the ModelDeployment port and MySQL access password:
DEPLOY_MODULE_HOST_PORT=18801
DEPLOY_MODULE_URL=${DEPLOY_MODULE_HOST_PORT}
DEPLOY_MODULE_MYSQL_ROOT_PASSWORD=deploy_db_passwd
Execute the start command after the modification: bash ymir.sh start
.
-
After the service successfully started, YMIR will be available at http://localhost:12001/. If you need to stop the service, run the command:
bash ymir.sh stop
-
The default initial user is super administrator, you can check account and password through the .env file under the project path and modify it before deployment. It is recommended to change the password through the user management interface after the service deployment is completed.
Label Sudio is also an external labeling system supported by YMIR and can be installed as an alternative labeling tool.
- In the YMIR directory, modification Env file, configure label studio port:
LABEL_TOOL=label_studio
# Note format: LABEL_TOOL_HOST_IP=http(s)://(ip)
LABEL_TOOL_HOST_IP=set_your_label_tool_HOST_IP
LABEL_TOOL_HOST_PORT=set_your_label_tool_HOST_PORT
- After configuring the label tool (LABEL_TOOL), IP (LABEL_TOOL_HOST_IP), and port (LABEL_TOOL_HOST_PORT) start the installation of label studio command:
docker-compose -f docker-compose.label_studio.yml up -d
It is recommended not to use the sudo
command, as it may result in insufficient privileges.
- Check the status of label studio:
docker-compose -f docker-compose.label_studio.yml ps
The user can access label studio through the default URL http://localhost:12007/. The installation is successful if the login page shows up.
- Configure the Label Studio authorization token
After the user registers and log in to Label Studio, select "Account & Settings" in the upper right corner of the page. Then, copy the token value and paste it into the corresponding location (LABEL_STUDIO_TOKEN) in the .env configuration file of the YMIR project. An example is as follows:
label studio env
LABEL_TOOL_URL=http://(ip):(LABEL_TOOL_PORT)
LABEL_TOOL_PORT=set_your_label_tool_port
LABEL_TOOL_TOKEN="Token token_value"
LABEL_TASK_LOOP_SECONDS=60
Restart ymir after configuring the token (LABEL_STUDIO_TOKEN).
- The command to stop the label studio service is:
docker-compose -f docker-compose.label_studio.yml down
As shown in the figure, YMIR divides the model development process into multiple steps. Details about how to run each step are listed in the subsequent sections.
Data and labels are necessary for the training of deep learning, and the training requires a large amount of data with labels. However, what exists in reality is a large amount of unlabeled data, which is too costly in terms of labor and time if all of them are manually labeled.
Therefore, YMIR platform, through active learning, first attains an initial model by local import or a small amount of labeled data, and uses this initial model to mine the most beneficial data for model capability improvement from a large amount of data. After the mining is completed, only this part of the data is labeled and the original training dataset is expanded efficiently.
The updated dataset is used to train the model again to improve the model capability. The YMIR platform provides a more efficient approach than labeling the entire data and then training it, reducing the cost of labeling low-quality data. Through the cycle of mining, labeling, and training, high quality data is expanded and the model capability is improved.
This section uses a complete model iteration process as an example to illustrate how to use the YMIR platform. Please check Operating Instructions.
This chapter contains the instructions for the YMIR-CMD. If you need to use the GUI, please refer to Ymir-GUI Installation.
Mode I. Pip Installation
# Requires >= Python3.8.10
$ pip install ymir-cmd
$ mir --vesion
Mode II. Installation from the source
$ git clone --recursive https://github.com/IndustryEssentials/ymir.git
$ cd ymir/ymir/command
$ python setup.py clean --all install
$ mir --version
The above figure shows a typical process of model training: 1) the user prepares external data, 2) imports it into the system, 3) appropriately filters the data, and 4) begins training to obtain a model (possibly with low accuracy). 5) selects images in a dataset to be mined that are suitable for further training based on this model, 6) annotates these images, 7) merges the annotated results with the original training set, and 8) uses the merged results to run the training process again to obtain a better model. This section implement the process shown above using the command line. For details, please check the CMD usage instructions.
Any code in the YMIR repo should follow the coding standards and will be checked in the CI tests.
-
Functional code needs to be unit tested.
-
Use flake8 or black to format the code before committing. Both of these follow the PEP8 and Google Python Style style guides.
-
mypy - Python must go through static type checking.
Also check out MSFT Encoding Style for more advice.
Check this document for details.
We use the concept of code version control in Git to manage our data and models, use the concept of branches to create new projects so that different tasks on the same set of images can run in parallel. The additions, retrievals, updates, and deletions of datasets and basic operations are created by commits to branches. Logically, each commit stores an updated version of the dataset or new model, as well as the metadata of the operation that led to this change. Finally, only the data changes are merged into the main branch. This branch conceptually aggregates all the data annotated by many projects on the platform. Please see Life of a dataset for specific design concepts.
Why did the upload of the local dataset fail?
Regardless of whether the dataset has a label file, the images folder and annotations folder must be created. The images are placed in the images folder and the format is limited to jpg, jpeg, and png. The annotation files are placed in the annotations folder and the format is the pascal (when there is no annotation file, the folder is empty). Please put the images and annotations in the same folder and compress them into a ".zip" compressed package (not a .rar compressed format).
How should I obtain training and mining configuration files?
The default profile template needs to be extracted in the mirror.
The training image youdaoyzbx/ymir-executor:ymir2.0.0-yolov5-cu111-tmi
has a configuration file template located at: /img-man/training-template.yaml
Mining and inference mirrors The configuration file templates for youdaoyzbx/ymir-executor:ymir2.0.0-yolov5-cu111-tmi
are located at: /img-man/mining-template.yaml
(mining) and /img-man/infer-template. yaml
(infer).
How can the trained model be used outside the system?
After successful training, the system will output the ID of the model. The user can find the corresponding file according to this id at --model-location
. In fact, it is a tar file that can be extracted directly using the tar command to get the "mxnet" model file in parameters and JSON format.
How to solve the deployment, debugging and operation problems encountered in windows system?
It has not been fully tested on Windows server, so we cannot provide service support for the time being.
How to import models I've already trained?
See this document.