Myriad: A Large Multimodal Model Applying Vision Experts for Industrial Anomaly Detection.

[Paper] [[HF](coming soon)]
Yuanze Li, Haolin Wang, Shihao Yuan, Ming Liu, Debin Zhao, Yiwen Guo, Chen Xu, Guangming Shi, Wangmeng Zuo

TODO

upload Myriad pre-trained weights.
update evaluation guidance.
update training guidance.

Install

Coming soon.

Myriad Weights

Myriad Weights are coming soon.

Demo

The demo code is coming soon.

Train

The training code is already in the repository. Guidance for the two-stage training will be added.

Evaluation

We evaluate Myriad on two public anomaly detection benchmarks, MVTec and VisA. To ensure reproducibility, all models are evaluated with greedy decoding rather than beam search.
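
As a rough illustration of why greedy decoding helps reproducibility, here is a minimal sketch in plain Python. It is not Myriad's evaluation code: `greedy_decode` and `toy_scores` are hypothetical names, and `toy_scores` is a stand-in for a real language model. The point is only that taking the highest-scoring token at every step involves no sampling, so repeated runs on the same input give the same output.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_scores: Callable[[List[int]], Sequence[float]],
    prompt: List[int],
    eos_id: int,
    max_new_tokens: int = 16,
) -> List[int]:
    """Return the newly generated tokens, picking the argmax at every step."""
    tokens = list(prompt)
    generated: List[int] = []
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        # Greedy step: keep only the single highest-scoring token.
        # On ties, max() returns the lowest token id, so the result stays deterministic.
        best = max(range(len(scores)), key=lambda i: scores[i])
        if best == eos_id:
            break
        tokens.append(best)
        generated.append(best)
    return generated


def toy_scores(tokens: List[int]) -> List[float]:
    """Hypothetical stand-in for a model over a 4-token vocabulary (assumption)."""
    vocab_size = 4
    # Favour the token that follows the last one; token 3 plays the role of end-of-sequence.
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


first = greedy_decode(toy_scores, prompt=[0], eos_id=3)
second = greedy_decode(toy_scores, prompt=[0], eos_id=3)
print(first, first == second)
```

In Hugging Face `transformers`, the equivalent behaviour is usually obtained by calling `generate` with `do_sample=False` and `num_beams=1`. Whether Myriad's evaluation scripts expose exactly these options is not stated here, so treat that mapping as an assumption.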

The evaluation code is now public, and the guidance will follow in a few days.

Citation

If you find Myriad useful for your research and applications, please cite using this BibTeX:

@article{Myriad,
  title={Myriad: Large multimodal model by applying vision experts for industrial anomaly detection},
  author={Li, Yuanze and Wang, Haolin and Yuan, Shihao and Liu, Ming and Zhao, Debin and Guo, Yiwen and Xu, Chen and Shi, Guangming and Zuo, Wangmeng},
  journal={arXiv preprint arXiv:2310.19070},
  year={2023}
}

Acknowledgement

MiniGPT-4: the codebase we built upon, and our base model MiniGPT4-v1. Thanks for their clear codebase and their help with reproduction!

Related Projects

About

Open-source code, IAD vision-language datasets, and pre-trained checkpoints for Myriad.

