Guidance on Implementing Image Classification with MediaPipeUnityPlugin

Question

lvxixi4ever opened this issue 17 days ago

lvxixi4ever commented 17 days ago

Plugin Version or Commit ID

v0.14.3

Unity Version

2022.3.27f1c1

Your Host OS

Windows 10

Target Platform

UnityEditor

Description

I would like to express my sincere gratitude for providing such a useful framework that allows me to utilize MediaPipe in Unity. I have a question regarding how to implement an image classification task using the MediaPipeUnityPlugin. Taking FaceLandmarkerRunner as an example, would I need to write a script similar to FaceLandmarkerRunner? Could you please provide a general idea or direction (such as which parts need to be overridden or customized)?
Thank you very much in advance.

Junrou Nishida · Answer 1 · Tue Jul 16 2024 19:04:42 GMT+0800 (China Standard Time)

Is it correct that you want to implement the Image Classification Task API?

If you are implementing it yourself, it would be useful to not only read the current source code but also review the Pull Requests from when previous Task APIs were added.

At the very least, you will need to:

1. Include the internally used Calculator in the library (cf. https://github.com/homuler/MediaPipeUnityPlugin/pull/992/files#diff-01c2854f2c65078fa2a2d333d98fcbf6647c0d6b15900800ac24b03af04ed92b)
2. Compile the protobuf for options passed to the Task Runner (cf. https://github.com/homuler/MediaPipeUnityPlugin/pull/992/files#diff-70333c42487c3bd171af9ed353785c87c250e575f4a71dc5c98d6584ed7dae50)
3. Implement the Task API by inheriting from Core.BaseVisionTaskApi

I think the output will be similar to the AudioClassifier, so it would be good to look at that as well.
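To make "similar to the AudioClassifier" a little more concrete, here is a minimal C++ sketch of the kind of result a classifier task typically produces: a list of heads, each holding scored categories. The type names `Category`, `Classifications`, `ClassificationResult` and the helper `TopCategory` are hypothetical stand-ins written for this illustration; they are not the plugin's actual C# types, so check the AudioClassifier sources for the real shape.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for a classifier result. These are NOT the plugin's
// or MediaPipe's actual types; they only illustrate the general shape.
struct Category {
  int index;                  // index of the label in the model's label map
  float score;                // confidence assigned to this label
  std::string category_name;  // human-readable label
};

struct Classifications {
  int head_index;                    // which classifier head produced these
  std::vector<Category> categories;  // candidate labels with their scores
};

struct ClassificationResult {
  std::vector<Classifications> classifications;
};

// Returns the highest-scoring category of the first head, or nullopt when
// the result contains no categories at all.
std::optional<Category> TopCategory(const ClassificationResult& result) {
  if (result.classifications.empty()) return std::nullopt;
  const auto& categories = result.classifications.front().categories;
  if (categories.empty()) return std::nullopt;
  const Category* best = &categories.front();
  for (const auto& candidate : categories) {
    if (candidate.score > best->score) best = &candidate;
  }
  return *best;
}
```

An image classifier written for the plugin would likely expose something comparable to its Runner, which then only needs to display the top label.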

Taking FaceLandmarkerRunner as an example, would I need to write a script similar to FaceLandmarkerRunner?

Essentially, FaceLandmarker is the important part; FaceLandmarkerRunner just runs it.

lvxixi4ever · Answer 2 · Tue Jul 16 2024 19:18:43 GMT+0800 (China Standard Time)

Thank you very much for your prompt response. I will try following your suggestions.

lvxixi4ever · Answer 3 · Wed Jul 17 2024 20:39:34 GMT+0800 (China Standard Time)

Hi! In my Unity project, I directly imported the unitypackage file into the Assets folder without downloading the MediaPipeUnityPlugin-all.zip file. I saw in your response that I need to modify the BUILD file in the mediapipe_api directory. Therefore, I extracted MediaPipeUnityPlugin-all.zip, copied the mediapipe_api and third_party folders into my Unity project (in the same directory as Assets), and modified the related BUILD files. Is this the correct approach?

Additionally, I noticed that when you created the FaceLandmarker, you modified the third_party/mediapipe_visibility.diff file. Do I need to modify this file as well? I see entries like index aa839d91..8efc28e1 100644, which seem to be auto-generated.

Looking forward to your response. Thank you!

Junrou Nishida · Answer 4 · Wed Jul 17 2024 20:59:33 GMT+0800 (China Standard Time)

Therefore, I extracted MediaPipeUnityPlugin-all.zip, copied the mediapipe_api and third_party folders into my Unity project (in the same directory as Assets), and modified the related BUILD files. Is this the correct approach?

No. After modifying the native code, you need to build the library yourself (see https://github.com/homuler/MediaPipeUnityPlugin?tab=readme-ov-file#hammer_and_wrench-installation).
You don't need to move existing files anywhere else.

Provided everything is done correctly, running the package workflow after making changes will build the library.

Additionally, I noticed that when you created the FaceLandmarker, you modified the third_party/mediapipe_visibility.diff file. Do I need to modify this file as well?

Possibly, yes. This relates to the following step from my earlier comment:

Compile the protobuf for options passed to the Task Runner

Sometimes the visibility of a protobuf file is declared as internal, and to compile it you need to apply a patch. (Alternatively, you can compile it yourself and copy the generated *.cs file manually.)
See #1122 for how to generate the patch file.

lvxixi4ever · Answer 5 · Thu Jul 18 2024 10:39:32 GMT+0800 (China Standard Time)

Thank you for your response! I have a few more questions. Following the landmarker framework, I have now written ImageClassifier.cs (similar to FaceLandmarker.cs), ImageClassifierOptions.cs (similar to FaceLandmarkerOptions.cs), ImageClassifierRunner.cs (similar to FaceLandmarkerRunner.cs), and ImageClassificationConfig.cs (similar to FaceLandmarkerDetectionConfig.cs). I generated ImageClassifierGraphOptions.cs using the protocol buffer compiler and placed it in the Packages\com.github.homuler.mediapipe\Runtime\Scripts\Protobuf\Tasks\Vision\ImageClassifier\Proto directory.

Now, when running ImageClassifierRunner.cs, I encounter an error at taskApi = ImageClassifier.CreateFromOptions(options): "BadStatusException: NOT_FOUND: ValidatedGraphConfig Initialization failed. No registered object with name: mediapipe::tasks::vision::image_classifier::ImageClassifierGraph; Unable to find Calculator 'mediapipe.tasks.vision.image_classifier.ImageClassifierGraph' [MediaPipeTasksStatus='601']". It seems like the Calculator (Graph) is not loading?

Does this issue relate to the need to create my own library as you mentioned? Should I debug the code first before creating my own library? Thank you!

Junrou Nishida · Answer 6 · Thu Jul 18 2024 11:37:43 GMT+0800 (China Standard Time)

Does this issue relate to the need to create my own library as you mentioned?

Yes. You need to include the dependent Calculators and build the library.
https://github.com/homuler/MediaPipeUnityPlugin/pull/992/files#diff-01c2854f2c65078fa2a2d333d98fcbf6647c0d6b15900800ac24b03af04ed92b
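As a rough mental model of why the error appears, the sketch below shows a toy name-to-factory registry in C++. It is not MediaPipe's real registry; `ToyGraphRegistry`, `ToQualifiedName`, and `Validate` are hypothetical names, and the dot-to-`::` conversion is only inferred from the two spellings that appear in the error message. The point is that a correct graph name still fails to resolve when the code registering it was never compiled into the library.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Converts a dotted name into the C++-style spelling seen in the error
// message ("a.b.C" -> "a::b::C").
std::string ToQualifiedName(const std::string& dotted) {
  std::string out;
  for (char c : dotted) {
    if (c == '.') {
      out += "::";
    } else {
      out += c;
    }
  }
  return out;
}

// Toy registry: a hypothetical stand-in, not MediaPipe's implementation.
class ToyGraphRegistry {
 public:
  using Factory = std::function<std::string()>;

  // In this model, only code that was linked into the library ever calls this.
  void Register(const std::string& qualified_name, Factory factory) {
    factories_[qualified_name] = std::move(factory);
  }

  bool IsRegistered(const std::string& dotted_name) const {
    return factories_.count(ToQualifiedName(dotted_name)) > 0;
  }

  // Returns an empty string on success, otherwise a NOT_FOUND-style message.
  std::string Validate(const std::string& dotted_name) const {
    if (IsRegistered(dotted_name)) return "";
    return "NOT_FOUND: No registered object with name: " +
           ToQualifiedName(dotted_name);
  }

 private:
  std::map<std::string, Factory> factories_;
};
```

In this model, the name of a graph that was built into the library resolves, while `mediapipe.tasks.vision.image_classifier.ImageClassifierGraph` yields the NOT_FOUND-style message until the library is rebuilt with that graph included.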

Should I debug the code first before creating my own library?

Sorry, I don't understand what you mean by this question.
If you haven't written any C++ code, I believe you can debug everything in Unity after rebuilding the library.

lvxixi4ever · Answer 7 · Thu Jul 18 2024 20:55:05 GMT+0800 (China Standard Time)

Thank you for your response. I will keep trying based on your description. I suspected that the issue above might be related to the _TASK_GRAPH_NAME setting in the classifier. In AudioClassifier, _TASK_GRAPH_NAME is set to "mediapipe.tasks.audio.audio_classifier.AudioClassifierGraph". This value appears to correspond to audio_classifier_graph.cc, and at runtime it seems to be passed to Calculator.cs, which is generated by the protocol buffer compiler (calculator = pb::ProtoPreconditions.CheckNotNull(value, "value")). In my ImageClassifier.cs, I have already set _TASK_GRAPH_NAME to "mediapipe.tasks.vision.image_classifier.ImageClassifierGraph", following the final line of image_classifier_graph.cc (::mediapipe::tasks::vision::image_classifier::ImageClassifierGraph). However, I still get the error "BadStatusException: NOT_FOUND: ValidatedGraphConfig Initialization failed. No registered object with name: mediapipe::tasks::vision::image_classifier::ImageClassifierGraph; Unable to find Calculator 'mediapipe.tasks.vision.image_classifier.ImageClassifierGraph' [MediaPipeTasksStatus='601']". This is probably because I haven't built my own library yet, so I will rebuild it.

I have an additional question: how does Unity interact with the underlying C++ code? I installed MediaPipeUnityPlugin by importing MediaPipeUnityPlugin.unitypackage directly into the Assets folder, and I noticed that it contains no C++ source code (e.g., audio_classifier_graph.cc). So how does Unity load the Graph and run inference while interacting with the C++ code? Thanks!
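A note on the question above, offered as general background, not as a description of the plugin's internals: Unity packages that wrap C++ code usually ship a precompiled native library and no sources, and the C# scripts call functions exported from that library through P/Invoke (`[DllImport]`). That would explain why audio_classifier_graph.cc is not in the package, and why adding a new graph requires rebuilding the library. The C++ side of such a boundary can be sketched as follows; `toy_has_graph` and the graph name it checks are hypothetical and are not symbols exported by the plugin.

```cpp
#include <cstring>

// Hypothetical exported function; the name is illustrative and is not a
// symbol of the plugin's native library.
extern "C" {

// Returns 1 if this (toy) library was built with the named graph, else 0.
// A C# script could bind to a function like this with a matching
// [DllImport] declaration, without needing any C++ source at runtime.
int toy_has_graph(const char* dotted_name) {
  if (dotted_name == nullptr) return 0;
  // A real library would consult its registry; this sketch hard-codes one name.
  return std::strcmp(dotted_name, "toy.audio.AudioClassifierGraph") == 0 ? 1 : 0;
}

}  // extern "C"
```

The `extern "C"` block gives the function an unmangled name, which is what lets managed code find it in the compiled library.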