[Feature request] GmsDocumentScanner API using InputImage like other MLKit libraries.

Question

zagum opened this issue 3 months ago · comments

Zagumennyi Evgenii commented 3 months ago

What's your feature request? Please describe.

Hi! I'm working on a new feature for our app. I need to allow the user to take a photo of a menu using the camera inside the app (not with an external Camera Intent), then scan this photo to extract the menu with corrected perspective, and then proceed to the next steps. I tried your GmsDocumentScanner library; it does this work perfectly but only with an external camera, which does not work for our app.
It would be great if you could provide an API similar to your other libraries (e.g., TextRecognition) that takes an InputImage as input and returns a Task or something similar.

Mobile environment

Android

Additional context

Android Developer at Simple.Life Apps
https://play.google.com/store/apps/details?id=life.simple

steven · Answer 1 · Fri Mar 08 2024 01:47:52 GMT+0800 (China Standard Time)

Hi zagum,

Are you looking for an API that only does cropping with perspective correction (INPUT: a photo with a menu and background; OUTPUT: a cropped, perspective-corrected Bitmap that shows only the menu), or do you need the entire scanner flow (including filters, auto-rotation, multi-page scanning and editing, export as PDF, etc., just without the camera viewfinder)?

Would either work for you?

Or do you strongly prefer one over the other?

Zagumennyi Evgenii · Answer 2 · Fri Mar 08 2024 04:08:41 GMT+0800 (China Standard Time)

Thank you for your response! For our current needs, we require only cropping and perspective correction. It's essential that these features are integrated with the app's internal camera functionality. Once the cropping and correction are done, we will proceed with a Text Recognition kit for further processing.
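
For readers following the thread, the perspective-correction step described above can be sketched independently of any scanner library. This is a minimal plain-Java sketch, not ML Kit's implementation: it assumes the four document corners are already known, and the class name PerspectiveSketch, its method names, and the sample coordinates are all hypothetical. It solves for the 3x3 homography that maps the detected quadrilateral onto an upright rectangle.

```java
class PerspectiveSketch {

    /**
     * Solves for the homography h (9 values, row-major, h[8] fixed to 1)
     * that maps src[i] to dst[i] for four point pairs given as {x, y}.
     */
    static double[] homography(double[][] src, double[][] dst) {
        // Build the 8x9 augmented matrix: two linear equations per point pair.
        double[][] a = new double[8][];
        for (int i = 0; i < 4; i++) {
            double x = src[i][0], y = src[i][1];
            double u = dst[i][0], v = dst[i][1];
            a[2 * i]     = new double[] {x, y, 1, 0, 0, 0, -u * x, -u * y, u};
            a[2 * i + 1] = new double[] {0, 0, 0, x, y, 1, -v * x, -v * y, v};
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int col = 0; col < 8; col++) {
            int pivot = col;
            for (int r = col + 1; r < 8; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < 1e-12) {
                throw new IllegalArgumentException("Degenerate corner configuration");
            }
            for (int r = 0; r < 8; r++) {
                if (r == col) {
                    continue;
                }
                double f = a[r][col] / a[col][col];
                for (int c = col; c < 9; c++) {
                    a[r][c] -= f * a[col][c];
                }
            }
        }
        double[] h = new double[9];
        for (int i = 0; i < 8; i++) {
            h[i] = a[i][8] / a[i][i];
        }
        h[8] = 1.0;
        return h;
    }

    /** Applies the homography to a single point and returns {u, v}. */
    static double[] apply(double[] h, double x, double y) {
        double w = h[6] * x + h[7] * y + h[8];
        return new double[] {
            (h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w
        };
    }

    public static void main(String[] args) {
        // Hypothetical corners of a menu in a photo: TL, TR, BR, BL.
        double[][] corners = {{120, 80}, {900, 140}, {860, 1300}, {60, 1200}};
        // Target: an upright 800x1100 rectangle.
        double[][] rect = {{0, 0}, {800, 0}, {800, 1100}, {0, 1100}};
        double[] h = homography(corners, rect);
        double[] p = apply(h, 900, 140);
        System.out.println("Top-right corner maps to " + p[0] + ", " + p[1]);
    }
}
```

To produce the corrected image, one would typically solve for the mapping in the opposite direction (rectangle to photo) and sample a source pixel for every output pixel. On Android, android.graphics.Matrix.setPolyToPoly computes an equivalent transform from four point pairs. Finding the corners in the first place is the part this thread asks the library to expose.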

SabrinaGeigerExxeta · Answer 3 · Fri Mar 08 2024 23:00:53 GMT+0800 (China Standard Time)

We would also like to be able to use our own camera and get the position of the document :)

Denis · Answer 4 · Mon Mar 11 2024 15:29:50 GMT+0800 (China Standard Time)

In our app we want to use the camera preview view with document detection, but we need to control the automatic/manual capture mode and be able to trigger the capture manually. We have already implemented correction and cropping of the captured picture, but we lack the auto-rotation feature. So if it is possible to provide just a part of the document scanner (for example, only the camera preview), that would be great!