googlesamples / mlkit

A collection of sample apps to demonstrate how to use Google's ML Kit APIs on Android and iOS

[Feature request] GmsDocumentScanner API using InputImage like other MLKit libraries.

zagum opened this issue · comments

What's your feature request? Please describe.

Hi! I'm working on a new feature for our app. I need to allow the user to take a photo of a menu using the camera inside the app (not with an external Camera Intent), then scan this photo to extract the menu with corrected perspective, and then proceed to the next steps. I tried your GmsDocumentScanner library; it does this work perfectly but only with an external camera, which does not work for our app.
It would be great if you could provide an API similar to your other libraries (e.g., TextRecognition) that takes an InputImage as its input parameter and returns a Task or something similar.

Mobile environment

Android

Additional context

Android Developer at Simple.Life Apps
https://play.google.com/store/apps/details?id=life.simple

Hi zagum,

Are you looking for an API that only does cropping with perspective correction (INPUT: a photo of a menu with background; OUTPUT: a cropped, perspective-corrected Bitmap that shows only the menu), or do you need the entire scanner flow (including filters, auto-rotation, multi-page scanning and editing, export as PDF, etc., just without the camera viewfinder)?

Does either work for you?

Or do you strongly prefer one over the other?

Thank you for your response! For our current needs, we require only cropping and perspective correction. It's essential that these features are integrated with the app's internal camera functionality. Once the cropping and correction are done, we will proceed with a Text Recognition kit for further processing.
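
For readers unfamiliar with what "cropping and perspective correction" involves, here is a minimal sketch of the underlying math in plain Java. It is not an ML Kit API and says nothing about how GmsDocumentScanner works internally. It assumes the four document corners are already known (for example from some detector), and the class name `PerspectiveSketch` and its methods are hypothetical names chosen for illustration. The sketch solves for the eight coefficients of a projective transform that maps the four detected corners onto the corners of an upright rectangle.

```java
/** Illustrative only: a 4-point perspective (homography) solver. Not an ML Kit API. */
public final class PerspectiveSketch {
    private PerspectiveSketch() {}

    /**
     * Returns coefficients {a, b, c, d, e, f, g, h} such that
     *   u = (a*x + b*y + c) / (g*x + h*y + 1)
     *   v = (d*x + e*y + f) / (g*x + h*y + 1)
     * maps each src[i] = {x, y} onto dst[i] = {u, v}. Both arrays hold 4 points.
     */
    public static double[] homography(double[][] src, double[][] dst) {
        // Each point pair contributes two linear equations: an 8x8 system, stored augmented.
        double[][] m = new double[8][];
        for (int i = 0; i < 4; i++) {
            double x = src[i][0], y = src[i][1];
            double u = dst[i][0], v = dst[i][1];
            m[2 * i]     = new double[] {x, y, 1, 0, 0, 0, -u * x, -u * y, u};
            m[2 * i + 1] = new double[] {0, 0, 0, x, y, 1, -v * x, -v * y, v};
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int col = 0; col < 8; col++) {
            int pivot = col;
            for (int r = col + 1; r < 8; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(m[pivot][col]) < 1e-12) {
                throw new IllegalArgumentException("Degenerate corner configuration");
            }
            double[] tmp = m[col];
            m[col] = m[pivot];
            m[pivot] = tmp;
            for (int r = 0; r < 8; r++) {
                if (r == col) {
                    continue;
                }
                double factor = m[r][col] / m[col][col];
                for (int c = col; c < 9; c++) {
                    m[r][c] -= factor * m[col][c];
                }
            }
        }
        double[] h = new double[8];
        for (int i = 0; i < 8; i++) {
            h[i] = m[i][8] / m[i][i];
        }
        return h;
    }

    /** Applies the transform to a single point and returns {u, v}. */
    public static double[] apply(double[] h, double x, double y) {
        double w = h[6] * x + h[7] * y + 1.0;
        return new double[] {
            (h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w
        };
    }
}
```

To actually produce the corrected image, one would typically compute the transform in the opposite direction (rectangle to detected quad) and sample the source photo for every output pixel. On Android, `android.graphics.Matrix.setPolyToPoly` with four points builds the same kind of transform, so a hand-written solver like this is mainly useful for understanding or for non-Android code.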

We would also like to have the possibility to use our camera and get the position of the document :)

In our app we want to use the camera preview view with document detection, but we need to control whether pictures are taken automatically or manually, and to be able to trigger the capture ourselves. We have already implemented correction and cropping of the captured picture, but we lack the auto-rotation feature. So if it is possible to provide just a part of the document scanner (for example, only the camera preview), that would be great!