document scanner rotates the image when it recognizes the document

Question

Tavorc opened this issue 2 months ago · comments

When I use the ML Kit document scanner, the library rotates the recognized document image most of the time (about 95% of scans).
It doesn't matter whether I use automatic or manual mode.

Any idea how to solve it?
Has anyone else run into this?

listvin · Answer 1 · Wed Apr 03 2024 11:30:09 GMT+0800 (China Standard Time)

Does the unwanted rotation also happen with docs in LTR scripts? Are all of your docs receipts?

I suspect that ML Kit's document scanner "UI flow" may be loosely tied to the Text Recognition API, which in turn does not support Hebrew or any other RTL script. Even if it's irrelevant.

As you may have noticed in real life, people who don't know Hebrew often try to read Hebrew documents upside down. I don't know about other RTL scripts, but I guess the fact that almost all Hebrew letters have the same height doesn't help at all.

These are just thoughts; I am not affiliated with Google in any way. I believe it is indeed a bug, since the API seems to be designed to be text-agnostic, especially in manual mode.

If all of your data are receipts of a similar format, you can probably postprocess them at a low level or with Tesseract.
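
As a rough idea of what low-level postprocessing could look like, here is a minimal sketch in plain Java that undoes a wrong rotation on a pixel grid. It assumes you already know how many quarter turns are needed (for example from an orientation detector such as Tesseract's orientation and script detection, or from your own heuristic); the class and method names are hypothetical and not part of ML Kit or Tesseract.

```java
/** Hypothetical helper, not part of ML Kit: rotates a pixel grid by quarter turns. */
final class ImageRotation {

    /**
     * Rotates a non-empty rectangular grid clockwise by quarterTurns * 90 degrees.
     * Negative values rotate counterclockwise. When no rotation is needed,
     * the input array itself is returned.
     */
    static int[][] rotateClockwise(int[][] pixels, int quarterTurns) {
        // Normalize to 0..3 so that -1 means three clockwise turns.
        int turns = ((quarterTurns % 4) + 4) % 4;
        int[][] result = pixels;
        for (int t = 0; t < turns; t++) {
            result = rotateOnce(result);
        }
        return result;
    }

    /** Single 90-degree clockwise turn: row r of the source becomes column (rows - 1 - r). */
    private static int[][] rotateOnce(int[][] src) {
        int rows = src.length;
        int cols = src[0].length;
        int[][] dst = new int[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                dst[c][rows - 1 - r] = src[r][c];
            }
        }
        return dst;
    }
}
```

On Android you would more likely rotate the `Bitmap` itself; the grid version only shows the index mapping.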

listvin · Answer 2 · Wed Apr 03 2024 11:41:57 GMT+0800 (China Standard Time)

Inspired by:

#784 (comment)

It seems that your Android language is English, can you try switching it to Hebrew?

Tavc · Answer 3 · Wed Apr 03 2024 16:49:17 GMT+0800 (China Standard Time)

First of all, thank you.
Yes, all of the docs are receipts; it's a fintech app.
I tried changing the language to Hebrew, but it doesn't work.

There is the OpenCV library that I could use to crop the image, but I didn't want to use it because ML Kit is more innovative.

steven · Answer 4 · Sat Apr 06 2024 05:43:15 GMT+0800 (China Standard Time)

Thanks for the feedback.

There is an auto-rotation step in the scanning flow. The intention is to handle the case where you hold the phone parallel to the table, which may trigger the phone's and camera's auto-rotation logic and result in images taken with the wrong orientation. However, apparently that text-based model doesn't work very well in this case.

What do you think would be better behavior for you? Ideally, the model just handles everything. But if that's not the case, would you prefer an option to turn auto-rotation on/off, or something else you have in mind?

Tavc · Answer 5 · Sun Apr 07 2024 15:09:35 GMT+0800 (China Standard Time)

I think you can find out the orientation of the device. For example, the camera shows a "1x" label that represents the zoom; when I rotate the device, the "1x" label rotates too, so you can probably use this.
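
To make the idea concrete, here is a minimal sketch of snapping a device orientation angle to the nearest quarter turn. It assumes the angle comes from something like Android's `OrientationEventListener.onOrientationChanged(int)`, which reports 0 to 359 degrees, or -1 when the device lies flat and the orientation is unknown. The `QuarterTurn` class and its fallback parameter are hypothetical, and I don't know how the scanner handles this internally.

```java
/** Hypothetical helper: snaps a device orientation angle to 0, 90, 180 or 270. */
final class QuarterTurn {

    /** Same value as Android's OrientationEventListener.ORIENTATION_UNKNOWN. */
    static final int UNKNOWN = -1;

    /**
     * Returns the quarter turn closest to the given angle in degrees.
     * If the angle is UNKNOWN (phone held flat), returns the fallback,
     * e.g. the last known orientation.
     */
    static int snap(int degrees, int fallback) {
        if (degrees == UNKNOWN) {
            return fallback;
        }
        // Bring any other angle into the 0..359 range.
        int normalized = ((degrees % 360) + 360) % 360;
        // Adding 45 before the integer division rounds to the nearest multiple of 90;
        // the modulo wraps 315..359 back to 0.
        return ((normalized + 45) / 90 % 4) * 90;
    }
}
```

The flat-phone case is exactly the one described above, where the sensor cannot tell the orientation, so some fallback is still needed there.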