danpla / dpscreenocr

Program to recognize text on screen

Home Page: https://danpla.github.io/dpscreenocr/


Languages are not displayed after upgrading Ubuntu to 22.10 or later

brandones opened this issue · comments


~ $ apt list --installed 'tesseract-ocr-*'
Listing... Done
tesseract-ocr-eng/lunar,lunar,now 1:4.1.0-2 all [installed]
tesseract-ocr-osd/lunar,lunar,now 1:4.1.0-2 all [installed,automatic]
tesseract-ocr-spa/lunar,lunar,now 1:4.1.0-2 all [installed]

Hi,

Could you please show the output of tesseract --list-langs? Also, did you restart dpScreenOCR after installing the languages?

~ $ tesseract --list-langs
List of available languages in "/usr/share/tesseract-ocr/5/tessdata/" (3):
eng
osd
spa

And yes, I did. Checked with

~ $ ps aux | grep ocr
brandon   854893  0.0  0.0  14436  2408 pts/1    S+   18:48   0:00 grep --color=auto ocr

I tried both v1.3.0 and the current development version on Ubuntu 23.04 (Lunar Lobster), and both work without problems. However, I built them manually because they are not available in the PPA, so the question now is: how was dpScreenOCR installed on your machine?

Yeah I have it installed via apt.

~ $ grep ^ /etc/apt/sources.list /etc/apt/sources.list.d/* | grep dpscreen
/etc/apt/sources.list.d/daniel_p-ubuntu-dpscreenocr-jammy.list:# deb https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu/ kinetic main # disabled on upgrade to kinetic
/etc/apt/sources.list.d/daniel_p-ubuntu-dpscreenocr-jammy.list:# deb-src https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu/ jammy main
/etc/apt/sources.list.d/daniel_p-ubuntu-dpscreenocr-jammy.list.distUpgrade:# deb https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu/ kinetic main # disabled on upgrade to kinetic
/etc/apt/sources.list.d/daniel_p-ubuntu-dpscreenocr-jammy.list.distUpgrade:# deb-src https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu/ jammy main
~ $ apt list dpscreenocr
Listing... Done
dpscreenocr/now 1.3.0-1~jammy1 amd64 [installed,local]

I'm on 23.04 (Lunar).

Thanks for your support with this. Other ideas for how to debug?

So the problem appeared after upgrading Ubuntu from 22.04 (Jammy) to 23.04, and the program worked fine on 22.04, right?

I guess I know what happened. Can you please show the output of ldd `which dpscreenocr` | grep tesseract?

Yeah, that's probably the case.

~ $ ldd $(which dpscreenocr) | grep tesseract
        libtesseract.so.4 => /lib/x86_64-linux-gnu/libtesseract.so.4 (0x00007f589be00000)

Ubuntu 23.04 ships with Tesseract 5, while 22.04 uses Tesseract 4. During the upgrade from 22.04 to 23.04, Tesseract 4 was kept because it is a dependency of the installed dpScreenOCR package, but the language packages were upgraded because, from the package manager's point of view, they are not dependencies. Tesseract 4 still looks for language data in /usr/share/tesseract-ocr/4.00/tessdata/, while the data is now in /usr/share/tesseract-ocr/5/tessdata/.
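For anyone who wants to check for the same mismatch, here is a minimal sketch. It assumes the two tessdata paths quoted in this thread and that the library's soname major version (libtesseract.so.4, libtesseract.so.5) matches the Tesseract major version. The function names are made up for illustration and are not part of dpScreenOCR or Tesseract.

```shell
#!/bin/sh
# Sketch only: the version-to-directory mapping is taken from the paths
# quoted in this thread and may differ on other distributions.

# Print the tessdata directory expected for a given Tesseract major version.
tessdata_dir_for() {
    case "$1" in
        4) echo "/usr/share/tesseract-ocr/4.00/tessdata" ;;
        5) echo "/usr/share/tesseract-ocr/5/tessdata" ;;
        *) return 1 ;;
    esac
}

# Read ldd output on stdin and print the libtesseract major version, if any.
tesseract_major_from_ldd() {
    sed -n 's/.*libtesseract\.so\.\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Report whether the tessdata directory matching the linked library
# contains any language files (hypothetical helper).
check_tessdata() {
    major=$(ldd "$1" 2>/dev/null | tesseract_major_from_ldd)
    dir=$(tessdata_dir_for "$major") || {
        echo "could not determine the Tesseract version linked by $1"
        return 2
    }
    if ls "$dir"/*.traineddata >/dev/null 2>&1; then
        echo "ok: language data found in $dir"
    else
        echo "mismatch: libtesseract.so.$major expects data in $dir"
        return 1
    fi
}
```

It could be called as check_tessdata "$(command -v dpscreenocr)". A "mismatch" result would be consistent with the situation described above.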

Long story short, the PPA now has a build for Lunar. I'm not sure if it will show up as a package upgrade since the package version is the same as in 22.04, so you may need to explicitly reinstall it, e.g.:

sudo apt update
sudo apt remove dpscreenocr
sudo apt install dpscreenocr

Had to re-add the PPA, but that did it! Thanks so much for your help, @danpla. It's a great tool and I appreciate your work on it.
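The PPA had to be re-added because the release upgrade commented it out (note the "disabled on upgrade" marker in the sources listing earlier in the thread). A small sketch for spotting such entries follows. The function name is made up for illustration, and the pattern assumes the upgrader leaves that exact marker text.

```shell
#!/bin/sh
# Sketch only: reads apt sources text on stdin and prints the "deb" lines
# that a release upgrade commented out. Relies on the
# "disabled on upgrade" marker seen in this thread.
disabled_on_upgrade() {
    grep -E '^[[:space:]]*#[[:space:]]*deb[[:space:]].*disabled on upgrade' || true
}
```

For example: cat /etc/apt/sources.list.d/*.list | disabled_on_upgrade. Judging by the URL in the listing, the PPA can probably be re-added with sudo add-apt-repository ppa:daniel.p/dpscreenocr, but that name is inferred from the URL rather than stated in this thread.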


You're welcome!