kermitt2 / grobid

A machine learning software for extracting information from scholarly documents

Home Page:https://grobid.readthedocs.io

Help with Mac M1 arm64 docker? (TensorFlow AVX)

kochbj opened this issue

Hi,

First, thanks for putting together such amazing software!

I'm trying to run the most recent Docker images on my M1 Max Mac, and the container exits with the error "TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine." I see the guidance in the documentation, but I am too inexperienced with Docker to know where I am supposed to make changes to the CPU configuration. I found an image that someone created that does run, https://hub.docker.com/r/kurdidev/grobid, but I would love to run the most recent version. Can you give me some basic orientation on what I need to change?
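For anyone debugging the same error: AVX is an x86 instruction set extension, so an arm64 CPU such as the M1 does not provide it natively. A minimal sketch for checking what a Linux container actually sees is below. The helper names `parse_cpu_flags` and `has_avx` are made up for illustration and are not part of Grobid or TensorFlow; the script assumes a Linux environment where `/proc/cpuinfo` exists.

```python
import platform


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels label the line "flags"; arm64 kernels use "Features".
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def has_avx(cpuinfo_text):
    """Return True if the 'avx' flag is listed in the given cpuinfo text."""
    return "avx" in parse_cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    # Architecture as reported inside the environment running this script.
    print("machine:", platform.machine())
    try:
        with open("/proc/cpuinfo") as handle:
            print("AVX available:", has_avx(handle.read()))
    except FileNotFoundError:
        print("/proc/cpuinfo not found (not a Linux environment)")
```

Running this inside the container (rather than on the macOS host) shows whether the image is being executed as x86 or arm64 and whether AVX is exposed to it.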

Thanks again,
Bernard

Hi @kochbj,
As far as I know, your error is related more to the Docker configuration than to the Grobid configuration. Have you tried installing TensorFlow following their official guide? For example: https://www.tensorflow.org/install/docker

The images in https://hub.docker.com/r/kurdidev/grobid do not use TensorFlow (https://grobid.readthedocs.io/en/latest/Grobid-docker/).
Some time ago, I built a lightweight Docker image for ARM. I am still not sure it works well when spawning new processes, but you can try it here: lfoppiano/grobid:0.8.0-arm

You can see additional information in the previous discussion here: #1014

Hi @lfoppiano, really appreciate the quick reply. So the best idea is to essentially install from source on a TensorFlow docker image? Also can you point me to your arm image? It doesn't seem to be on the hub anymore.

Thanks a bunch,
B

@kochbj I meant that you might want to first test that the plain TensorFlow image works and that you can access your GPU correctly, before trying to use Grobid.
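A rough sketch of such a sanity check, to be run inside the plain TensorFlow container, could look like the following. The function name `describe_tensorflow` is hypothetical; `tf.config.list_physical_devices` is the standard TensorFlow call for listing visible devices. Note that if the AVX problem is present, the process is expected to abort during the import itself, which Python cannot catch, so a clean report already indicates that TensorFlow loads.

```python
def describe_tensorflow():
    """Report whether TensorFlow imports and which devices it can see."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {"installed": False, "version": None, "devices": []}
    # Lists every physical device TensorFlow detects (CPU, GPU, ...).
    devices = [device.name for device in tf.config.list_physical_devices()]
    return {"installed": True, "version": tf.__version__, "devices": devices}


if __name__ == "__main__":
    print(describe_tensorflow())
```

If the `devices` list contains no GPU entry, TensorFlow is running on CPU only in that container.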

The image I refer to is here: https://hub.docker.com/layers/lfoppiano/grobid/0.8.0-arm/images/sha256-79b85da73bae5c2a483e381c1e1231bc73dc0d6b987f16b867a3eb6e8154d7b8?context=explore

Got it! @lfoppiano, one more question since it would be helpful to know (and maybe for the documentation): which extraction functions use neural models instead of CRF? This would inform how much work I put into figuring this out. :)

Thanks again,
B

Here is a list of the models for which deep learning is recommended over CRF, because they work substantially better.

In addition, Fulltext and segmentation (not 100% sure I remember the latter correctly) are not available as DL models.

Ahah perfect. When all else fails, RTFM. :) Thanks so much for your help on this. Closing the issue now!

For info, the segmentation model is available in DL too, but it is not better than CRF in its current state, so it is not recommended because it is a bit slower and more memory hungry.

@kochbj Did you manage to get a docker image up and running?

So sorry I missed this! I never got docker to run on my Mac M1. :( I did get GROBID up on a pop.os system with Nvidia 200 series cards, but the image seems to hang when loading models; not sure why. My last shot will be to try it with virtualization on a cluster with cards with a bit more memory.

@kochbj I have an M1 and docker works fine there. What problems do you have?
As for Grobid, there is an image, 0.8.0-arm, which should work, although it might not be very stable.
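A small sketch of how the run command for that image could be assembled is below. The tag `lfoppiano/grobid:0.8.0-arm` comes from this thread; the port `8070` and the `--rm --init` flags are assumptions based on the usual Grobid Docker invocation, and `build_run_command` is a hypothetical helper, so adjust as needed.

```python
import shlex

# Image tag mentioned earlier in this thread.
ARM_IMAGE = "lfoppiano/grobid:0.8.0-arm"


def build_run_command(image=ARM_IMAGE, host_port=8070, container_port=8070):
    """Build a `docker run` argument list for a Grobid image.

    Port 8070 is assumed to be the service port; change it if your
    configuration differs.
    """
    return [
        "docker", "run",
        "--rm",    # remove the container when it exits
        "--init",  # run an init process to reap child processes
        "-p", f"{host_port}:{container_port}",
        image,
    ]


if __name__ == "__main__":
    # Print the command so it can be copied into a terminal.
    print(shlex.join(build_run_command()))
```

The list form can also be passed directly to `subprocess.run` if you prefer to launch the container from Python.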

@lfoppiano Could you share the image URL?