opencv / opencv

Open Source Computer Vision Library

Home Page:https://opencv.org

aarch64: libgomp.so.1: cannot allocate memory in static TLS block

ric96 opened this issue · comments

Problem is caused by this optional flag: -DWITH_OPENMP=ON
Comment with workaround: #14884 (comment)
Comment with investigation: #14884 (comment)


Original description

I am compiling OpenCV from source in order to enable multi-core support on an aarch64 platform.

I am building OpenCV with the following config:

cmake -D CMAKE_BUILD_TYPE=RELEASE -DBUILD_SHARED_LIBS=OFF -DBUILD_EXAMPLES=OFF -DBUILD_opencv_apps=OFF -DBUILD_DOCS=OFF -DBUILD_PERF_TESTS=OFF -DBUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_PRECOMPILED_HEADERS=OFF -DWITH_LIBV4L=ON -DWITH_QT=ON -DWITH_OPENGL=ON -DFORCE_VTK=ON -DWITH_TBB=ON -DWITH_GDAL=ON -DWITH_XINE=ON -DWITH_OPENMP=ON -DWITH_GSTREAMER=ON -DWITH_OPENCL=ON -DOPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules ../

However, every time I import it in Python, I get the following error. Most of the results online seem vague and/or point towards a bug in gcc 4.2 that was fixed in 4.3, whereas I am currently on 8.x provided by the Debian repo and have also tried 9.0.1 built from upstream source:

root@linaro-developer:~# python3
Python 3.7.3 (default, Apr  3 2019, 05:39:12) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/cv2/__init__.py", line 89, in <module>
    bootstrap()
  File "/usr/local/lib/python3.7/dist-packages/cv2/__init__.py", line 79, in bootstrap
    import cv2
ImportError: /usr/lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block

This is on the OpenCV master branch as well as 4.0.1.

I can now confirm that OpenCV 3.2 built with OpenMP is working fine on the exact same platform.

I just wanted to point out that it is not necessary to enable TBB or OpenMP in order to enable multi-core support. The default pthreads backend can be good enough for most situations.

I have the same issue.
Hardware: Jetson Xavier
IDE: Spyder 3.2.6
Python 3.6.8
It's strange that this happens only in Spyder 3.2.6; when I import cv2 in the console, it works well.

@ric96 Sahaj, did @thisch's suggestion work for you?

I had a similar issue where I could import cv2 in the console but not in a script. I was importing tensorflow before cv2. For some reason switching the order (importing cv2 first) worked. Not sure why - but if it helps someone else 👍

@matallan

I had the same problem on Nvidia Jetson Nano device.
Switching the import order has worked for me too.
Seems to be a problem with the embedded device itself. Perhaps because I'm not running it on the 10W power mode.

Switching the import order (importing cv2 before tensorflow) works for me. Thanks.

No idea why, but importing cv2 first fixed it for me too.

Have you tried export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1 ?

Importing cv2 before tensorflow is working for me. Thank you.

Most likely cause here

That is a bug in whatever dlopened libgomp.so.1 and before doing that ate all the preallocated TLS area.
The GNU TLS2 model which I'm afraid aarch64 uses unfortunately eats from the same TLS preallocated pool as libraries that require static TLS like libgomp, where it is performance critical to have it as static TLS.
Either don't dlopen libgomp, or LD_PRELOAD it, link it with the application that dlopens it, cut down the uses of other TLS or dlopen libgomp earlier.
There is nothing that can be done on the gcc side.
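To see whether something has already loaded libgomp into the running Python process before cv2 is imported, one option on Linux is to read /proc/self/maps. This is only a diagnostic sketch; the function names below are made up for illustration and are not part of OpenCV or libgomp.

```python
def loaded_libraries(maps_text, needle="libgomp"):
    """Return sorted, de-duplicated paths from a /proc/<pid>/maps dump that contain `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)


def libraries_in_this_process(needle="libgomp"):
    """Linux only: list mapped files of the current process whose path contains `needle`."""
    with open("/proc/self/maps") as f:
        return loaded_libraries(f.read(), needle)
```

Calling libraries_in_this_process() right before the failing import shows which copy of libgomp, if any, is already mapped. An empty list means the library will be loaded late, which is the situation described above.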

Three possible solutions:

  1. Disable the use of OpenMP during installation of OpenCV. Place the arguments -DBUILD_OPENMP=OFF -DWITH_OPENMP=OFF in your CMAKE command line. OpenCV will now use pthread. Performance losses are hardly noticeable. (Of course, only possible if you install OpenCV from scratch and not by pip or sudo apt install)

  2. Use TBB as the parallel framework. sudo apt-get install libtbb2 libtbb-dev installs the needed libraries.
    Add -DBUILD_TBB=ON -DWITH_TBB=ON to your CMAKE command. TBB overrules OpenMP.
    (Again, only possible if you install OpenCV from scratch)

  3. As stated above, make sure another application uses libgomp before OpenCV. This can be the import of TensorFlow, or the LD_PRELOAD construction.
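After rebuilding, it can help to confirm which parallel backend the build actually picked up. OpenCV's build information contains a "Parallel framework:" line, and the sketch below extracts it from that text. The helper name parallel_framework is hypothetical.

```python
import re


def parallel_framework(build_info):
    """Return the text after 'Parallel framework:' in OpenCV build information, or None."""
    match = re.search(
        r"^[ \t]*Parallel framework:[ \t]*(.+?)[ \t]*$",
        build_info,
        re.MULTILINE,
    )
    return match.group(1) if match else None
```

Assuming import cv2 succeeds, parallel_framework(cv2.getBuildInformation()) returns the backend string. If it still mentions OpenMP, the CMake flags did not take effect for the build being imported.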

Hello,

None of the solutions above work for me. In Jupyter I get the same error while importing cv2, although it works in the Python shell.

I tried importing tensorflow first and tried preloading the library.

Do you have another solution, please? Thanks

I have the same issue as you: export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1 doesn't work in Jupyter, which is weird.

Hi @AlexandreBourrieau @yipenglinoe,
If you are using Jupyter Lab, you may want to add the following in your jupyter_lab_config.py.

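import os
c = get_config()
# Kernels started by the server inherit this process's environment.
os.environ['LD_PRELOAD'] = '/usr/lib/aarch64-linux-gnu/libgomp.so.1'
# Kept as posted; Spawner is a JupyterHub class, so this line may have no effect under plain Jupyter Lab.
c.Spawner.env.update('LD_PRELOAD')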

@chitoku For some reason, this is still a problem, and this was the only solution that helped me. I did not realize it was a jupyterlab problem until I tried this answer. Thanks, this helped my issue!

Regarding solution 3 above (make sure something else loads libgomp before OpenCV, either by importing TensorFlow or with the LD_PRELOAD construction), in my case it is:

 export LD_PRELOAD=/home/f/.local/lib/python3.6/site-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0

I found the full path from the error message, like this:

ImportError: /home/f/.local/lib/python3.6/site-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block

Ran into this recently, and since I really don't like the messy "change import order" approach, nor do I want to rely on preloads in my environment, I used a more explicit approach:

import ctypes
import platform

if platform.system() == "Linux":
    # Force libgomp to be loaded before other libraries consuming dynamic TLS (to avoid running out of STATIC_TLS)
    ctypes.cdll.LoadLibrary("libgomp.so.1")

I experienced this issue, and tried the above solutions. However on closer inspection, I saw that the library causing the issue was using its own bundled version of libgomp, not the system one.

I.e.:

Error message(s): ['/usr/local/lib/python3.6/dist-packages/xgboost/lib/../../xgboost.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block']

Therefore when running the export LD_PRELOAD solution, I had to set the path to that bundled library, not my system one.
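Since the path to preload depends on which copy of libgomp fails, it can be pulled from the error message itself. The sketch below does that with a regular expression. The helper name is hypothetical, and it assumes the path contains no spaces and that collapsing the ".." components with os.path.normpath is safe (no symlinked directories in between).

```python
import os
import re

# Matches "<absolute path>: cannot allocate memory in static TLS block"
_TLS_ERROR = re.compile(r"(?P<path>/\S+?): cannot allocate memory in static TLS block")


def library_from_tls_error(message):
    """Return the normalized library path named in a static-TLS error message, or None."""
    match = _TLS_ERROR.search(message)
    return os.path.normpath(match.group("path")) if match else None
```

The returned path is the one to use in export LD_PRELOAD=... or to pass to ctypes.cdll.LoadLibrary before the failing import.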

Importing a scikit-learn related library before any other libraries such as OpenCV fixed the issue for me.
For example, albumentations includes the scikit-learn imports, so I imported it first, before importing OpenCV.

from albumentations import Resize, Compose
from albumentations.pytorch.transforms import  ToTensor
from albumentations.augmentations.transforms import Normalize
import cv2

I have experienced this problem on systems that have OpenMP as a prerequisite that are built with cuda/NVPTX extensions such as LLVM.

Importing cv2 first, then scikit-learn, results in:
ImportError: /opt/venv/lib/python3.10/site-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0
Importing scikit-learn first, then cv2, results in:
ImportError: /lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block.

Since these two libraries are interchangeable, this issue can be resolved by replacing the bundled copy with a symbolic link to the system one:

cd /lib/pythonXXX/site-packages/scikit-learn.libs/
mv libgomp-d22c30c5.so.1.0.0 libgomp-d22c30c5.so.1.0.0_bak
ln -s /usr/lib/-linux-gnu/libgomp.so.1 libgomp-d22c30c5.so.1.0.0

fwiw, I tried solutions 1 and 2 from the list above (disabling OpenMP, using TBB, or both), but I still get:

    import cv2
E   ImportError: /lib/aarch64-linux-gnu/libGLdispatch.so.0: cannot allocate memory in static TLS block

The build information does report Parallel framework: TBB (ver 2020.2 interface 11102), though.

This is with Jetson NX JP5.1 / Ubuntu 20.04 / Python 3.8 / OpenCV 4.7.0, for information.

Doing:

ctypes.cdll.LoadLibrary(_GL_DISPATCH_PATH)

before importing cv2 (or in its __init__.py or a related file, since I also build the wheel I'm using from the same custom build as the OpenCV shared libraries) does work around the issue, though.
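The ctypes approach used in this thread can be generalized to several libraries at once. The following is a minimal sketch, not a definitive fix: the candidate paths are the ones that appear in the error messages above and will differ on other systems, and the preload helper is a hypothetical name.

```python
import ctypes
import os

# Paths taken from the error messages in this thread; adjust them for your system.
CANDIDATES = [
    "/usr/lib/aarch64-linux-gnu/libgomp.so.1",
    "/lib/aarch64-linux-gnu/libGLdispatch.so.0",
]


def preload(paths=CANDIDATES):
    """Load each existing library early; return the paths that were actually loaded."""
    loaded = []
    for path in paths:
        if not os.path.exists(path):
            continue  # not present on this machine, skip it
        try:
            ctypes.CDLL(path)
        except OSError:
            continue  # present but not loadable (e.g. wrong architecture)
        loaded.append(path)
    return loaded
```

Call preload() at the very top of the entry script, before import cv2 and before any other import that pulls in large native libraries, so that the static TLS space is still available when these libraries are loaded.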