aarch64: libgomp.so.1: cannot allocate memory in static TLS block

Question

ric96 opened this issue 5 years ago · comments

Problem is caused by this optional flag: -DWITH_OPENMP=ON
Comment with workaround: #14884 (comment)
Comment with investigation: #14884 (comment)

Original description

I am compiling OpenCV from source in order to enable multi-core support on an aarch64 platform.

I am building OpenCV with the following configuration:

cmake -D CMAKE_BUILD_TYPE=RELEASE -DBUILD_SHARED_LIBS=OFF -DBUILD_EXAMPLES=OFF -DBUILD_opencv_apps=OFF -DBUILD_DOCS=OFF -DBUILD_PERF_TESTS=OFF -DBUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_PRECOMPILED_HEADERS=OFF -DWITH_LIBV4L=ON -DWITH_QT=ON -DWITH_OPENGL=ON -DFORCE_VTK=ON -DWITH_TBB=ON -DWITH_GDAL=ON -DWITH_XINE=ON -DWITH_OPENMP=ON -DWITH_GSTREAMER=ON -DWITH_OPENCL=ON -DOPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules ../

But every time I import it in Python, I get the following error. Most of the results online are vague and/or point towards a bug in GCC 4.2 that was fixed in 4.3, whereas I am currently on 8.x from the Debian repository and have also tried 9.0.1 built from upstream source:

root@linaro-developer:~# python3
Python 3.7.3 (default, Apr  3 2019, 05:39:12) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/cv2/__init__.py", line 89, in <module>
    bootstrap()
  File "/usr/local/lib/python3.7/dist-packages/cv2/__init__.py", line 79, in bootstrap
    import cv2
ImportError: /usr/lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block

This happens on the OpenCV master branch as well as on 4.0.1.

Sahaj Sarup · Answer 1 · Sun Jul 07 2019 11:03:12 GMT+0800 (China Standard Time)

I can now confirm that OpenCV 3.2 built with OpenMP works fine on the exact same platform.

Maksim Shabunin · Answer 2 · Tue Jul 09 2019 05:24:52 GMT+0800 (China Standard Time)

I just wanted to point out that it is not necessary to enable TBB or OpenMP in order to get multi-core support. The default pthreads backend can be good enough for most situations.

jmliu1983 · Answer 3 · Mon Aug 12 2019 10:22:26 GMT+0800 (China Standard Time)

I have the same issue.
Hardware: Jetson Xavier
IDE: Spyder 3.2.6
Python 3.6.8
It's strange that this happens only in Spyder 3.2.6; when I import cv2 in the console, it works well.

Thomas Wimmer · Answer 4 · Mon Sep 16 2019 20:44:47 GMT+0800 (China Standard Time)

See pytorch/pytorch#2575

Ivan Farkas · Answer 5 · Sun Oct 20 2019 04:40:44 GMT+0800 (China Standard Time)

@ric96 Sahaj, did @thisch's suggestion work for you?

matallan · Answer 6 · Tue Mar 17 2020 10:58:56 GMT+0800 (China Standard Time)

I had a similar issue where I could import cv2 in the console but not in a script. I was importing tensorflow before cv2. For some reason switching the order (importing cv2 first) worked. Not sure why, but maybe it helps someone else 👍

Ahmed Hisham · Answer 7 · Wed Mar 18 2020 00:01:10 GMT+0800 (China Standard Time)

I had a similar issue where I could import cv2 in the console but not in a script. I was importing tensorflow before cv2. For some reason switching the order (importing cv2 first) worked. Not sure why, but maybe it helps someone else 👍

@matallan

I had the same problem on an Nvidia Jetson Nano device.
Switching the import order worked for me too.
It seems to be a problem with the embedded device itself, perhaps because I'm not running it in the 10W power mode.

Talha Çelik · Answer 8 · Mon Apr 20 2020 20:55:23 GMT+0800 (China Standard Time)

I had a similar issue where I could import cv2 in the console but not in a script. I was importing tensorflow before cv2. For some reason switching the order (importing cv2 first) worked. Not sure why, but maybe it helps someone else 👍

It works for me. Thanks.

Kroj-0 · Answer 9 · Wed May 13 2020 21:27:19 GMT+0800 (China Standard Time)

I had a similar issue where I could import cv2 in the console but not in a script. I was importing tensorflow before cv2. For some reason switching the order (importing cv2 first) worked. Not sure why, but maybe it helps someone else 👍

No idea why, but it fixed it for me too.

wangrui · Answer 10 · Wed Aug 26 2020 17:52:27 GMT+0800 (China Standard Time)

keras-team/keras-tuner#317

PetrDvoracek · Answer 11 · Sun Oct 11 2020 23:55:39 GMT+0800 (China Standard Time)

Have you tried export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1?

Emre Çetin · Answer 12 · Wed Feb 24 2021 14:58:43 GMT+0800 (China Standard Time)

I had a similar issue where I could import cv2 in the console but not in a script. I was importing tensorflow before cv2. For some reason switching the order (importing cv2 first) worked. Not sure why, but maybe it helps someone else

It works for me. Thank you.

Q-engineering · Answer 13 · Thu Apr 08 2021 18:05:24 GMT+0800 (China Standard Time)

Most likely cause here

That is a bug in whatever dlopened libgomp.so.1 and, before doing that, ate all of the preallocated TLS area.
The GNU2 TLS model (TLS descriptors), which I'm afraid aarch64 uses, unfortunately eats from the same preallocated TLS pool as libraries that require static TLS, such as libgomp, where having static TLS is performance critical.
Either don't dlopen libgomp, or LD_PRELOAD it, link it with the application that dlopens it, cut down on other TLS use, or dlopen libgomp earlier.
There is nothing that can be done on the gcc side.

Three possible solutions:

1. Disable the use of OpenMP when building OpenCV. Add the arguments -DBUILD_OPENMP=OFF -DWITH_OPENMP=OFF to your cmake command line. OpenCV will then use pthreads. Performance losses are hardly noticeable. (Of course, this is only possible if you build OpenCV from source rather than installing it with pip or sudo apt install.)
2. Use TBB as the parallel framework. sudo apt-get install libtbb2 libtbb-dev installs the needed libraries. Add -DBUILD_TBB=ON -DWITH_TBB=ON to your cmake command. TBB overrules OpenMP. (Again, only possible if you build OpenCV from source.)
3. As stated above, make sure something else loads libgomp before OpenCV does. This can be the import of TensorFlow, or the LD_PRELOAD construction.
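
After a rebuild with option 1 or 2, it can help to confirm which backend was actually compiled in. The following is a minimal sketch, assuming the text returned by cv2.getBuildInformation() contains a line of the form "Parallel framework: ..."; the helper name parallel_framework is hypothetical and not part of OpenCV.

```python
import re


def parallel_framework(build_info):
    """Return the value of the 'Parallel framework' line from an OpenCV
    build summary, or None if no such line is present."""
    match = re.search(
        r"^[ \t]*Parallel framework:[ \t]*(.+?)[ \t]*$",
        build_info,
        re.MULTILINE,
    )
    return match.group(1) if match else None
```

Once import cv2 succeeds, calling parallel_framework(cv2.getBuildInformation()) should report something like pthreads, TBB, or OpenMP, depending on the build.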

AlexandreBourrieau · Answer 14 · Sun Aug 29 2021 22:39:53 GMT+0800 (China Standard Time)

Hello,

No solution above works for me... In Jupyter I get the same error when importing cv2, while it works in the Python shell.

I tried importing tensorflow first and tried preloading the library...

Do you have another solution please ?? Thanks

yipenglinoe · Answer 15 · Tue Sep 14 2021 19:34:09 GMT+0800 (China Standard Time)

Hello,

No solution above works for me... In Jupyter I get the same error when importing cv2, while it works in the Python shell.

I tried importing tensorflow first and tried preloading the library...

Do you have another solution please ?? Thanks

I have the same issue as you: export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1 doesn't work in Jupyter, which is weird.

chitoku · Answer 16 · Fri Sep 17 2021 14:16:29 GMT+0800 (China Standard Time)

Hi @AlexandreBourrieau @yipenglinoe,
If you are using Jupyter Lab, you may want to add the following in your jupyter_lab_config.py.

import os
c = get_config()
# Set LD_PRELOAD in the Jupyter server's environment before any kernel is started
os.environ['LD_PRELOAD'] = '/usr/lib/aarch64-linux-gnu/libgomp.so.1'
c.Spawner.env.update('LD_PRELOAD')
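
To see whether the setting actually reached a kernel, one option is to check from inside a notebook cell. This is only a sketch and assumes a Linux system where /proc/self/maps is readable; the helper names mapped_libraries and report are hypothetical.

```python
import os


def mapped_libraries(name, maps_text):
    """Return the sorted, de-duplicated file paths from /proc/<pid>/maps
    text whose file name contains `name`."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address, perms, offset, dev, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no file behind it
        path = fields[5].strip()
        if name in os.path.basename(path):
            paths.add(path)
    return sorted(paths)


def report(name="libgomp"):
    """Print LD_PRELOAD as seen by this process and any mapped matches."""
    print("LD_PRELOAD =", os.environ.get("LD_PRELOAD"))
    with open("/proc/self/maps") as handle:
        print(mapped_libraries(name, handle.read()))
```

Running report() as the first cell, before importing anything else, shows whether LD_PRELOAD is visible to the kernel process and whether a libgomp file is already mapped.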

Luka Trikha · Answer 17 · Mon Dec 06 2021 09:32:30 GMT+0800 (China Standard Time)

@chitoku For some reason, this is still a problem, and this was the only solution that helped me. I did not realize it was a jupyterlab problem until I tried this answer. Thanks, this helped my issue!

fei4xu · Answer 18 · Sat Apr 30 2022 23:09:43 GMT+0800 (China Standard Time)

As stated above, make sure another application uses libgomp before OpenCV. This can be the import of TensorFlow, or the LD_PRELOAD construction.

In my case, it is:

 export LD_PRELOAD=/home/f/.local/lib/python3.6/site-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0

I found the full path from the error message, which looks like this:

ImportError: /home/f/.local/lib/python3.6/site-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block
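
Pulling the path out of the message can also be done programmatically. The sketch below assumes the message has the shape shown above (a path without spaces, followed by ": cannot allocate memory in static TLS block"); the function name library_from_tls_error is hypothetical.

```python
import re

# Matches "<path>: cannot allocate memory in static TLS block"
_TLS_ERROR = re.compile(
    r"(?P<path>/\S+?): cannot allocate memory in static TLS block"
)


def library_from_tls_error(message):
    """Return the shared-library path named in a static TLS error
    message, or None if the message does not have that shape."""
    match = _TLS_ERROR.search(message)
    return match.group("path") if match else None
```

The intended use is to wrap the failing import in try/except ImportError and pass str(exc) to the function, then use the returned path for LD_PRELOAD.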

Heather Lapointe · Answer 19 · Mon Jun 06 2022 04:42:15 GMT+0800 (China Standard Time)

Ran into this recently, and since I really don't like the messy "change import order" approach, nor do I want to rely on preloads in my environment, I used a more explicit approach:

import ctypes
import platform

if platform.system() == "Linux":
    # Force libgomp to be loaded before other libraries consuming dynamic TLS (to avoid running out of STATIC_TLS)
    ctypes.cdll.LoadLibrary("libgomp.so.1")
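
A slightly more defensive variant of the same idea, as a sketch only: it tries several candidate names or paths and skips the ones that cannot be loaded, which may be useful when the library lives in a different place on different machines. The function name preload_first_available and the candidate list are assumptions for illustration.

```python
import ctypes
import ctypes.util


def preload_first_available(candidates):
    """Try to load each candidate shared library in order.

    Returns the first name or path that loads, or None if none do.
    Empty entries (for example a failed find_library lookup) are skipped.
    """
    for name in candidates:
        if not name:
            continue
        try:
            ctypes.CDLL(name)
        except OSError:
            continue  # not found or not loadable, try the next one
        return name
    return None
```

It would be called before import cv2, for example as preload_first_available(["libgomp.so.1", ctypes.util.find_library("gomp")]), with the bundled library path from the error message added to the list if needed.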

Perry Gibson · Answer 20 · Wed Jul 06 2022 20:52:02 GMT+0800 (China Standard Time)

I experienced this issue, and tried the above solutions. However on closer inspection, I saw that the library causing the issue was using its own bundled version of libgomp, not the system one.

I.e.:

Error message(s): ['/usr/local/lib/python3.6/dist-packages/xgboost/lib/../../xgboost.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block']

Therefore when running the export LD_PRELOAD solution, I had to set the path to that bundled library, not my system one.
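
If exporting the variable in the shell is inconvenient, the same effect can be had by launching the script from a small wrapper. This is a sketch under assumptions: the helper name with_preload is hypothetical, and it relies on the dynamic linker accepting a colon-separated (or space-separated) LD_PRELOAD list.

```python
import os


def with_preload(lib_path, environ=None):
    """Return a copy of the environment with lib_path placed first in
    LD_PRELOAD, keeping any other entries that were already there."""
    env = dict(os.environ if environ is None else environ)
    current = env.get("LD_PRELOAD", "").replace(":", " ").split()
    others = [entry for entry in current if entry != lib_path]
    env["LD_PRELOAD"] = ":".join([lib_path] + others)
    return env
```

The result can be passed as the env argument of subprocess.run together with sys.executable and the script to run, using the bundled library path taken from the error message.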

Adarsh Ghimire · Answer 21 · Mon Dec 05 2022 19:16:08 GMT+0800 (China Standard Time)

Importing a scikit-learn related library before any other library such as OpenCV fixed the issue for me.
For example, albumentations pulls in the scikit-learn imports, so I imported it before importing OpenCV.

from albumentations import Resize, Compose
from albumentations.pytorch.transforms import  ToTensor
from albumentations.augmentations.transforms import Normalize
import cv2

Kevin Eales · Answer 22 · Thu Feb 02 2023 00:14:51 GMT+0800 (China Standard Time)

I have experienced this problem on systems that have OpenMP as a prerequisite that are built with cuda/NVPTX extensions such as LLVM.

importing cv2 first then scikit-learn results in:
ImportError: /opt/venv/lib/python3.10/site-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0
importing scikit-learn then cv2 results in:
ImportError: /lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block.

Since these two libraries are interchangeable, this issue can be resolved by using a symbolic link:

cd /lib/pythonXXX/site-packages/scikit-learn.libs/
mv libgomp-d22c30c5.so.1.0.0 libgomp-d22c30c5.so.1.0.0_bak
ln -s /usr/lib/-linux-gnu/libgomp.so.1 libgomp-d22c30c5.so.1.0.0

Grégory Starck · Answer 23 · Sat May 06 2023 22:19:34 GMT+0800 (China Standard Time)

1. Disable the use of OpenMP during installation of OpenCV. Place the arguments -DBUILD_OPENMP=OFF -DWITH_OPENMP=OFF in your CMAKE command line. OpenCV will now use pthread. Performance losses are hardly noticeable. (Of course, only possible if you install OpenCV from scratch and not by pip or sudo apt install)

2. Use TBB as parallel framework. sudo apt-get install libtbb2 libtbb-dev loads the needed libraries.
-DBUILD_TBB=ON -DWITH_TBB=ON in your CMAKE command. TBB overrules OpenMP.
(Again, only possible if you install OpenCV from scratch)

FWIW, I tried these solutions (1, 2, and 1+2), but I still get:

    import cv2
E   ImportError: /lib/aarch64-linux-gnu/libGLdispatch.so.0: cannot allocate memory in static TLS block

The build information does report Parallel framework: TBB (ver 2020.2 interface 11102), though.

For reference, this is on a Jetson NX with JP5.1 / Ubuntu 20.04 / Python 3.8 / OpenCV 4.7.0.

Doing:

ctypes.cdll.LoadLibrary(_GL_DISPATCH_PATH)

before importing cv2 still works around the issue correctly. (It can also go into cv2's __init__.py or a related file, since I build the wheel I'm using myself, from the same custom build as the OpenCV shared libraries.)