microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Home Page: https://onnxruntime.ai

Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed

zyl112334 opened this issue · comments

commented

Hi, I use onnxruntime for inference, but the program errors out. How can I solve this problem? Thanks!

System information:
Linux Ubuntu 16.04
Python 3.6.5
onnxruntime 1.8.0
CPU only (4 cores); ONNX Runtime installed from pip.

File "/home/admin/qiyun/target/qiyun/tools/infer/utility.py", line 104, in create_predictor
sess = ort.InferenceSession(model_file_path)
File "/home/admin/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in init
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/home/admin/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 310, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
RuntimeError: /onnxruntime_src/onnxruntime/core/platform/posix/env.cc:142 onnxruntime::{anonymous}::PosixThread::PosixThread(const char*, int, unsigned int ()(int, Eigen::ThreadPoolInterface),Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed

I believe you are using cpuset at the same time?

commented

I solved this problem by setting "options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1" (the default value for those params is 0). How should I understand this behavior?
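Spelled out, that workaround looks like the following minimal sketch ("model.onnx" is a placeholder path):

```python
import onnxruntime as ort

# Workaround from this thread: cap both thread pools at one thread so
# ORT does not spawn worker threads and try to pin them to logical CPUs
# that the cpuset does not allow. "model.onnx" is a placeholder path.
options = ort.SessionOptions()
options.intra_op_num_threads = 1  # threads used inside a single operator
options.inter_op_num_threads = 1  # threads used to run operators in parallel
sess = ort.InferenceSession("model.onnx", sess_options=options)
```

The default value of 0 lets ORT size both pools from the logical CPUs it detects, which is what triggers the per-thread affinity binding discussed further down this thread.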

I met the same error. Setting "options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1" fixes it, but the inference speed is slow. How can I still run inference on the CPU in a GPU environment?

What if you set intra_op_num_threads to the number of your CPU cores?

It is slower if I set intra_op_num_threads to the number of my CPU cores. So how can I run inference using only the CPU in a GPU environment? Thanks!

How can I run inference using only the CPU in a GPU environment?

You can use the CPU-only package, https://pypi.org/project/onnxruntime/, instead of https://pypi.org/project/onnxruntime-gpu/.
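Alternatively, if onnxruntime-gpu must stay installed, a session can be pinned to the CPU execution provider. A minimal sketch, assuming a version that accepts the providers argument:

```python
import onnxruntime as ort

# Even with onnxruntime-gpu installed, a session can be restricted to
# the CPU by listing only the CPU execution provider.
sess = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["CPUExecutionProvider"],
)
```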

Hi, I also met the same problem, and I want to use the GPU for the ONNX inference. I tried 'options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1', but the error became a 'segmentation fault'. I wonder, are there any other solutions to this problem?

my environment:
python 3.6.13
onnx 1.10.2
onnxruntime-gpu 1.10.0
torch 1.10.2
torchaudio 0.10.2
torchvision 0.11.3
OS x86_64 GNU/Linux
GCC version Ubuntu 9.3.0
CUDA 11.4
GPU type A100
Driver Version 470.82.01

@snnn just to provide more context to @poem2018's comment: our onnxruntime-gpu installation on a shared DGX-A100 machine (8x GPUs, 2x AMD CPUs per node) works totally fine when an entire dedicated node is used.

We encounter segfaults / core dumps / the above exception when it is run on a shared node allocation, where each user is given a dedicated single GPU on the node and shares a fraction of the cores with another user, controlled via cpusets that lock user sessions to GPU-affine cores, e.g.

cat /sys/fs/cgroup/cpuset/single-gpu/gpu0/cpuset.cpus
48-63,176-191

Within that cpuset, you have to share cycles with another user on the paired GPU if it is in use; cgroup fair scheduling is used for that.

I don't believe we had issues with earlier versions of ORT using cpuset, but I would need to recheck. And as @poem2018 indicated, setting the number of threads to 1 does not avoid the issue, so it is not clear whether #10122 would fix this.
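As a quick way to see the mismatch ORT runs into, one can compare the node-wide CPU count with the set the cpuset actually allows (Linux-only sketch; os.sched_getaffinity is not available on every platform):

```python
import os

# os.cpu_count() reports the node's logical CPUs, while
# os.sched_getaffinity(0) reports the CPUs this process is actually
# allowed to run on, which reflects cpuset/cgroup restrictions.
print("logical CPUs on the node:", os.cpu_count())
print("CPUs allowed by the cpuset:", sorted(os.sched_getaffinity(0)))

# On the shared node above, the second line would show only the 32
# CPUs in 48-63,176-191 rather than every CPU on the DGX.
```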

#10113 (comment): is there a way to bind a specific core affinity?

By default, ONNX Runtime tries to bind each thread to a logical CPU if the user didn't explicitly set intra_op_num_threads. As you can see, it is causing problems, so I'd prefer not to do the binding. If you need to set up thread affinity through the ONNX Runtime API, we can design one and add it to onnxruntime_c_api.h. ONNX Runtime is an open source project; if you already have a design in mind, feel free to let us know.

Any progress? I had the same problem with the 1.10.1 CPU version.

@snnn

Suppose we set intra_op_num_threads to a specific integer or to cpu_count(logical=True).

Then we create an image from our project (with ONNX) and set up a container. If we constrain the CPU cores for the container, what happens when that number is fewer than the intra_op_num_threads we set?
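One way to keep the two numbers in sync is to derive the thread count from the CPUs the container actually grants rather than from the host-wide count. A minimal Linux-only sketch ("model.onnx" is a placeholder path):

```python
import os
import onnxruntime as ort

# Size the intra-op pool from the CPUs the container actually grants:
# os.sched_getaffinity reflects the cgroup cpuset, whereas
# cpu_count(logical=True) sees the host's cores. Linux-only sketch;
# "model.onnx" is a placeholder path.
allowed_cores = len(os.sched_getaffinity(0))

options = ort.SessionOptions()
options.intra_op_num_threads = allowed_cores
sess = ort.InferenceSession("model.onnx", sess_options=options)
```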

I am using NVIDIA Triton with the onnxruntime backend. When I run Triton as a k8s deployment, I hit the same pthread_setaffinity_np failed problem. Because Triton is already compiled and does not provide a method to set intra_op_num_threads, I wonder if there is any environment variable for ONNX Runtime that specifies intra_op_num_threads?