microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Home Page: https://onnxruntime.ai

Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed

zyl112334 opened this issue · comments

commented

Hi, I use onnxruntime for inference, but the program errors out. How can I solve this problem? Thanks!

System information:
Linux Ubuntu 16.04
Python 3.6.5
onnxruntime 1.8.0
CPU only (4 cores); ONNX Runtime installed from pip.

File "/home/admin/qiyun/target/qiyun/tools/infer/utility.py", line 104, in create_predictor
sess = ort.InferenceSession(model_file_path)
File "/home/admin/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in init
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/home/admin/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 310, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
RuntimeError: /onnxruntime_src/onnxruntime/core/platform/posix/env.cc:142 onnxruntime::{anonymous}::PosixThread::PosixThread(const char*, int, unsigned int ()(int, Eigen::ThreadPoolInterface),Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed

I believe you are using cpuset at the same time?

commented

I solved this problem by setting "options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1" (the default value for those params is 0). How should I understand this behavior?
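Spelled out, that workaround looks like the following minimal sketch ("model.onnx" is a placeholder path):

```python
import onnxruntime as ort

# Workaround from this thread: cap both thread pools at one thread so
# ORT does not spawn worker threads and try to pin them to logical CPUs
# that the cpuset does not allow. "model.onnx" is a placeholder path.
options = ort.SessionOptions()
options.intra_op_num_threads = 1  # threads used inside a single operator
options.inter_op_num_threads = 1  # threads used to run operators in parallel
sess = ort.InferenceSession("model.onnx", sess_options=options)
```

The default value of 0 lets ORT size both pools from the logical CPUs it detects, which is what triggers the per-thread affinity binding discussed further down this thread.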

I met the same error. Setting "options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1" fixes it, but the inference speed is slow. How can I still run inference on the CPU in a GPU environment?

What if you set intra_op_num_threads to the number of your CPU cores?

It is slower if I set intra_op_num_threads to the number of my CPU cores. So how can I run inference using only the CPU in a GPU environment? Thanks!

How can I run inference using only the CPU in a GPU environment?

You can use the CPU-only package, https://pypi.org/project/onnxruntime/, instead of https://pypi.org/project/onnxruntime-gpu/.
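Alternatively, if onnxruntime-gpu must stay installed, a session can be pinned to the CPU execution provider. A minimal sketch, assuming a version that accepts the providers argument:

```python
import onnxruntime as ort

# Even with onnxruntime-gpu installed, a session can be restricted to
# the CPU by listing only the CPU execution provider.
sess = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["CPUExecutionProvider"],
)
```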

Hi, I also met the same problem, and I want to use the GPU for the ONNX inference. I tried 'options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1', but the error became a 'segmentation fault'. I wonder, are there any other solutions to this problem?

my environment:
python 3.6.13
onnx 1.10.2
onnxruntime-gpu 1.10.0
torch 1.10.2
torchaudio 0.10.2
torchvision 0.11.3
OS x86_64 GNU/Linux
GCC version Ubuntu 9.3.0
CUDA 11.4
GPU type A100
Driver Version 470.82.01

@snnn just to provide more context to @poem2018's comment: our onnxruntime-gpu installation on a shared DGX-A100 machine (8x GPUs, 2x AMD CPUs per node) works totally fine when an entire dedicated node is used.

We encounter segfaults / core dumps / the above exception when it is run on a shared node allocation, where each user is given a dedicated single GPU on the node and shares a fraction of the cores with another user, controlled via cpusets that lock user sessions to GPU-affine cores, e.g.

cat /sys/fs/cgroup/cpuset/single-gpu/gpu0/cpuset.cpus
48-63,176-191

Within that cpuset, you have to share cycles with another user on the paired GPU if it is in use; cgroup fair scheduling is used for that.

I don't believe we had issues with earlier versions of ORT using cpuset, but I would need to recheck. And as @poem2018 indicated, setting the number of threads to 1 does not avoid the issue, so it is not clear whether #10122 would fix this.
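As a quick way to see the mismatch ORT runs into, one can compare the node-wide CPU count with the set the cpuset actually allows (Linux-only sketch; os.sched_getaffinity is not available on every platform):

```python
import os

# os.cpu_count() reports the node's logical CPUs, while
# os.sched_getaffinity(0) reports the CPUs this process is actually
# allowed to run on, which reflects cpuset/cgroup restrictions.
print("logical CPUs on the node:", os.cpu_count())
print("CPUs allowed by the cpuset:", sorted(os.sched_getaffinity(0)))

# On the shared node above, the second line would show only the 32
# CPUs in 48-63,176-191 rather than every CPU on the DGX.
```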

#10113 (comment): is there a way to bind a specific core affinity?

By default, ONNX Runtime tries to bind each thread to a logical CPU if the user didn't explicitly set intra_op_num_threads. As you can see, it is causing problems, so I'd prefer not to do the binding. If you need to set up thread affinity through the ONNX Runtime API, we can design one and add it to onnxruntime_c_api.h. ONNX Runtime is an open source project; if you already have a design in mind, feel free to let us know.

Any progress? I had the same problem with the 1.10.1 CPU version.

@snnn

Suppose we set intra_op_num_threads to a specific integer or to cpu_count(logical=True).

Then we create an image from our project (with ONNX) and set up a container. If we constrain the CPU cores for the container, what happens when that number is fewer than the intra_op_num_threads we set?
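One way to keep the two numbers in sync is to derive the thread count from the CPUs the container actually grants rather than from the host-wide count. A minimal Linux-only sketch ("model.onnx" is a placeholder path):

```python
import os
import onnxruntime as ort

# Size the intra-op pool from the CPUs the container actually grants:
# os.sched_getaffinity reflects the cgroup cpuset, whereas
# cpu_count(logical=True) sees the host's cores. Linux-only sketch;
# "model.onnx" is a placeholder path.
allowed_cores = len(os.sched_getaffinity(0))

options = ort.SessionOptions()
options.intra_op_num_threads = allowed_cores
sess = ort.InferenceSession("model.onnx", sess_options=options)
```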

I am using NVIDIA Triton with the onnxruntime backend. When I run Triton as a k8s deployment, I hit the same pthread_setaffinity_np failed problem. Because Triton is already compiled and does not provide a method to set intra_op_num_threads, I wonder if there is any environment variable for ONNX Runtime that specifies intra_op_num_threads?