google / XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web


Does XNNPACK support concurrent runs?

snnn opened this issue · comments

Like this: https://github.com/tensorflow/tensorflow/blob/v1.10.1/tensorflow/cc/tutorials/example_trainer.cc, which creates a session from a model, then uses multiple threads to invoke Session::Run() concurrently. Session::Run() is thread-safe, so a single session object can be shared by multiple threads.

From XNNPACK's API, this seems hard to achieve. For example, suppose we have a model with only one conv node. Running that conv node takes three steps:

  1. Create an operator with weights: xnn_create_convolution2d_nhwc_f32
  2. Setup the operator with input/output buffers: xnn_setup_convolution2d_nhwc_f32
  3. Run the operator

Prepacking happens in step 1. So if we have multiple threads, we will need multiple operators, and therefore multiple copies of the weights. Is that the case?

Correct. Concurrent inferences on the same operator are not supported.