google / gemmlowp

Low-precision matrix multiplication

Mixing OpenMP with gemmlowp multithreading causes low performance

liyinhgqw opened this issue

If I run a multithreaded loop using OpenMP and then call gemmlowp, the GEMM performance degrades. Any clue?

e.g.

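  // An OpenMP parallel region that runs before any gemmlowp call.
  // The loop body is intentionally empty.
  #pragma omp parallel for
  for (int i = 0; i < 100; ++i) {
  }

  // gemmlowp then runs the GEMM with its own pool of up to 4 threads.
  gemmlowp::GemmContext gemm_context;
  gemm_context.set_max_num_threads(4);
  using BitDepthParams = gemmlowp::L8R8WithLhsNonzeroBitDepthParams;
  while (iters--) {
    // The two -128 arguments are lhs_offset and rhs_offset.
    gemmlowp::GemmWithOutputPipeline<std::uint8_t, std::int32_t,
                                     BitDepthParams>(
        &gemm_context, lhs.const_map(), rhs.const_map(), &result.map(), -128,
        -128, output_pipeline);
  }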

So, we implement multithreading using OpenMP.
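
To put a number on the slowdown, one option is to time the GEMM loop once without the preceding OpenMP region and once with it. The following is only a minimal sketch: `AverageMillis` is a hypothetical helper, not part of gemmlowp or OpenMP, and it uses nothing beyond the C++ standard library.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not a gemmlowp API): runs `work` `iters` times and
// returns the average wall-clock time per call in milliseconds.
double AverageMillis(const std::function<void()>& work, int iters) {
  using Clock = std::chrono::steady_clock;
  const auto start = Clock::now();
  for (int i = 0; i < iters; ++i) {
    work();
  }
  const std::chrono::duration<double, std::milli> elapsed =
      Clock::now() - start;
  return elapsed.count() / iters;
}
```

The GemmWithOutputPipeline call from the snippet above could be wrapped in a lambda and passed as `work`, once in a build or run where the OpenMP loop is skipped and once where it executes first. Comparing the two averages would show how much the OpenMP region costs the subsequent GEMM calls.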