maekawatoshiki / altius

Small ONNX inference runtime written in Rust

Repository from GitHub: https://github.com/maekawatoshiki/altius

Fast `softmax` kernel

maekawatoshiki opened this issue

  • We need a better `softmax` implementation for the CPU backend; current numbers in GPT-2 are below, and a reference sketch follows this list.

| runtime     | `softmax` in GPT-2 (ms) |
| ----------- | ----------------------- |
| onnxruntime | 1.5                     |
| altius      | 2.4                     |
  • The performance degradation was due to OpenMP's wait policy (worker threads sleeping between parallel regions); running with `OMP_WAIT_POLICY=active`, which makes waiting threads spin instead, solves this.
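
For context, the kernel in question is the standard row-wise softmax over the last axis. Below is a minimal sketch in Rust of the numerically stable three-pass form (subtract the row max, exponentiate, normalize). The function name `softmax_rows` and its row-major layout assumption are illustrative, not the actual altius kernel.

```rust
/// Numerically stable row-wise softmax over a row-major `[rows, cols]` buffer.
/// A sketch only: a fast kernel would additionally vectorize `exp`
/// and parallelize the outer loop over rows.
fn softmax_rows(data: &mut [f32], rows: usize, cols: usize) {
    assert_eq!(data.len(), rows * cols);
    for row in data.chunks_exact_mut(cols) {
        // Pass 1: find the row maximum so that exp never overflows.
        let max = row.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        // Pass 2: exponentiate the shifted values and accumulate the sum.
        let mut sum = 0.0f32;
        for x in row.iter_mut() {
            *x = (*x - max).exp();
            sum += *x;
        }
        // Pass 3: normalize, multiplying by the reciprocal computed once.
        let recip = 1.0 / sum;
        for x in row.iter_mut() {
            *x *= recip;
        }
    }
}

fn main() {
    let mut data = vec![1.0f32, 2.0, 3.0, 0.0, 0.0, 0.0];
    softmax_rows(&mut data, 2, 3);
    println!("{data:?}"); // each row now sums to ~1.0
}
```

In practice most of the time goes into the `exp` calls, so the usual optimizations are a SIMD exponential approximation and parallelism across rows, which is also where the OpenMP wait-policy behavior above comes into play.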