zhihu / cuBERT

Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL


High compute time when testing mklBert from Java

dawson-chen opened this issue

My test environment is:
32 * Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
Ubuntu 16.04.5 LTS
gcc 5.4.0
MKL: 2019.0.3.20190220
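The environment variables are set as follows (values I settled on earlier while tuning TensorFlow MKL performance; they work somewhat better than the defaults):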
KMP_BLOCKTIME=1
KMP_AFFINITY=granularity=fine,verbose,compact,1,0
KMP_SETTINGS=1
OMP_NUM_THREADS=4
I tested with the code from the Java TestCuBERT and added timing as follows:

for(int i=1;i<=10;i++) {
    long start = System.currentTimeMillis();
    Float[] output = new Float[2];
    model.compute(1, input_ids, input_mask, segment_ids, output, OutputType.LOGITS);
    System.out.println("Compute Time" + i + ":" + (System.currentTimeMillis() - start));
}

The results (in milliseconds) are:

Compute Time1:472
Compute Time2:190
Compute Time3:178
Compute Time4:148
Compute Time5:240
Compute Time6:153
Compute Time7:215
Compute Time8:156
Compute Time9:171
Compute Time10:178

What might be causing this? Thanks.

Java incurs extra JNI call overhead. You could time the same call in C++ and compare.
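
Before comparing against C++, it may also help to separate first-call warm-up from steady-state latency on the Java side, since the first iteration above is noticeably slower than the rest. The following is only a sketch: `TimingSketch`, `timeRuns`, and `median` are hypothetical names that are not part of cuBERT, and the dummy workload stands in for the `model.compute(...)` call.

```java
import java.util.Arrays;

// Hypothetical helper, not part of cuBERT: times a task with warm-up runs excluded.
class TimingSketch {

    // Runs `task` `warmup` times untimed, then returns `runs` timed samples in nanoseconds.
    static long[] timeRuns(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    // Median of the samples (upper middle for even counts); does not modify the input.
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload; in the issue's setting this would wrap model.compute(...).
        double[] sink = new double[1];
        Runnable task = () -> {
            double acc = 0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += Math.sqrt(i);
            }
            sink[0] = acc;
        };
        long[] samples = timeRuns(task, 3, 10);
        System.out.println("median ms: " + median(samples) / 1e6);
        System.out.println("min ms: " + Arrays.stream(samples).min().getAsLong() / 1e6);
    }
}
```

Reporting the median and minimum rather than individual runs makes the Java numbers easier to compare with a C++ measurement of the same call, because occasional spikes from thread scheduling or garbage collection carry less weight.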

Resolved and closed by #58.