onnx / models

A collection of pre-trained, state-of-the-art models in the ONNX format

Home Page: http://onnx.ai/models/


Is convolution node 486 in Faster R-CNN working fine?

Apisteftos opened this issue

Bug Report

Which model does this pertain to?

Faster R-CNN, opset 12

Describe the bug

I am profiling Faster R-CNN, and the throughput I calculate for convolution node 486 is 1321 TOPs, which is far above the limits of the NVIDIA A100 GPU. Can somebody explain whether the model is working properly?

Reproduction instructions

System Information

OS Platform and Distribution: Linux Ubuntu 22.04
ONNX version: 1.14
Backend/Runtime version: ONNX Runtime 1.15

Here is my profiling data:

FP32
dur: 70
486_kernel_time
output_type_shape: (1, 256, 200, 392)
input_type_shape: (1, 256, 200, 392)
kernel_shape: (256, 256, 3, 3)
bias: 256
provider: CUDAExecutionProvider
op_name: Conv
Throughput: 1321 TOPs

Notes

A100 spec: peak FP32 throughput (non-Tensor) = 19.5 TFLOPS
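
For reference, the 1321 figure can be reproduced from the profiling data above with the sketch below. It rests on several assumptions that are not stated in the profile itself: that dur is in microseconds, that the output shape is in NCHW order, that the kernel shape is (C_out, C_in, kH, kW), and that each multiply-accumulate counts as 2 floating-point operations (bias is ignored). The function and variable names are hypothetical and only serve this illustration.

```python
def conv2d_flops(output_shape, kernel_shape):
    """Estimate FLOPs of a dense 2D convolution (bias ignored).

    Assumes output_shape is (N, C_out, H_out, W_out) and
    kernel_shape is (C_out, C_in, kH, kW); 1 MAC = 2 FLOPs.
    """
    n, c_out, h_out, w_out = output_shape
    _, c_in, k_h, k_w = kernel_shape
    macs = n * c_out * h_out * w_out * c_in * k_h * k_w
    return 2 * macs


# Values copied from the profiling data for node 486.
output_shape = (1, 256, 200, 392)
kernel_shape = (256, 256, 3, 3)
dur_us = 70                # assumption: dur is reported in microseconds
a100_peak_tflops = 19.5    # peak FP32 (non-Tensor) from the Notes section

flops = conv2d_flops(output_shape, kernel_shape)
# FLOPs per microsecond divided by 1e6 gives TFLOP/s.
tflops = flops / dur_us / 1e6
ratio = tflops / a100_peak_tflops

print(f"FLOPs per call: {flops:,}")
print(f"Implied throughput: {tflops:.1f} TFLOP/s")
print(f"Ratio to A100 FP32 peak: {ratio:.1f}x")
```

Under these assumptions the convolution needs about 92.5 GFLOP per call, so a 70 microsecond duration implies roughly 1321 TFLOP/s. That matches the reported number, which suggests the arithmetic is consistent and that the question comes down to what the 70 in dur actually measures. If that duration does not cover the full GPU execution of the kernel, the derived throughput would be inflated accordingly.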