tjuskyzhang / Scaled-YOLOv4-TensorRT

Got 100fps on TX2. Got 500fps on GeForce GTX 1660 Ti. If the project is useful to you, please Star it.

How can I make it faster?

tuneshverma opened this issue

Is there a way to make this loop faster?

for (int i = 0; i < INPUT_H * INPUT_W; i++) {
    data[b * 3 * INPUT_H * INPUT_W + i] = pr_img.at<cv::Vec3b>(i)[2] / 255.0;
    data[b * 3 * INPUT_H * INPUT_W + i + INPUT_H * INPUT_W] = pr_img.at<cv::Vec3b>(i)[1] / 255.0;
    data[b * 3 * INPUT_H * INPUT_W + i + 2 * INPUT_H * INPUT_W] = pr_img.at<cv::Vec3b>(i)[0] / 255.0;
}

You can try preprocessing the input image on the GPU.

Would using cv::normalize() be better here?
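
One note on cv::normalize(): it only rescales values, and with NORM_MINMAX it scales by the image's own min/max rather than a fixed 255. The loop in the question also swaps BGR to RGB and splits the interleaved pixels into three planes, so normalize() alone would not replace it. If you want to stay on the CPU, a rough sketch of a tighter loop is below. It is not code from this repository: `bgr_hwc_to_rgb_chw` is a hypothetical helper, and it assumes the image is a tightly packed 8-bit BGR buffer (for a cv::Mat, `pr_img.data` when `pr_img.isContinuous()` is true). It hoists the plane offsets out of the loop, reads raw bytes instead of calling at<>() three times per pixel, and multiplies by a precomputed reciprocal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the repository.
// Converts one interleaved 8-bit BGR image (HWC) into planar RGB floats (CHW)
// scaled to roughly [0, 1].
//   bgr: height * width * 3 tightly packed bytes (assumed continuous).
//   dst: start of this image's slot in the batch buffer, i.e. what the
//        original loop addresses as data + b * 3 * INPUT_H * INPUT_W.
void bgr_hwc_to_rgb_chw(const std::uint8_t* bgr, float* dst, int height, int width) {
    assert(bgr != nullptr && dst != nullptr);
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    const float scale = 1.0f / 255.0f;  // multiply instead of dividing per element

    // Plane base pointers are computed once, outside the loop.
    float* r = dst;
    float* g = dst + plane;
    float* b = dst + 2 * plane;

    for (std::size_t i = 0; i < plane; ++i) {
        const std::uint8_t* px = bgr + 3 * i;  // px[0]=B, px[1]=G, px[2]=R
        r[i] = px[2] * scale;
        g[i] = px[1] * scale;
        b[i] = px[0] * scale;
    }
}
```

In the original code this would be called as something like `bgr_hwc_to_rgb_chw(pr_img.data, data + b * 3 * INPUT_H * INPUT_W, INPUT_H, INPUT_W);`. Multiplying by 1.0f / 255.0f can differ from dividing by 255.0 in the last bit of the float, which should not matter for inference. If the dnn module is built into your OpenCV, cv::dnn::blobFromImage with a scale factor of 1/255.0 and swapRB set to true does the same conversion; whether either option is faster on your hardware is something to measure.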