makefile / frcnn

Faster R-CNN / R-FCN :bulb: C++ version based on Caffe

run yolov3-tiny completed!

a-little-cat opened this issue · comments

I found the reason why the yolov3-tiny model fails with the current code.

Two things need to be changed.

  • Change the pooling reshape rule in pooling_layer.cpp:
  if (ceil_mode) {
    // pooled_height_ = static_cast<int>(ceil(static_cast<float>(
    //     height_ + 2 * pad_h_ - kernel_h_) / stride_h_)) + 1;
    // pooled_width_ = static_cast<int>(ceil(static_cast<float>(
    //     width_ + 2 * pad_w_ - kernel_w_) / stride_w_)) + 1;
    // darknet-style output size; the width uses pad_w_ (the first post had pad_h_ here)
    pooled_height_ = static_cast<int>((height_ + 2 * pad_h_) / stride_h_);
    pooled_width_ = static_cast<int>((width_ + 2 * pad_w_) / stride_w_);
  } else {
  • Disable the net->num_outputs() check in demo_yolov3.cpp:
  //CHECK_EQ(net->num_outputs(), 3) << "Network should have exactly three outputs.";

Why I didn't make it a merge request

Darknet and Caffe compute the output shape differently when reshaping in the pooling layer.

The changed code no longer supports existing Caffe models, unless they are retrained with the new pooling_layer.

I think there is an elegant solution: adding a new parameter to pooling_layer, or adding a new layer named pooling_yolo.

But this is beyond the scope of my work. Middle-aged people have no right to spend time to satisfy elegance.

Thanks for your project, it helps me!

Thanks for your question and corresponding solution. I am not familiar with the tiny-yolo model, can you explain the problem more concisely?
By the way, the sentence "Middle-aged people have no right to spend time to satisfy elegance." seems reasonable.

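I'm giving up on English....

Overview

Caffe and darknet use different reshape logic in the pooling layer.

Caffe's logic: the kernel slides over the input blob and stops once the kernel's right (bottom) edge coincides with the input blob's right (bottom) edge.

Darknet's logic: the kernel slides over the input blob and stops once the kernel's left (top) edge coincides with the input blob's right (bottom) edge.

This difference directly affects the width and height of the pooling layer's output, so in some cases the output blob dimensions of a darknet pooling layer and a Caffe pooling layer differ slightly.

yolov3 does not use pooling, so it runs normally.

Where exactly

Darknet's pooling reshape is in maxpool_layer.c, inside make_maxpool_layer(), lines 30 and 31.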

    l.out_w = (w + 2*padding)/stride;
    l.out_h = (h + 2*padding)/stride;

Caffe's pooling shape computation is in pooling_layer.cpp, inside reshape(), lines 95 to 98.

  if (ceil_mode) {
    pooled_height_ = static_cast<int>(ceil(static_cast<float>(
        height_ + 2 * pad_h_ - kernel_h_) / stride_h_)) + 1;
    pooled_width_ = static_cast<int>(ceil(static_cast<float>(
        width_ + 2 * pad_w_ - kernel_w_) / stride_w_)) + 1;
}

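A temporary, ugly workaround

Modify reshape directly in Caffe's pooling_layer.cpp so that Caffe's pooling follows darknet's implementation. After this change, existing Caffe models are no longer compatible.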

  if (ceil_mode) {
    // pooled_height_ = static_cast<int>(ceil(static_cast<float>(
    //     height_ + 2 * pad_h_ - kernel_h_) / stride_h_)) + 1;
    // pooled_width_ = static_cast<int>(ceil(static_cast<float>(
    //     width_ + 2 * pad_w_ - kernel_w_) / stride_w_)) + 1;
    // darknet-style output size; the width uses pad_w_, not pad_h_
    pooled_height_ = static_cast<int>((height_ + 2 * pad_h_) / stride_h_);
    pooled_width_ = static_cast<int>((width_ + 2 * pad_w_) / stride_w_);
  } else {

(The elegant implementation would be to add a pooling_yolo layer...)

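P.S. After modifying the pooling, the model converts normally, but the check on line 72 of demo_yolov3.cpp fails. I commented it out, and now the program sometimes runs fine and sometimes dies with an out-of-bounds memory access. I'm still tracking it down...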

CHECK_EQ(net->num_outputs(), 3) << "Network should have exactly three outputs.";  

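I looked at darknet's pooling layer code, and it is as you said: in the code before yolo v3, the pooling layer logic differs from Caffe's. In the latest darknet code, however, the pooling layer has been changed again to match Caffe. Adding a layer named pooling_yolo_v2 would be a good approach, since it is less likely to cause confusion. As for the out-of-bounds memory problem, that one is hard to track down, haha..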