megvii-model / FunnelAct


how to apply in PSPNet

qbTrible opened this issue · comments

Hi, when I use FunnelAct in a ResNet50 backbone for PSPNet on my own dataset, the mIoU is lower than without FunnelAct after the same 200 epochs. Is it necessary to use FunnelAct in the PyramidPooling module of PSPNet? Does it perform differently on different datasets?

commented

I also encountered a similar situation. I am not sure which of two approaches to take:
(1) use FunnelAct only after the regular convolutional layers, or
(2) replace every ReLU in the network with FunnelAct?

commented

Hi @qbTrible, applying FReLU in the backbone is enough to obtain improvements. As for the dataset question: besides the open datasets in our experiments, I have helped others apply FReLU backbones to their own datasets (for example, face recognition datasets), which also showed clear improvements. To diagnose your issue, could you please provide more details?

  • Did you use the code we provided? Did you replace all ReLU layers with FReLU, or only some of them?
  • What are your dataset's properties? Did you encounter overfitting? If your dataset is small, I suggest stronger regularization and data augmentation in your training strategy, and replacing fewer ReLU layers with FReLU at first.
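As a rough illustration of replacing only some ReLU layers rather than all of them, here is a minimal sketch; the toy network, layer indices, and FReLU variant below are hypothetical, not code from the FunnelAct repo:

```python
import torch
import torch.nn as nn

class FReLU(nn.Module):
    """Funnel activation: element-wise max(x, T(x)), where the funnel
    condition T is a depthwise 3x3 conv followed by BatchNorm."""
    def __init__(self, in_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, 3, 1, 1, groups=in_channels)
        self.bn = nn.BatchNorm2d(in_channels)

    def forward(self, x):
        return torch.max(x, self.bn(self.conv(x)))

# A toy two-stage network; swap only the first ReLU for FReLU and keep
# the second, limiting the number of changed layers on a small dataset.
net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
)
net[2] = FReLU(16)  # index 2 is the first ReLU in this toy layout
```

If results stay stable, more ReLU layers can be swapped the same way, one at a time.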
commented

Hi @bobo0810,
(1) Yes, we simply replace ReLU with FReLU.
(2) You do not have to change every ReLU; the network can be modified flexibly. On your own dataset or task, we suggest starting by changing as few layers as possible (for example, only one ReLU per block, such as the ReLU after the 3x3 conv), then gradually replacing more ReLU layers until you run into overfitting.
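The per-block suggestion above can be sketched roughly as follows; this is a hypothetical, simplified bottleneck (stride 1, no downsampling), not the authors' exact ResNet code:

```python
import torch
import torch.nn as nn

class FReLU(nn.Module):
    """Funnel activation: max(x, T(x)) with T a depthwise 3x3 conv + BN."""
    def __init__(self, in_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, 3, 1, 1, groups=in_channels)
        self.bn = nn.BatchNorm2d(in_channels)

    def forward(self, x):
        return torch.max(x, self.bn(self.conv(x)))

class Bottleneck(nn.Module):
    """ResNet-style bottleneck where only the activation after the
    3x3 conv is swapped for FReLU; the other two stay ReLU."""
    def __init__(self, channels, mid_channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, mid_channels, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        self.conv3 = nn.Conv2d(mid_channels, channels, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        self.frelu = FReLU(mid_channels)  # the single replaced activation

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.frelu(self.bn2(self.conv2(out)))  # FReLU after the 3x3 conv
        out = self.bn3(self.conv3(out))
        return self.relu(out + x)
```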

I use PyTorch as follows:

```python
import torch
import torch.nn as nn

class FReLU(nn.Module):
    r"""FReLU formulation. The funnel condition has a window size of k x k (k=3 by default)."""

    def __init__(self, in_channels):
        super().__init__()
        # Depthwise 3x3 conv implements the funnel condition T(x).
        self.conv_frelu = nn.Conv2d(in_channels, in_channels, 3, 1, 1, groups=in_channels)
        self.bn_frelu = nn.BatchNorm2d(in_channels)

    def forward(self, x):
        x1 = self.conv_frelu(x)
        x1 = self.bn_frelu(x1)
        # Element-wise max of the identity branch and the funnel branch.
        x2 = torch.stack([x, x1], dim=0)
        out, _ = torch.max(x2, dim=0)
        return out
```

My dataset is remote sensing images, and it is small. I replaced all ReLU layers with FReLU in ResNet50; I will check whether it is overfitting.

commented

There is an ambiguity in our code. Is the above implementation correct? Thanks! @qbTrible @nmaac
The shapes of x and x1 are both [B, C, H, W]; torch.max() over the stacked tensor performs an element-wise comparison and returns the maximum values, also of shape [B, C, H, W].

Right!

commented

Hi @bobo0810, I suggest using torch.max(x, x1) directly. The torch.stack implementation is correct, but it is awkward and inefficient, since it materializes an extra stacked tensor.
Hi @qbTrible, since your dataset is small and therefore prone to overfitting, please try changing only one ReLU layer per block (e.g. the ReLU after the 3x3 conv), or only those in the shallower blocks.
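Following that suggestion, the forward pass can be written with torch.max(x, x1) directly; this is a sketch of the simplified variant (same layers as the earlier snippet, only the forward changes), not the authors' official code:

```python
import torch
import torch.nn as nn

class FReLU(nn.Module):
    r"""FReLU with the forward simplified to torch.max(x, x1)."""

    def __init__(self, in_channels):
        super().__init__()
        self.conv_frelu = nn.Conv2d(in_channels, in_channels, 3, 1, 1, groups=in_channels)
        self.bn_frelu = nn.BatchNorm2d(in_channels)

    def forward(self, x):
        x1 = self.bn_frelu(self.conv_frelu(x))
        # Element-wise max, without materializing a stacked tensor.
        return torch.max(x, x1)
```

The two-tensor form of torch.max broadcasts and compares element-wise, so it produces the same values as stacking and reducing over the new dimension.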

commented

@qbTrible, moreover, ResNet50-FReLU even outperforms ResNet101-ReLU in our experiments, so you could also try a shallower ResNet with FReLU if you run into overfitting on your task and dataset.

OK, thank you!