HHTseng / video-classification

Tutorial for video classification/ action recognition using 3D CNN/ CNN+RNN on UCF101

no resnet.eval() in encoder ?

Gateway2745 opened this issue

@HHTseng Hi! Thanks for this repo. I noticed that the encoder never calls resnet.eval() before extracting frame features. Since we are freezing the weights of the resnet model, does it help to use the pretrained batch norm params while still normalizing with the current batch stats?

def forward(self, x_3d):
    cnn_embed_seq = []
    for t in range(x_3d.size(1)):
        # ResNet CNN
        with torch.no_grad():
            x = self.resnet(x_3d[:, t, :, :, :])  # ResNet
            x = x.view(x.size(0), -1)  # flatten output of conv
        # FC layers
        x = self.bn1(self.fc1(x))
        x = F.relu(x)
        x = self.bn2(self.fc2(x))
        x = F.relu(x)
        x = F.dropout(x, p=self.drop_p, training=self.training)
        x = self.fc3(x)
        cnn_embed_seq.append(x)

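The batch norm parameters (weight and bias) are not updated, since they are frozen and the ResNet forward pass runs under torch.no_grad(). However, the running stats (running_mean and running_var) will still be updated on every forward pass, because without resnet.eval() the batch norm layers stay in training mode.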
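
The behaviour described above can be checked with a small sketch. It uses a single `nn.BatchNorm2d` layer as a stand-in for the batch norm layers inside the frozen ResNet. The layer size, input shape, and variable names are assumptions made for illustration and are not taken from the repo.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for one BatchNorm layer inside the frozen ResNet (illustrative only).
bn = nn.BatchNorm2d(3)
for p in bn.parameters():
    p.requires_grad = False  # freeze weight and bias, as in the encoder

# A batch whose statistics differ from the initial running stats (mean 0, var 1).
x = torch.randn(8, 3, 4, 4) * 5 + 2

# Training mode + no_grad: weight/bias are untouched, but running stats still move.
bn.train()
mean_before = bn.running_mean.clone()
with torch.no_grad():
    bn(x)
changed_in_train = not torch.equal(mean_before, bn.running_mean)

# Eval mode: the stored running stats are used for normalization and left unchanged.
bn.eval()
mean_before = bn.running_mean.clone()
with torch.no_grad():
    bn(x)
changed_in_eval = not torch.equal(mean_before, bn.running_mean)

print(changed_in_train, changed_in_eval)  # True False
```

If the pretrained running stats should be kept, one option is to call `self.resnet.eval()` in the encoder. Note that `model.train()` sets every submodule back to training mode, so the call would have to be repeated after each `model.train()`.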