hzxie / RMNet

The official implementation of "Efficient Regional Memory Network for Video Object Segmentation" (Xie et al., CVPR 2021).

Home Page: https://haozhexie.com/project/rmnet/


About network training

suhwan-cho opened this issue · comments

Hi, thanks for your great work!

I understand that TinyFlowNet is trained with pre-computed optical flows from FlowNet (for DAVIS) and RAFT (for YouTube-VOS), which were trained on other datasets.
I would like to know how TinyFlowNet and RMNet are trained during the static-image training stage, since there are no pre-computed flows for those training samples.

Following STM, we apply affine transformations to synthesize videos from static images.
The affine transformation parameters are saved and then used to generate the corresponding optical flow:

# Inside the per-frame loop: sample and record an affine transform,
# then warp the current frame and its mask with it
tr_matrix = self._get_inverse_affine_matrix(center, degrees, translate, scale, shears)
tr_matrices.append(tr_matrix)
frames[idx] = self._affine(f, tr_matrix, fillcolor=tuple(self.frame_fill_color))
masks[idx] = self._affine(m, tr_matrix, fillcolor=self.mask_fill_color)

for idx, of in enumerate(optical_flows):
    # Skip the first frame: it has no preceding frame to compute flow from
    if idx == 0:
        continue
    # Recompute the flow values from the two saved transformation matrices
    optical_flows[idx] = flow_affine_transformation.update_optical_flow(
        of, tr_matrices[idx - 1], tr_matrices[idx])
    # Warp the flow field itself into the current frame's coordinates
    optical_flows[idx] = self._affine(optical_flows[idx],
                                      tr_matrices[idx],
                                      fillcolor=tuple(self.optical_flow_fill_color))