princeton-vl / RAFT

BasicEncoder ResidualBlock dimensions do not match with paper

christian-rauch opened this issue

In the paper, the encoder of the full model (RAFT, 4.8M parameters) has residual blocks with output sizes of 64 (1 x 64), 128 (2 x 64), and 192 (3 x 64). This follows the same pattern as the encoder of the smaller model (RAFT-S, 1M parameters), which uses 32 (1 x 32), 64 (2 x 32), and 96 (3 x 32).

In the code, this matches for the small encoder:

RAFT/core/extractor.py

Lines 216 to 218 in aac9dd5

self.layer1 = self._make_layer(32, stride=1)
self.layer2 = self._make_layer(64, stride=2)
self.layer3 = self._make_layer(96, stride=2)

but for the full encoder, those values do not match:

RAFT/core/extractor.py

Lines 139 to 141 in aac9dd5

self.layer1 = self._make_layer(64, stride=1)
self.layer2 = self._make_layer(96, stride=2)
self.layer3 = self._make_layer(128, stride=2)

Instead of [1, 2, 3] x 64, the pattern here is [2, 3, 4] x 32. So rather than doubling the size of the feature vectors in every block when going from the small to the full encoder, the code only adds a constant 32 channels per block.
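
To make the arithmetic explicit, here is a small sketch that compares the widths. The lists are copied from the `_make_layer` calls quoted above and from the sizes given in the paper; the variable names are only for illustration and do not appear in the repository.

```python
# Channel widths passed to _make_layer, copied from the snippets quoted above.
small_code = [32, 64, 96]    # small encoder, as in the code
full_code = [64, 96, 128]    # full encoder, as in the code

# Widths of the full encoder as described in the paper: [1, 2, 3] x 64.
full_paper = [k * 64 for k in (1, 2, 3)]

# Express the full-encoder widths from the code as multiples of 32.
full_code_multiples = [w // 32 for w in full_code]

# Per-block difference and ratio between the full and small encoders in the code.
diff_code = [f - s for f, s in zip(full_code, small_code)]
ratio_code = [f / s for f, s in zip(full_code, small_code)]

# Ratio that the paper's numbers would give instead.
ratio_paper = [f / s for f, s in zip(full_paper, small_code)]

print("paper, full encoder:", full_paper)
print("code, full encoder: ", full_code, "=", full_code_multiples, "x 32")
print("code, full - small: ", diff_code)
print("code, full / small: ", [round(r, 2) for r in ratio_code])
print("paper, full / small:", ratio_paper)
```

With the paper's numbers, every block of the full encoder is twice as wide as in the small one. With the numbers in the code, the difference is a constant 32 channels, so the ratio shrinks from block to block.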

Is this intended?