Typos in paper?
chrisoffner opened this issue · comments
In section 3.1. under Discussion it says
Using a generic architecture allows to leverage strong pretraining technique, ultimately surpassing what existing task-specific architectures can achieve.
Should this be "techniques" or "a strong pretraining technique"?
In section 3.3. under Recovering intrinsics the paper states
hence only the focal $f_1^*$ remains to be estimated.
Should this say "focal length"?
Moreover, equation (1) states
$$X^{n, m} = P_m P_n^{-1} h (X^n)$$ with $P_m, P_n \in \mathbb{R}^{3 \times 4}$ the world-to-camera poses for images $n$ and $m$ ...
Maybe this is me just nitpicking, but for the matrix inverse
Am I correct in assuming that
Thanks for picking up the typos!
Regarding the last point: yes, the world-to-camera poses are usually 3x4 matrices, and you can convert them to homogeneous 4x4 form before inversion. You could also invert the rotation and translation parts manually.
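For anyone following along, here is a minimal NumPy sketch of both routes. It assumes each pose is a rigid transform `[R | t]` with `R` a proper rotation matrix, so that the inverse is `[R^T | -R^T t]`. The function names are illustrative only and are not taken from the paper or its code.

```python
import numpy as np

def to_homogeneous(pose_3x4):
    """Append the row [0, 0, 0, 1] to a 3x4 pose [R | t], giving a 4x4 matrix."""
    pose_3x4 = np.asarray(pose_3x4, dtype=float)
    bottom = np.array([[0.0, 0.0, 0.0, 1.0]])
    return np.vstack([pose_3x4, bottom])

def invert_pose_homogeneous(pose_3x4):
    """Invert via the 4x4 homogeneous form, then drop the last row again."""
    return np.linalg.inv(to_homogeneous(pose_3x4))[:3, :]

def invert_pose_manual(pose_3x4):
    """Invert the rotation and translation parts directly.

    Assumes R is orthonormal (a rotation), so R^{-1} = R^T.
    """
    pose_3x4 = np.asarray(pose_3x4, dtype=float)
    R, t = pose_3x4[:, :3], pose_3x4[:, 3]
    R_inv = R.T
    t_inv = -R.T @ t
    return np.hstack([R_inv, t_inv[:, None]])

# Example pose: 30 degree rotation about the z axis plus a translation.
theta = np.deg2rad(30.0)
R = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
t = np.array([1.0, 2.0, 3.0])
P = np.hstack([R, t[:, None]])

# Both routes should agree up to floating-point error.
agree = np.allclose(invert_pose_homogeneous(P), invert_pose_manual(P))
```

The manual route avoids a general matrix inversion, but it is only valid when the 3x3 block really is a rotation; for poses that include scale or shear, the homogeneous route is the safer choice.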
This practice seems standard enough to keep the text as is.