graphdeco-inria / gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Home Page: https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

Question about the result of: dSigma_dM = 2 * M

yyzzyy78 opened this issue · comments

commented

Hi, I don't understand the result of dSigma_dM = 2 * M.
https://github.com/graphdeco-inria/diff-gaussian-rasterization/blob/59f5f77e3ddbac3ed9db93ec2cfe99ed6c5d121d/cuda_rasterizer/backward.cu#L315

I know that M = R @ S and that the covariance matrix is Sigma = M @ transpose(M), but Sigma is a [3, 3] matrix and M is a [3, 3] matrix too. Shouldn't dSigma_dM therefore have shape [9, 9]?
I understand that the covariance matrix is symmetric, but I don't see how that affects dSigma_dM.

You are right that dSigma_dM is not, strictly speaking, the matrix shown: a matrix-by-matrix derivative cannot be written in matrix form (it is a fourth-order tensor, as also mentioned on Wikipedia). Calling it dSigma_dM is a small abuse of terminology, but it makes sense. I'll try to explain below.

The derivatives in the backward pass are always thought of in a chain-rule way: you start from the scalar loss $L$ and build the chain rule with it as the numerator.
To write chain rules that include scalar-by-matrix and even matrix-by-matrix derivatives, the Frobenius inner product is used. That is simply an element-wise multiplication of the two matrices followed by a summation over the resulting products.
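As a quick illustration (my own, not from the repository), the Frobenius inner product $\langle A, B \rangle = \sum_{ij} A_{ij} B_{ij}$ is just "multiply element-wise, then sum", and it equals the trace form $\mathrm{tr}(A^T B)$, which is what makes the shuffling identities used in such derivations easy to prove:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))

# Frobenius inner product: element-wise multiply, then sum
frob = np.sum(A * B)

# Equivalent trace form
assert np.isclose(frob, np.trace(A.T @ B))

# One of the shuffling identities: <A, B C> = <A C^T, B>
assert np.isclose(np.sum(A * (B @ C)), np.sum((A @ C.T) * B))
```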

So, using that, the derivative above is computed as follows, where $x$ is a scalar that $M$ depends on.

$$\dfrac{\partial L}{\partial x} = \langle \dfrac{\partial L}{\partial \Sigma_{3D}}, \dfrac{\partial \Sigma_{3D}}{\partial x} \rangle = \langle \dfrac{\partial L}{\partial \Sigma_{3D}}, \dfrac{\partial (M M^T)}{\partial x} \rangle = \langle \dfrac{\partial L}{\partial \Sigma_{3D}}, \dfrac{\partial M}{\partial x}M^T \rangle + \langle \dfrac{\partial L}{\partial \Sigma_{3D}}, M\dfrac{\partial M^T}{\partial x} \rangle =$$ $$\langle \dfrac{\partial L}{\partial \Sigma_{3D}}M, \dfrac{\partial M}{\partial x}\rangle + \langle M^T\dfrac{\partial L}{\partial \Sigma_{3D}}, (\dfrac{\partial M}{\partial x})^T \rangle = \langle \dfrac{\partial L}{\partial \Sigma_{3D}}M, \dfrac{\partial M}{\partial x}\rangle + \langle \dfrac{\partial L}{\partial \Sigma_{3D}}M, \dfrac{\partial M}{\partial x} \rangle = 2\langle \dfrac{\partial L}{\partial \Sigma_{3D}}M, \dfrac{\partial M}{\partial x}\rangle$$

(I used some properties of the Frobenius inner product and the fact that $\dfrac{\partial L}{\partial \Sigma_{3D}}$ is symmetric)
For $x = M_{ij}$, the right-hand matrix $\dfrac{\partial M}{\partial x}$ has a $1$ in position $ij$ and $0$ elsewhere, so the inner product simply picks out the $ij$ entry of the left-hand matrix. The whole scalar-by-matrix derivative can therefore be written as:

$$\dfrac{\partial L}{\partial M} = 2\dfrac{\partial L}{\partial \Sigma_{3D}}M$$

which is what the code computes (in row-major form, so everything is transposed, hence the reversed order of the multiplications).
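To convince yourself that the closed form matches, here is a small finite-difference check (my own sketch, not repository code); `G` stands in for an arbitrary symmetric upstream gradient $\dfrac{\partial L}{\partial \Sigma_{3D}}$:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
G = rng.standard_normal((3, 3))
G = G + G.T  # the upstream gradient dL/dSigma is symmetric

def loss(M):
    # Surrogate scalar loss L = <G, M M^T>, so dL/dSigma = G
    return np.sum(G * (M @ M.T))

# Analytic gradient from the derivation: dL/dM = 2 * (dL/dSigma) @ M
analytic = 2 * G @ M

# Central finite differences over each entry of M
eps = 1e-6
numeric = np.zeros_like(M)
for i in range(3):
    for j in range(3):
        Mp, Mm = M.copy(), M.copy()
        Mp[i, j] += eps
        Mm[i, j] -= eps
        numeric[i, j] = (loss(Mp) - loss(Mm)) / (2 * eps)

assert np.allclose(analytic, numeric, atol=1e-4)
```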

In general, if you get the hang of the Frobenius inner product, you can derive the full backward pass yourself, as it basically requires you to apply it over and over, as you progress along the chain.

commented

Great! Thanks for your detailed reply. I think I now understand your derivation.
Thanks again! @PanagiotisP