Question on Eq. 4

denkorzh opened this issue a year ago · comments

Hi, thanks for sharing your results!

I'm afraid I don't quite follow how you derived Eq. 4: if I'm not mistaken,
∇ log p(c ∣ xₜ) = − λ ∇ ℰ (c, xₜ) + λ 𝔼 [ ∇ ℰ (c′, xₜ) ],
where the gradient ∇ is w.r.t. xₜ, and the expectation 𝔼 is over c′ ∼ p(c′ ∣ xₜ).

Why have you decided to ignore the second term? Thank you in advance!

Jiwen Yu · Answer 1 · Sat Mar 25 2023 14:34:56 GMT+0800 (China Standard Time)

Hi @denkorzh, thanks for your interest!
I obtained Eq. 4 by treating Z in Eq. 3 as a constant, so that term is discarded after computing $\nabla_{\mathbf{x}_t}\log(\cdot)$.
Could you tell me how you derived this expectation term? Perhaps we can gain some insight from it to help us improve our work. Looking forward to your reply :)

xjtupanda · Answer 2 · Sat Mar 25 2023 15:07:13 GMT+0800 (China Standard Time)

@vvictoryuki I have a similar confusion.
Writing the normalizing constant explicitly as an integral over $c$, we have:
$$
Z(x_t)=\int \exp(-\lambda \mathcal{E}(c, x_t))\, dc
$$
Plugging this into $\log p(c|x_t) = -\lambda \mathcal{E}(c, x_t) - \log Z(x_t)$ and taking the gradient w.r.t. $x_t$ on both sides:
$$
\nabla \log p(c|x_t) = -\lambda \nabla \mathcal{E}(c, x_t) - \frac{\nabla Z(x_t)}{Z(x_t)}
$$
But the term $Z$, more specifically the integral $\int \exp(-\lambda \mathcal{E}(c, x_t))\, dc$, is a function of $x_t$ if I'm not mistaken, so shouldn't the gradient term be kept rather than discarded?
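
For reference, the normalizer term can be worked out in closed form. The following is only a sketch, assuming $Z(x_t)$ is finite and that the gradient and the integral can be exchanged; $c'$ is a dummy integration variable introduced here for illustration.
$$
\begin{aligned}
\nabla \log Z(x_t) &= \frac{1}{Z(x_t)} \int \nabla \exp(-\lambda \mathcal{E}(c', x_t))\, dc' \\
&= -\lambda \int \frac{\exp(-\lambda \mathcal{E}(c', x_t))}{Z(x_t)}\, \nabla \mathcal{E}(c', x_t)\, dc' \\
&= -\lambda\, \mathbb{E}_{c' \sim p(c'|x_t)} \left[ \nabla \mathcal{E}(c', x_t) \right]
\end{aligned}
$$
Substituting into $\nabla \log p(c|x_t) = -\lambda \nabla \mathcal{E}(c, x_t) - \nabla \log Z(x_t)$ gives $-\lambda \nabla \mathcal{E}(c, x_t) + \lambda\, \mathbb{E}_{c' \sim p(c'|x_t)} [\nabla \mathcal{E}(c', x_t)]$, so under these assumptions the extra term vanishes only when the expected energy gradient is zero.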

Denis M Korzhenkov · Answer 3 · Mon Mar 27 2023 00:58:15 GMT+0800 (China Standard Time)

@vvictoryuki
Exactly, as @xjtupanda mentioned above, Z is a constant w.r.t. c but is a function of x_t.
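
A quick numerical check may make this concrete. The sketch below is a toy setup and not the paper's model: it assumes a scalar x_t, a hypothetical discrete set of three conditions c (so the integral in Z becomes a sum), and a made-up quadratic energy. All names (`LAM`, `CENTERS`, `energy`, `partition`, and so on) are illustrative assumptions.

```python
import math

LAM = 0.5                      # lambda; hypothetical value for illustration
CENTERS = [-1.0, 0.0, 2.0]     # hypothetical discrete conditions c = 0, 1, 2


def energy(c, x):
    """Toy energy E(c, x_t): squared distance to the center of condition c."""
    return (x - CENTERS[c]) ** 2


def grad_energy(c, x):
    """Analytic gradient of the toy energy w.r.t. x_t."""
    return 2.0 * (x - CENTERS[c])


def partition(x):
    """Z(x_t): sum over c of exp(-lambda * E(c, x_t)); depends on x_t."""
    return sum(math.exp(-LAM * energy(c, x)) for c in range(len(CENTERS)))


def log_p(c, x):
    """log p(c | x_t) = -lambda * E(c, x_t) - log Z(x_t)."""
    return -LAM * energy(c, x) - math.log(partition(x))


def grad_z_constant(c, x):
    """Gradient obtained when Z is treated as a constant."""
    return -LAM * grad_energy(c, x)


def grad_full(c, x):
    """Gradient that keeps the normalizer: adds lambda * E_{p(c'|x_t)}[grad E]."""
    z = partition(x)
    expected = sum(
        math.exp(-LAM * energy(k, x)) / z * grad_energy(k, x)
        for k in range(len(CENTERS))
    )
    return -LAM * grad_energy(c, x) + LAM * expected


def grad_numeric(c, x, h=1e-5):
    """Central finite-difference gradient of log p(c | x_t) w.r.t. x_t."""
    return (log_p(c, x + h) - log_p(c, x - h)) / (2.0 * h)


if __name__ == "__main__":
    c, x = 0, 0.7
    print("Z at x_t=0.0 and x_t=2.0:", partition(0.0), partition(2.0))
    print("finite difference:", grad_numeric(c, x))
    print("with normalizer term:", grad_full(c, x))
    print("Z treated as constant:", grad_z_constant(c, x))
```

In this toy case the finite-difference gradient agrees with the version that keeps the normalizer term and differs from the one that treats Z as a constant. How large the gap is in a real model depends on the energy function, which this sketch does not attempt to reproduce.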

Jiwen Yu · Answer 4 · Thu Apr 06 2023 13:58:24 GMT+0800 (China Standard Time)

Thank you for pointing out the shortcomings in our work. We will address the relevant inaccuracies in the latest arXiv version.