Wuyxin / DIR-GNN

Official code of "Discovering Invariant Rationales for Graph Neural Networks" (ICLR 2022)

Home Page: https://arxiv.org/abs/2201.12872

How do the practical objectives match the theoretical ones?

jugechengzi opened this issue

Thank you very much for sharing such a good paper!
However, I am a little confused by the practical objectives (Equations 11, 10, and 9), maybe because I have missed something.
I have read Appendix E, but I still don't understand how Equation 11 helps to achieve Equation 4 or 3. Could you please help me?

Hi,

Intuitively, Eq. 11 optimizes both the shortcut classifier and the DIR objective. We isolate the optimization of the shortcut classifier because we don't want the shortcut prediction to affect the graph encoder. The DIR objective directly corresponds to the target we want.
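
Very roughly, here is a PyTorch sketch of what I mean; this is not the actual code in this repo, and `encoder`, `rationale_head`, and `shortcut_head` are placeholder names:

```python
import torch.nn.functional as F

def joint_loss(batch, encoder, rationale_head, shortcut_head):
    # Hypothetical split: the encoder produces a causal representation h_c
    # and a spurious/shortcut representation h_s for each graph.
    h_c, h_s = encoder(batch)

    # DIR part: the joint prediction uses both representations.
    loss_dir = F.cross_entropy(rationale_head(h_c, h_s), batch.y)

    # Shortcut part: the shortcut classifier is trained on a detached copy
    # of h_s, so its gradient never flows back into the graph encoder.
    loss_shortcut = F.cross_entropy(shortcut_head(h_s.detach()), batch.y)

    return loss_dir + loss_shortcut
```

The `.detach()` is the isolation: the shortcut classifier is still optimized, but its loss cannot pull the encoder toward the shortcut features.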

One way to view Figure 3 vs. Figure 2 is that, instead of intervening at the graph level (actually concatenating two graphs or removing edges), in Figure 3 we intervene at the representation level.
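
To make that concrete, one hypothetical way to realize a representation-level intervention is to fix one spurious representation s_j and pair it with every causal representation in the batch (again, `rationale_head` and the tensor shapes are assumptions for illustration, not the repo's implementation):

```python
import torch

def intervened_logits(h_c, h_s, rationale_head):
    # h_c, h_s: [batch_size, dim] causal / spurious representations.
    # Each do(S = s_j) fixes one spurious representation for the whole batch;
    # it plays the role of one environment (one row in Figure 2).
    logits_per_env = []
    for j in range(h_s.size(0)):
        s_j = h_s[j].unsqueeze(0).expand_as(h_s)   # broadcast s_j over the batch
        logits_per_env.append(rationale_head(h_c, s_j))
    return torch.stack(logits_per_env)             # [num_envs, batch_size, num_classes]
```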

In this case, \hat{y} is the model prediction based on both S and C, and Eq. 9 computes the risk in each environment (each row in Figure 2). Our principle is that, whatever S is, the risk should be both small and invariant, so informally R_{DIR} = E[Eq. 9] + Var(Eq. 9).
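
Continuing the sketch above (the `lam` weight stands in for the λ in the paper's Eq. 10, and `logits_per_env` is the output of the hypothetical `intervened_logits` helper):

```python
import torch
import torch.nn.functional as F

def dir_risk(logits_per_env, y, lam=1.0):
    # logits_per_env: [num_envs, batch_size, num_classes], y: [batch_size].
    # Eq. 9 (informally): the risk of the joint prediction under one intervention.
    risks = torch.stack([F.cross_entropy(logits, y) for logits in logits_per_env])
    # R_DIR (informally): the risks should be small on average and invariant
    # across environments, i.e. small mean and small variance.
    return risks.mean() + lam * risks.var()
```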

Thank you very much for your reply. After your explanation, my main question becomes why you say "Eq. 9 computes the risk in each environment (each row in Figure 2)". Is there any formal analysis or reference for Eq. 8?

Hi, do you mean Eq. 8 or Eq. 9? For Eq. 8, we provide the reference in the Appendix. For Eq. 9, it's just iterating over the different interventions and taking the expectation.

Thank you very much. Now I think I somewhat get it!