XiaLiPKU / EMANet

The code for Expectation-Maximization Attention Networks for Semantic Segmentation (ICCV'2019 Oral)

Home Page: https://xialipku.github.io/publication/expectation-maximization-attention-networks-for-semantic-segmentation/

How do you implement equation (15) in your paper?

Dorispaopao opened this issue · comments

And how do you handle gradient backpropagation in your implementation?

For the first question:
I implemented it in 'train.py':

EMANet/train.py, line 134 at commit 9a492d8:

```python
self.net.module.ema.mu *= momentum
```
Implementing it in the EMAU module might look cleaner, but since \mu has to be averaged over the whole batch, implementing it in the module would require a 'reduce' operation as in SyncBN. So I just wrote the line in 'train.py', where the \mu from all GPUs has already been gathered.
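For context, equation (15) is the moving-average update of the bases, \bar{\mu} \leftarrow \alpha \bar{\mu} + (1 - \alpha)\mu_T, with \mu_T averaged over the batch. A minimal sketch of that update in the training loop, assuming `mu_batch` holds the per-image bases \mu_T of shape (B, C, K) returned by the EMAU forward pass; apart from `net.module.ema.mu`, the names here are illustrative, not the repository's exact code:

```python
import torch

momentum = 0.9  # the alpha in Eq. (15)

with torch.no_grad():
    # Average the per-image bases over the batch; on multi-GPU training
    # they must already have been gathered into `mu_batch` at this point.
    mu_avg = mu_batch.mean(dim=0, keepdim=True)
    # Keep the bases l2-normalized, matching the EMAU module.
    mu_avg = mu_avg / (1e-6 + mu_avg.norm(dim=1, keepdim=True))
    # Moving-average update of the stored bases.
    net.module.ema.mu *= momentum
    net.module.ema.mu += mu_avg * (1 - momentum)
```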

> And how do you handle gradient backpropagation in your implementation?

For the second question:

I simply cut off the gradients for the A_E and A_M iterations by wrapping them in `with torch.no_grad():`.
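A minimal sketch of that pattern, assuming x is the flattened feature map of shape (b, c, n) and mu the bases of shape (b, c, k); the function name and shapes are illustrative:

```python
import torch
import torch.nn.functional as F

def ema_forward(x, mu, stage_num=3):
    # x:  (b, c, n) flattened feature map
    # mu: (b, c, k) bases
    with torch.no_grad():  # the A_E / A_M iterations are cut off from backprop
        for _ in range(stage_num):
            x_t = x.permute(0, 2, 1)                          # (b, n, c)
            z = torch.bmm(x_t, mu)                            # (b, n, k)
            z = F.softmax(z, dim=2)                           # A_E: responsibilities
            z_ = z / (1e-6 + z.sum(dim=1, keepdim=True))
            mu = torch.bmm(x, z_)                             # A_M: re-estimate bases
            mu = mu / (1e-6 + mu.norm(dim=1, keepdim=True))   # l2-normalize
    # Reconstruct features from the final bases; mu and z act as constants
    # here, so the iterations themselves are never backpropagated.
    z_t = z.permute(0, 2, 1)                                  # (b, k, n)
    x_rec = mu.matmul(z_t)                                    # (b, c, n)
    return x_rec, mu
```

Since mu and z are produced under no_grad, this path carries no gradient back to x; in the full EMAU, learning still proceeds through the surrounding 1x1 convolutions and the residual connection.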
To be honest, what happens inside the EMA iterations has not been deeply explored yet. EMANet is just a naive first exploration of the EM + attention mechanism, so I look forward to deeper analysis by interested followers.