chr5tphr / zennit

Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.


Binary classification model with a single neuron in the output layer

wt12318 opened this issue · comments


Hi,

I want to use zennit to calculate LRP relevance for a binary classification model that has only one neuron in the output layer. How should I set the second parameter of the attributor?

import torch

from zennit.attribution import Gradient
from zennit.composites import EpsilonPlusFlat


class test_model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lin1 = torch.nn.Linear(3 * 512 + 512, 1024)
        self.lin2 = torch.nn.Linear(1024, 512)
        self.lin3 = torch.nn.Linear(512, 1)

    def forward(self, x):
        out = torch.relu(self.lin1(x))
        out = torch.relu(self.lin2(out))
        out = self.lin3(out)
        return out


model = test_model()
# example input with the shape the model expects
test_input = torch.randn(1, 3 * 512 + 512)
test_input.requires_grad = True

composite = EpsilonPlusFlat()
attributor = Gradient(model, composite)
with attributor:
    output, relevance = attributor(test_input, torch.tensor([1]))

Hey Tao,

in your case, it should be safe to simply use the sign of the class you would like to attribute as the assigned output relevance, i.e. if you would like to attribute the negative class:

with attributor:
    output, relevance = attributor(test_input, torch.tensor([-1.]))

Since the assigned output relevance only scales the result, flipping its sign should simply flip the sign of your relevance, unless you use e.g. ZBox in the input layer (I think the same applies to Flat).

Let me know in case this gives unexpected results; we have a few people working on regression models who should be encountering a similar problem and might know a little more in that case.
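As a quick sanity check without zennit itself: the backward pass is linear in the output relevance, so flipping its sign flips the sign of every input attribution. The sketch below demonstrates this with plain gradient-times-input on a small stand-in model (the architecture and shapes are illustrative, not taken from the issue):

```python
import torch

torch.manual_seed(0)

# small stand-in for a binary classifier with a single output neuron
model = torch.nn.Sequential(
    torch.nn.Linear(8, 4),
    torch.nn.ReLU(),
    torch.nn.Linear(4, 1),
)

x = torch.randn(1, 8, requires_grad=True)


def attribute(out_relevance):
    """Gradient-times-input for an assigned output relevance tensor."""
    if x.grad is not None:
        x.grad = None
    out = model(x)
    # out_relevance plays the role of the attributor's second argument
    out.backward(gradient=out_relevance)
    return (x.grad * x).detach()


pos = attribute(torch.tensor([[1.0]]))
neg = attribute(torch.tensor([[-1.0]]))

# flipping the sign of the output relevance flips the attribution sign
assert torch.allclose(pos, -neg)
```

Note that with epsilon-style rules and a deep-Taylor-style rule like ZBox or Flat in the first layer, the final attribution is no longer a plain gradient, which is why the sign behaviour mentioned above may differ there.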