chr5tphr / zennit

Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.


Binary classification model with a single neuron in the output layer

wt12318 opened this issue · comments


Hi,

I want to use zennit to calculate LRP relevance for a binary classification model that has only one neuron in the output layer. How should I set the second parameter of the attributor?

import torch

from zennit.attribution import Gradient
from zennit.composites import EpsilonPlusFlat


class test_model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lin1 = torch.nn.Linear(3 * 512 + 512, 1024)
        self.lin2 = torch.nn.Linear(1024, 512)
        self.lin3 = torch.nn.Linear(512, 1)

    def forward(self, x):
        out = torch.relu(self.lin1(x))
        out = torch.relu(self.lin2(out))
        out = self.lin3(out)
        return out


model = test_model()
# example input with the shape the model expects
test_input = torch.randn(1, 3 * 512 + 512)
test_input.requires_grad = True

composite = EpsilonPlusFlat()
attributor = Gradient(model, composite)
with attributor:
    output, relevance = attributor(test_input, torch.tensor([1]))

Hey Tao,

in your case, it should be safe to simply use the sign of the class you would like to attribute as the assigned output relevance, i.e. if you would like to attribute the negative class:

with attributor:
    output, relevance = attributor(test_input, torch.tensor([-1.]))

Since the assigned output relevance only scales the result, flipping its sign should simply flip the sign of your relevance, unless you use e.g. ZBox in the input layer (I think the same applies to Flat).

Let me know in case this gives unexpected results; we have a few people working on regression models who should be encountering a similar problem and might know a little more in that case.
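As a quick sanity check without zennit itself: the backward pass is linear in the output relevance, so flipping its sign flips the sign of every input attribution. The sketch below demonstrates this with plain gradient-times-input on a small stand-in model (the architecture and shapes are illustrative, not taken from the issue):

```python
import torch

torch.manual_seed(0)

# small stand-in for a binary classifier with a single output neuron
model = torch.nn.Sequential(
    torch.nn.Linear(8, 4),
    torch.nn.ReLU(),
    torch.nn.Linear(4, 1),
)

x = torch.randn(1, 8, requires_grad=True)


def attribute(out_relevance):
    """Gradient-times-input for an assigned output relevance tensor."""
    if x.grad is not None:
        x.grad = None
    out = model(x)
    # out_relevance plays the role of the attributor's second argument
    out.backward(gradient=out_relevance)
    return (x.grad * x).detach()


pos = attribute(torch.tensor([[1.0]]))
neg = attribute(torch.tensor([[-1.0]]))

# flipping the sign of the output relevance flips the attribution sign
assert torch.allclose(pos, -neg)
```

Note that with epsilon-style rules and a deep-Taylor-style rule like ZBox or Flat in the first layer, the final attribution is no longer a plain gradient, which is why the sign behaviour mentioned above may differ there.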