Inconsistent inference results between PyTorch and the converted TensorRT model when using the GumbelSoftmax operator

Question

Thrsu opened this issue 8 months ago

Description:

I'm seeing a discrepancy between the inference results of a PyTorch model and those of the TensorRT model produced by converting it with the torch2trt tool.

Reproduce

This issue can be reproduced by the following script:

import torch
from torch.nn import Module
from torch2trt import torch2trt

# Input logits and the softmax temperature (tau).
para_0 = torch.randn([5, 5], dtype=torch.float32).cuda()
para_1 = 2.0


class gumbel_softmax(Module):
    def forward(self, *args):
        return torch.nn.functional.gumbel_softmax(args[0], tau=para_1)


model = gumbel_softmax().float().eval().cuda()
model_trt = torch2trt(model, [para_0])

# Compare the PyTorch and TensorRT outputs on the same input.
output = model(para_0)
trt_output = model_trt(para_0)
print(torch.max(torch.abs(output - trt_output)))

The output is:

tensor(0.7922, device='cuda:0')
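
One thing that may be worth ruling out before attributing the whole gap to the conversion: gumbel_softmax adds freshly sampled Gumbel noise to the logits on every call, so two evaluations of the same input are generally not expected to match. The sketch below is a plain-Python illustration of that sampling step, not PyTorch's or torch2trt's implementation; the names gumbel_softmax_sketch and max_abs_diff and the example logits are hypothetical, chosen only for this illustration.

```python
import math
import random


def gumbel_softmax_sketch(logits, tau, rng):
    """Sample softmax((logits + g) / tau) with g ~ Gumbel(0, 1).

    Illustrative only: a hypothetical helper, not the PyTorch operator.
    """
    noisy = []
    for x in logits:
        u = rng.random()
        while u == 0.0:  # avoid log(0)
            u = rng.random()
        # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)).
        g = -math.log(-math.log(u))
        noisy.append((x + g) / tau)
    # Numerically stable softmax over the perturbed logits.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two lists."""
    return max(abs(x - y) for x, y in zip(a, b))


logits = [0.5, -1.0, 2.0, 0.0, 1.5]  # made-up example input
tau = 2.0

rng = random.Random(0)
first = gumbel_softmax_sketch(logits, tau, rng)
second = gumbel_softmax_sketch(logits, tau, rng)  # same input, fresh noise

# Re-seeding the generator reproduces the first sample exactly.
repeat = gumbel_softmax_sketch(logits, tau, random.Random(0))

print("two draws, same input:", max_abs_diff(first, second))
print("same seed:", max_abs_diff(first, repeat))
```

If the same effect applies here, calling model(para_0) twice in PyTorch alone would already show a nonzero difference, and a comparison against TensorRT would only be meaningful with the random noise controlled on both sides.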

Environment

torch: 2.1.1
torch2trt: 0.4.0
tensorrt: 8.6.1