onnx / models

A collection of pre-trained, state-of-the-art models in the ONNX format

Home Page: http://onnx.ai/models/

t5-encoder and t5-decoder-with-lm-head-12 models call onnx.Log with zero input values.

negiyas opened this issue

Bug Report

Which model does this pertain to?

The t5-encoder-12 and t5-decoder-with-lm-head-12 models from https://github.com/onnx/models/tree/main/text/machine_comprehension/t5/model .

Describe the bug

These models contain onnx.Log operations whose inputs are not checked for zero values.
According to our observations, the onnx.Log ops in these models receive inputs that include zero in almost all cases.
Since Log(0) is undefined, the inputs of the onnx.Log ops should be guarded in advance.
Note: the models happen to return correct values on systems whose math libraries return -inf for Log(0), but other systems/libraries return NaN or raise runtime errors instead, so the models should not depend on that behavior.
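
For reference, the divergence is easy to see with NumPy as a stand-in for a backend math library. The snippet below only illustrates why a graph should not rely on Log(0); the epsilon clamp is one possible guard, not the models' actual logic.

    import numpy as np

    x = np.array([0.0, 1.0, 2.0], dtype=np.float32)

    # NumPy returns -inf for log(0) (with a divide-by-zero warning);
    # other backends may return NaN or raise a runtime error instead.
    with np.errstate(divide="ignore"):
        print(np.log(x))                      # [-inf  0.  0.6931472]

    # Guarding the input keeps the result well defined on any backend.
    eps = np.finfo(np.float32).tiny
    print(np.log(np.maximum(x, eps)))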

Reproduction instructions

This issue can be confirmed by inspecting the models in a model visualizer such as Netron (https://netron.app/).
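
Alternatively, the Log nodes can be listed programmatically. This is a sketch assuming the onnx Python package is installed and the model file (here named t5-encoder-12.onnx) has been downloaded locally:

    import onnx

    model = onnx.load("t5-encoder-12.onnx")
    for node in model.graph.node:
        if node.op_type == "Log":
            print(node.name, "inputs:", list(node.input))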

Notes

It appears that the PyTorch code for these models does not guard the inputs of torch.log, and the ONNX models are exported from that code.

https://github.com/huggingface/transformers/blob/main/src/transformers/models/t5/modeling_t5.py

    if bidirectional:
        num_buckets //= 2
        relative_buckets += (relative_position > 0).to(torch.long) * num_buckets
        relative_position = torch.abs(relative_position)
    else:
        relative_position = -torch.min(relative_position, torch.zeros_like(relative_position))
    # now relative_position is in the range [0, inf)
    …
    # The other half of the buckets are for logarithmically bigger bins in positions up to max_distance
    relative_position_if_large = max_exact + (
        torch.log(relative_position.float() / max_exact)
        / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    ).to(torch.long)
    relative_position_if_large = torch.min(
        relative_position_if_large, torch.full_like(relative_position_if_large, num_buckets - 1)
    )
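
One hypothetical way to avoid the problem at export time would be to clamp the argument of torch.log before it is evaluated. The sketch below assumes default values for num_buckets and max_distance and a helper name chosen for illustration; it is not the upstream implementation:

    import math
    import torch

    def large_position_bucket(relative_position, num_buckets=32, max_distance=128):
        max_exact = num_buckets // 2
        # Clamp so the exported graph never evaluates Log(0) when
        # relative_position contains zeros.
        safe_position = relative_position.float().clamp(min=1.0)
        bucket = max_exact + (
            torch.log(safe_position / max_exact)
            / math.log(max_distance / max_exact)
            * (num_buckets - max_exact)
        ).to(torch.long)
        return torch.min(bucket, torch.full_like(bucket, num_buckets - 1))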