intel / intel-npu-acceleration-library

Intel® NPU Acceleration Library

cannot compile `nn.Sequential` into float32

wcshds opened this issue · comments

When I wrap an `nn.Linear` in an `nn.Sequential`, it fails to compile into a float32 model.

```python
import intel_npu_acceleration_library
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(128, 512)
)
print(model)
# Compiling the wrapped model for the NPU with dtype=torch.float32 fails
model = intel_npu_acceleration_library.compile(model, dtype=torch.float32)

input = torch.randn((4, 128))
model(input)
```

(screenshot: error raised when compiling the model)

Hi, float32 is not a supported dtype for now. However, from a performance point of view, I suggest you use float16 or a quantized datatype.

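For reference, a minimal sketch of the suggested float16 path, assuming the same `compile(model, dtype=...)` call used in the reproduction above (swap in `torch.int8` for a quantized model):

```python
import intel_npu_acceleration_library
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(128, 512)
)

# Compile for the NPU in float16 instead of float32;
# torch.int8 would select a quantized model instead.
model = intel_npu_acceleration_library.compile(model, dtype=torch.float16)

x = torch.randn((4, 128))
with torch.no_grad():
    out = model(x)  # expected shape: torch.Size([4, 512])
```
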
Thank you. But I noticed that float32 is used in the `train_mnist.py` example. Is this a typo?

The float32 dtype is mostly used for the training API, which is still quite experimental. For pure inference you should go with float16 or a lower dtype to fully utilize NPU acceleration.

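As a rough illustration of the distinction, a sketch of the two modes; the `training=True` keyword is an assumption drawn from the library's experimental training example (`train_mnist.py`) and may differ between versions:

```python
import intel_npu_acceleration_library
import torch
from torch import nn


def make_model() -> nn.Sequential:
    return nn.Sequential(nn.Linear(128, 512))


# Experimental training path: float32 weights, as in the train_mnist.py example.
# NOTE: the training=True keyword is assumed from that example and may differ
# between library versions.
training_model = intel_npu_acceleration_library.compile(
    make_model(), dtype=torch.float32, training=True
)

# Inference path: float16 (or a quantized dtype) to fully exploit the NPU.
inference_model = intel_npu_acceleration_library.compile(
    make_model(), dtype=torch.float16
)
```
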
Thanks for your explanation.