Wrong and non-deterministic results with constant model
bedapisl opened this issue · comments
Hello,
I am getting wrong outputs when running a simple constant ONNX model. The model should always return 0.5, but the first few outputs seem to be corrupted: they are not 0.5, and they are not even deterministic. Code for prediction:
use ndarray::{ArrayBase, CowRepr, Dim, IxDynImpl, Array};
use ort::{Environment, ExecutionProvider, SessionBuilder, Value};

fn main() {
    let environment = Environment::builder()
        .with_name("Detection")
        .with_execution_providers([ExecutionProvider::CPU(
            Default::default(),
        )])
        .build().expect("Failed to create ONNX Runtime environment")
        .into_arc();

    let session = SessionBuilder::new(&environment)
        .expect("Failed to create a session")
        .with_model_from_file("constant_model.onnx")
        .expect("Failed to load ONNX model");

    let input_data_prepared: ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>> =
        Array::zeros((1, 3, 192, 256)).into_dyn().into();

    let input_tensor = Value::from_array(session.allocator(), &input_data_prepared)
        .expect("Failed to convert ndarray to Tensor");

    let model_output = session
        .run(vec![input_tensor])
        .expect("ONNX inference failed")[0]
        .try_extract::<f32>()
        .expect("ONNX result extraction failed");

    for x in 0..6 {
        for y in 0..3 {
            println!("{:?}", model_output.view()[[0 as usize, x as usize, y as usize]]);
        }
    }
    println!("{:?}", model_output.view().shape());
}
Code for generating constant_model.onnx:
import torch
import numpy as np

class ConstantModel(torch.nn.Module):
    def __init__(self):
        super(ConstantModel, self).__init__()

    def forward(self, x):
        data = np.zeros((1, 3, 6), dtype=np.float32)
        data += 0.5
        data = np.transpose(data, (0, 2, 1))
        return torch.tensor(data)

model = ConstantModel()
input_data = torch.rand((1, 3, 192, 256), dtype=torch.float32)
exported = torch.onnx.dynamo_export(model, input_data)
exported.save("constant_model.onnx")
Example output I am getting (the first few numbers are different in each run):
20032080000000.0
7e-45
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
[1, 6, 3]
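To make the corruption easier to spot than by reading the printed list, here is a minimal sketch of a check that reports every element that is not exactly the expected constant, together with its raw bit pattern. `find_mismatches` is a hypothetical helper written for this report, not part of `ort`, and it assumes the output has already been copied into a flat `f32` slice. The sample values are the ones from the output above.

```rust
/// Hypothetical helper (not part of `ort`): returns (index, value, raw bits)
/// for every element whose bit pattern differs from that of `expected`.
fn find_mismatches(values: &[f32], expected: f32) -> Vec<(usize, f32, u32)> {
    values
        .iter()
        .enumerate()
        .filter(|(_, v)| v.to_bits() != expected.to_bits())
        .map(|(i, v)| (i, *v, v.to_bits()))
        .collect()
}

fn main() {
    // 18 values as in the example output: the first two are corrupted,
    // the remaining ones are the expected 0.5.
    let mut observed = vec![0.5_f32; 18];
    observed[0] = 20032080000000.0;
    observed[1] = 7e-45;

    for (i, v, bits) in find_mismatches(&observed, 0.5) {
        println!("index {i}: value {v:?}, bits {bits:#010x}");
    }
}
```

Comparing bit patterns instead of using `==` means that a NaN in the output would also be reported as a mismatch.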
I was able to reproduce it on two machines, both running Debian 12; one has rustc 1.73.0, the other rustc 1.71.1.
I am attaching an archive with all the files needed for reproduction:
I found out that removing the execution provider definition, specifically this code:
.with_execution_providers([ExecutionProvider::CPU(
    Default::default(),
)])
solves the problem.
I would still keep this issue open, because I think the code with the execution provider is valid and should work.