burn-import initializer tensor not added to scope
mosure opened this issue · comments
Describe the bug
nodes with input tensors of category initializer
fail to import due to initializer tensor names not being in scope
To Reproduce
import a minimal onnx graph /w initializer tensor, e.g.
class Model(nn.Module):
def __init__(self):
super(Model, self).__init__()
def forward(self, x):
x = F.interpolate(x, size=(2, 2), mode='bilinear', align_corners=True)
return x
Expected behavior
burn-import generated model contains tensor initializers, e.g. _Concat_output_0
is defined in the scope
Actual behavior
burn import fails with _Concat_output_0
is not in scope, panic at this line:
burn/crates/burn-import/src/burn/scope.rs
Line 53 in 71bd5ef
Desktop (please complete the following information):
- OS: [Windows]
Additional context
this occurs in resize
, gather
, mul_scalar
, and sub_scalar
node imports as well (with initializer tensors)
@skewballfox, do you think this is related to #1857?
was this encountered with the version of burn last published in crates.io or on main?
also, I'm a bit confused. is /Concat_output_0
both a node output and an initializer?
this is encountered on main.
here is a more minimal reproducible example which clarifies the /Concat_output_0
initializer/output ambiguity:
python
#!/usr/bin/env python3
# used to generate model: onnx-tests/tests/resize/resize.onnx
import onnx
from onnx import helper, TensorProto, numpy_helper
import numpy as np
def main() -> None:
input_tensor = helper.make_tensor_value_info("input_tensor", TensorProto.FLOAT, [1, 1, 4, 4])
# Define the sizes tensor as an initializer
sizes = np.array([1, 1, 2, 2], dtype=np.int64)
sizes_initializer = numpy_helper.from_array(sizes, name="sizes")
resize_node = helper.make_node(
"Resize",
name="resize_node",
inputs=["input_tensor", "", "", "sizes"],
outputs=["output"],
mode="linear",
)
graph_def = helper.make_graph(
nodes=[resize_node],
name="ResizeGraph",
inputs=[input_tensor],
outputs=[
helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, 1, 2, 2])
],
initializer=[sizes_initializer],
)
model_def = helper.make_model(graph_def, producer_name="resize")
onnx.save(model_def, "resize.onnx")
if __name__ == "__main__":
main()
the same scope import error happens with constant input tensors too, e.g.
python
#!/usr/bin/env python3
# used to generate model: onnx-tests/tests/resize/resize.onnx
import onnx
from onnx import helper, TensorProto, numpy_helper
import numpy as np
def main() -> None:
input_tensor = helper.make_tensor_value_info("input_tensor", TensorProto.FLOAT, [1, 1, 4, 4])
# Define the sizes tensor as a constant node
sizes = np.array([1, 1, 2, 2], dtype=np.int64)
sizes_const_node = helper.make_node(
"Constant",
name="sizes_const",
inputs=[],
outputs=["sizes"],
value=numpy_helper.from_array(sizes)
)
resize_node = helper.make_node(
"Resize",
name="resize_node",
inputs=["input_tensor", "", "", "sizes"],
outputs=["output"],
mode="linear",
)
graph_def = helper.make_graph(
nodes=[sizes_const_node, resize_node],
name="ResizeGraph",
inputs=[input_tensor],
outputs=[
helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, 1, 2, 2])
],
initializer=[],
)
model_def = helper.make_model(graph_def, producer_name="resize")
onnx.save(model_def, "resize.onnx")
if __name__ == "__main__":
main()
@antimora I think this might not be related to lifting intializers or constants
here's some information from the debug print of resize.onnx
prior to any editing:
model_proto: ModelProto {
...
NodeProto {
input: [
"input_tensor",
"",
"",
"sizes",
],
output: [
"output",
],
name: "resize_node",
...
sizes is in the list of initializers.
the two fields with ""
are roi and scales. the two solutions that come to mind is create a dummy entry for empty strings, or rework the api for the functions that might have empty strings as inputs to create a default or None input arg.
a minimal gather
import with Initializer
indices also has a scope panic. gather
imports correctly with Constant
indices.
I don't think gather
has the same empty input names as resize
: https://onnx.ai/onnx/operators/onnx__Gather.html
broken initializer gather python
#!/usr/bin/env python3
# used to generate model: onnx-tests/tests/gather/gather.onnx
import onnx
from onnx import helper, TensorProto, numpy_helper
import numpy as np
def main() -> None:
input_tensor = helper.make_tensor_value_info("input_tensor", TensorProto.FLOAT, [1, 1, 4, 4])
# Define the indices tensor as an initializer
indices = np.array([0, 2], dtype=np.int64)
indices_initializer = helper.make_tensor(
name="indices",
data_type=TensorProto.INT64,
dims=indices.shape,
vals=indices,
)
gather_node = helper.make_node(
"Gather",
name="gather_node",
inputs=["input_tensor", "indices"],
outputs=["output"],
axis=2, # Assuming we want to gather along the height dimension (axis 2)
)
graph_def = helper.make_graph(
nodes=[gather_node],
name="GatherGraph",
inputs=[input_tensor],
outputs=[
helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, 1, 2, 4])
],
initializer=[indices_initializer],
)
model_def = helper.make_model(graph_def, producer_name="gather")
onnx.save(model_def, "gather.onnx")
if __name__ == "__main__":
main()
working constant gather python
#!/usr/bin/env python3
# used to generate model: onnx-tests/tests/gather/gather.onnx
import onnx
from onnx import helper, TensorProto, numpy_helper
import numpy as np
def main() -> None:
input_tensor = helper.make_tensor_value_info("input_tensor", TensorProto.FLOAT, [1, 1, 4, 4])
# Define the indices tensor as a constant node
indices = np.array([0, 2], dtype=np.int64)
indices_const_tensor = helper.make_tensor(
name="indices_const_tensor",
data_type=TensorProto.INT64,
dims=indices.shape,
vals=indices,
)
indices_const_node = helper.make_node(
"Constant",
name="indices_const_node",
inputs=[],
outputs=["indices"],
value=indices_const_tensor,
)
gather_node = helper.make_node(
"Gather",
name="gather_node",
inputs=["input_tensor", "indices"],
outputs=["output"],
axis=2, # Assuming we want to gather along the height dimension (axis 2)
)
graph_def = helper.make_graph(
nodes=[indices_const_node, gather_node],
name="GatherGraph",
inputs=[input_tensor],
outputs=[
helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, 1, 2, 4])
],
initializer=[],
)
model_def = helper.make_model(graph_def, producer_name="gather")
onnx.save(model_def, "gather.onnx")
if __name__ == "__main__":
main()
I'd be curious to know more about the solution for dummy entries of empty input names, what mechanism of burn_import is causing empty names to be ignored?
I'd be curious to know more about the solution for dummy entries of empty input names, what mechanism of burn_import is causing empty names to be ignored?
When we initialize nodes from the proto data all input arguments (which are simply strings, the original names of the arguments) pass through this function. All that get stored in the inputs is the original graph inputs and the node outputs after it has been processed.
I was sort of wrong in my initial assumption: I assumed using an empty string for a HashMap
key wouldn't work, but I just tried it on the rust playground and its allowed. I think we still need to check for empty string names or some placeholder to indicate to use a default value though.
Honestly I'm kind of stumped. I'm going to recreate the gather.onnx
locally tomorrow.
I thought the issue with the initializers might be that the original rename input function renamed arguments to empty string, so maybe there was some "if empty inline tensor" check happening, though when I tried that with size on resize.onnx
, I ran into a different error: Tensor of Kind Int with dim shape None was passed with empty name
.
whatever's happening only with initializers, but not other generated inputs, and not when those initializers are graph inputs.
the same scope import error happens with constant input tensors too, e.g.
...
#!/usr/bin/env python3 # used to generate model: onnx-tests/tests/resize/resize.onnx import onnx from onnx import helper, TensorProto, numpy_helper import numpy as np def main() -> None: input_tensor = helper.make_tensor_value_info("input_tensor", TensorProto.FLOAT, [1, 1, 4, 4]) # Define the sizes tensor as a constant node sizes = np.array([1, 1, 2, 2], dtype=np.int64) sizes_const_node = helper.make_node( "Constant", name="sizes_const", inputs=[], outputs=["sizes"], value=numpy_helper.from_array(sizes) ) ...
Looking at how that constant is defined, that will be lifted. So at the time scope is checked it will only exist as an input for resize.
Do you guys off the top of your head know burn-graph side what is supposed to happen to input args with no parents? are they inlined, stored as variables?
Do you guys off the top of your head know burn-graph side what is supposed to happen to input args with no parents? are they inlined, stored as variables?
@skewballfox, you must be referring to ONNX Constant. Netron does not show individual Constant node if it's referenced by a single Node. ONNX import supports constants (see code). The constant value is stored as an attribute of the model module struct. It's a Param without gradients.
Do you guys off the top of your head know burn-graph side what is supposed to happen to input args with no parents? are they inlined, stored as variables?
@skewballfox, you must be referring to ONNX Constant. Netron does not show individual Constant node if it's referenced by a single Node. ONNX import supports constants (see code). The constant value is stored as an attribute of the model module struct. It's a Param without gradients.
This wouldn't be tied to a constant node because that constant is lifted, and the parent node is deleted.
I was more wondering about the input arguments to nodes that aren't graph inputs or node outputs, so lifted constants (where the original node is deleted like above), initializers, generated arguments.
The reason I ask is because this seems to be triggering when the argument only exist as an input, which is weird because scope is about determining ownership, but these args only exist in one place
adding this line to consume
seems to fix the initializer import:
self.inputs.extend(self.initializers.values().cloned());
burn/crates/burn-import/src/onnx/from_onnx.rs
Line 163 in 71bd5ef
OnnxGraph::inputs
doesn't capture the initializers.
this needs to contain the initializer tensor names when passed to register_input_output
:
burn/crates/burn-import/src/onnx/to_burn.rs
Line 328 in 71bd5ef
edit: nvm, this results in the initializer being a parameter to forward
:
pub fn forward(&self, input1: Tensor<B, 4>, indices: Tensor<B, 1, Int>) -> Tensor<B, 4>
the error seems near this pathway, e.g. node input initializers are not being managed properly by BurnGraph
:
burn/crates/burn-import/src/burn/graph.rs
Line 40 in 71bd5ef
if I convert the initializer to a constant and disable constant lifting, I run into an error where an Int
constant tensor is added to the imported model, however, all of the Initializer::init*
methods only support float tensors:
let constant7: burn::module::Param<Tensor<B, 1, Int>> = burn::nn::Initializer::Zeros
.init([1], device)
.set_require_grad(false);
mismatched types
expected struct `Param<Tensor<B, burn::tensor::Int, _>>`
found struct `Param<Tensor<_, burn::tensor::Float, _>>`
if I convert the initializer to a constant and disable constant lifting, I run into an error where an
Int
constant tensor is added to the imported model, however, all of theInitializer::init*
methods only support float tensors:let constant7: burn::module::Param<Tensor<B, 1, Int>> = burn::nn::Initializer::Zeros .init([1], device) .set_require_grad(false);mismatched types expected struct `Param<Tensor<B, burn::tensor::Int, _>>` found struct `Param<Tensor<_, burn::tensor::Float, _>>`
CC @nathanielsimard and @laggui . Should we make the initializer more generic for other tensors?
here is some work towards generic initializers, it would need more careful handling:
https://github.com/mosure/burn/blob/8122e8b674fcad727c439d8c235b59fc7d7062e1/crates/burn-core/src/nn/initializer.rs#L79
so I've been looking at output for cargo -r -- graph_path ./out
for both the broken graphs and the graphs that I do know work.
-
Something interesting about the HashMap
variables
being checked is that it's only being added to duringregister_use
-
the way
register_future_use
is being called only on inputs, afterregister_use
is called on graph inputs and node outputs. -
These functions are being filtered to only look at the inputs that are tensors, probably because rust allows implicit copying of basic numeric types.
-
This doesn't trigger on reshape, but that's because it looks like reshape doesn't store it's rhs as a tensor. it's just a vector:
Reshape(
ReshapeNode {
input: TensorType {
name: Ident(
unsqueeze1_out1,
),
dim: 5,
kind: Float,
shape: None,
},
output: TensorType {
name: Ident(
unsqueeze2_out1,
),
dim: 6,
kind: Float,
shape: None,
},
shape: [
1,
1,
3,
4,
5,
1,
],
},
)
I suspect that this is going to panic anytime we have a initializer or lifted constant that's typed as a tensor by burn graph. so far all the situations where I know constants are lifted in an op covered by test cases, the lifted values weren't tensors by this point. If you guys can think of a test case that has a lifted tensor for an op that takes a tensor argument, please let me know.