lutzroeder / netron

Visualizer for neural network, deep learning and machine learning models

Home Page: https://netron.app

Extension to add tensor/operator metadata/annotation from external text file

mciprian13 opened this issue

@lutzroeder Thanks a lot for creating this tool. It is very useful for day-to-day work and I am sure a lot of people find it useful as well. Kudos!
There are two particular features I have found myself needing multiple times:

  1. Display in the graph visualizer custom annotations/metadata for tensors/operators that are not part of the model file/descriptor but come from other sources such as tools or runtimes. Some examples:
  • Performance statistics: Display for each operator, in addition to its attributes, performance statistics such as latency, memory consumption, energy consumption, accuracy, etc., provided by model execution runtimes or other analysis tools.
  • Accuracy statistics: Display for each operator the error of a custom implementation of that operator relative to a reference.
  2. Modify the styling of tensors/operators (such as color) to visually encode the above information. For example:
  • Color each operator with a heatmap gradient from blue to red to encode information such as high error (red) vs. low error (blue), or high computational efficiency (blue) vs. low computational efficiency (red), etc.
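
As a minimal sketch of the heatmap idea (assuming a metric already normalized to [0, 1]; the function is illustrative, not an existing Netron API):

```python
def metric_to_color(value: float) -> str:
    """Map a normalized metric in [0, 1] to a blue-to-red hex color.

    0.0 -> blue (#0000FF, e.g. low error), 1.0 -> red (#FF0000, e.g. high error),
    linearly interpolated in between.
    """
    value = min(max(value, 0.0), 1.0)  # clamp to [0, 1]
    red = round(255 * value)
    blue = round(255 * (1.0 - value))
    return f"#{red:02X}00{blue:02X}"

print(metric_to_color(0.9))  # a high-error operator -> "#E6001A" (red-ish)
```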

Since each model format has a way to uniquely identify tensors and operators, the above two features could be realized by importing into Netron a text file (separate from the model file) that captures the metadata we want. It could look something like this:

tensor_id: <tensor name or index>
tensor_meta: <text string that will be added into the tensor description in the visualizer>
tensor_style: <syntax to specify the styling of the tensors - the arrows between operators - such as thickness, color, etc>
....
operator_id: <operator name or index>
operator_meta: <text string that will be added into the operator text description in the visualizer>
operator_style: <syntax to specify the styling of the operators - the boxes - such as background color, font, etc>
....

It would also be nice if this file could be provided via the command line, so that Netron automatically opens a model together with its metadata file.

I understand different formats have different ways to identify tensors/operators. For example, I mainly use TensorFlow Lite in my daily work, where tensors/operators are uniquely identified by an index (called "Location"), but there is also ONNX, where tensors/operators are identified via strings/names (I don't know whether this guarantees uniqueness). But maybe this mechanism is not that hard to generalize.

In summary, the above features would allow different users to add custom information into the visualizer.
So far we have had to build our own visualizer (using Graphviz) just to get the two features above.
It would be very useful for the Netron community to have these features embedded into the tool.

LATER EDIT: I was thinking about how the metadata text could be displayed and I see two ways:

  • Add it in the operator box itself.
  • Show it on hover (a popup box that appears when moving the cursor over the operator). I like this way better, since it doesn't clutter the operator box itself.

@mciprian13 would users have to write custom tools to produce this data, then write custom code for conversion, which would limit how useful such a feature would be? Approaching this from the other end: which broadly adopted tools exist to produce this data, and how is this information stored? For example, what would a specific scenario look like that doesn't involve custom tools or new file formats?

@lutzroeder Thanks for the quick response. Indeed, there are no broadly adopted tools that I know of that export this information, but at least at my company (NXP) we had this need, so we forked the entire Netron tool, modified it, and created our own version called "Model Tool" that imports and displays this information:
[screenshots: Model Tool displaying imported per-operator metadata]
This tool only works together with the other parts of the NXP software stack and with NXP hardware targets.
The feature that I proposed would serve multiple purposes and would be an off-the-shelf enabler for other people trying to add extra info (annotations) to the graph from their own tools/environments.
My team and I cannot use "Model Tool" because it doesn't do exactly what we need, so we would need to fork it again.
I understand the reservation though. We can leave this proposal open for some time and see whether it attracts interest. You can (re)evaluate its importance later.

@mciprian13 can this be accomplished without inventing a new intermediate file format? For example, .exe uses .pdb for debug information. Debuggers automatically discover and load these files. Other executable formats have other debug formats. Asking users to convert debug information into an intermediate format is not a good user experience. .onnx has metadata_props, which is different from tflite.ModelMetadata in .tflite.

Is an intermediate format actually needed, or is this more about what information the tflite.Model object model should load and expose? For example, .onnx has NodeProto.metadata_props, which gets exposed as onnx.Node.metadata. Any number of metadata entries could be loaded or augmented for any number of formats. The question becomes what these tags mean and which UI features they relate to.
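
For illustration, model-level metadata_props can already be read and written with the onnx Python package; a sketch (the key names are made up):

```python
import onnx

model = onnx.load("model.onnx")

# Attach custom key/value metadata at the model level (keys are illustrative).
model.metadata_props.add(key="latency_ms", value="13.0")
model.metadata_props.add(key="memory_kb", value="12")
onnx.save(model, "model_annotated.onnx")

# Read the metadata back.
for prop in onnx.load("model_annotated.onnx").metadata_props:
    print(prop.key, "=", prop.value)
```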

Please create separate issues for specific features and API additions to discuss. Explanations or screenshots of these features would be helpful as well.

See also #204.

@lutzroeder Some points:

  • I wouldn't tie the metadata format to anything that is operating-system specific (Linux vs. Windows). It should be OS independent.
  • I wouldn't tie the metadata format to anything that is model-format specific (TFLite vs. ONNX). It should be model format independent.

This feature is intended to be as generic as possible. I really don't think the proposed format is that sophisticated: it could be XML or YAML containing only these six types of keys with associated values:

tensor_id: <tensor name or index>
tensor_meta: <text string that will be added into the tensor description in the visualizer>
tensor_style: <syntax to specify the styling of the tensors - the arrows between operators - such as thickness, color, etc>
operator_id: <operator name or index>
operator_meta: <text string that will be added into the operator text description in the visualizer>
operator_style: <syntax to specify the styling of the operators - the boxes - such as background color, font, etc>

It would also be important for this file to be text based (not a binary format) so that it is human readable, easy to understand, easy to edit, etc.
Generating this file from any other metadata format is trivial in my opinion, so writing a converter tool (e.g. in Python) to format the information as prescribed above is not a significant effort.
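
For instance, a sketch of such a converter, assuming a hypothetical benchmark JSON (the file names and input schema are made up):

```python
import json

# Hypothetical benchmark output:
# {"operators": [{"id": 14, "latency_ms": 13.0, "memory_kb": 12}, ...]}
with open("benchmark_results.json") as f:
    results = json.load(f)

# Emit the proposed key-value metadata format, one record per operator.
with open("model.meta.txt", "w") as out:
    for op in results["operators"]:
        out.write(f"operator_id: {op['id']}\n")
        out.write(f"operator_meta: \"Latency: {op['latency_ms']} ms;"
                  f"Memory consumption: {op['memory_kb']} kB;\"\n")
```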
This feature externalizes the responsibility of knowing everything about the layers outside of Netron.
It could be viewed as an extension (plugin) mechanism, if you will.
It's interesting that #204 could be solved by some external tool, with the result provided to Netron as the meta information I'm proposing.

@mciprian13 instead of re-iterating implementation details, what would be a specific scenario we are trying to enable for Netron users that does not require proprietary software?

For example, for "memory consumption" of operators there are multiple available tools and implementations. Which tools exist and what are the tradeoffs they make?

There might be multiple different ways to implement these features. How to enable all variants and gravitate towards the most user friendly options?

  1. For example, #204 mentions that in Netscope this computation is included without requiring additional tools, which is the most user friendly. Where in the code is the right place to include an "in box" implementation to compute memory usage for each format, or a fallback if possible (a minimal sketch follows this list)? Based on past experiments, engines often don't generalize. There might be different implementations for different formats, ideally computed on the in-memory raw model data, not on the nodes optimized for visualization.

  2. Model formats or providers offer a custom tool to compute and output such information. Which formats and examples exist? Users would likely prefer to load such information automatically if available, instead of writing or using custom conversion tools.

  3. ...
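
Regarding 1., a minimal sketch of such an in-box computation, assuming tensor shapes and element types are already known (no format-specific parsing shown):

```python
from math import prod

# Bytes per element for a few common element types (illustrative subset).
DTYPE_SIZE = {"float32": 4, "float16": 2, "int32": 4, "int8": 1}

def tensor_bytes(shape: list[int], dtype: str) -> int:
    """Estimate one tensor's memory footprint: product of dims x element size."""
    return prod(shape) * DTYPE_SIZE[dtype]

# Example: a [1, 112, 112, 32] float32 activation tensor.
print(tensor_bytes([1, 112, 112, 32], "float32"))  # 1605632 bytes (~1.5 MB)
```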

What would the UX for this scenario look like?

  • Each node shows an additional "memory consumption" property?
  • Should it also be possible based on a specific metric to highlight the color of the node? What would this look like?
  • How would the user opt into this view?
  • Is combined model memory consumption of interest as well?

@lutzroeder I don't know of a common format used across the industry to provide such metrics. Due to the technology fragmentation, each provider builds its own benchmarking environment with its own format for providing such metrics.
I would gladly propose a common/popular format if I knew one (it may be worth searching for).

Here are some specific examples we are interested in:

  1. Display per-operator error metrics and use color styling to highlight the error:
    The original TFLite model could look like this:
    [screenshot: the original TFLite model in Netron]
    Let's assume we want to provide the tool some extra metadata in a separate text file, such as:
operator_id: 14
operator_meta: "Absolute max error: 13.0;Absolute average error: 2.0;"
operator_style: "color: 0xFF0000;"

After the metadata is imported, Netron could show that operator like this (edited in Paint as an example):
[screenshot: mock-up of the operator colored red, with the extra metadata displayed]
In the picture above, the metadata is displayed in two ways:

  • As part of the operator description, along with the attributes, inputs, and outputs
  • As a separate box that appears when hovering with the cursor over the operator

I don't think both ways are necessary; you can choose the one that is most aesthetically pleasing.
Also, the operator is colored red (we could color just the top bar instead of the body of the box).
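
For reference, error values like the ones above could be produced by comparing a custom operator implementation against a reference one; a sketch using NumPy (the arrays are placeholders):

```python
import numpy as np

def error_metrics(custom_output, reference_output):
    """Absolute max and absolute average error of a custom op vs. a reference op."""
    diff = np.abs(custom_output - reference_output)
    return float(diff.max()), float(diff.mean())

# Placeholder outputs; in practice these come from running both implementations.
custom = np.array([2.0, 0.5, 3.5])
reference = np.array([1.0, 0.5, 3.0])
max_err, avg_err = error_metrics(custom, reference)
print(f'operator_meta: "Absolute max error: {max_err};Absolute average error: {avg_err};"')
```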

  2. Similarly, we could add performance metrics for each operator, such as latency or memory consumption:
operator_id: 14
operator_meta: "Latency: 13.0 ms;Memory consumption: 12kB;"
  3. Any runtime has a memory planner which assigns an address to every tensor. We could see this by adding per-tensor metadata:
tensor_id: 56
tensor_meta: "Memory address: 0x10000000;"

This time the metadata is for a tensor (a "CONNECTION" in Netron's language), but the look should be similar:
[screenshot: mock-up of a tensor/connection with the extra metadata displayed]

  4. Similarly, there could be logic to add model-level metadata:
graph_id: 0
graph_meta: "Some graph level stats;"

As for how this text metadata file gets loaded by Netron: there could be a dedicated "Load Metadata File" button used to specify the path of the metadata file. Upon pressing the button, the extra content is loaded and displayed.
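
To make the loading side concrete, a sketch of a parser for the proposed key-value syntax (assuming each *_id line starts a new record, as in the examples above):

```python
def load_metadata(path: str) -> dict:
    """Parse the proposed metadata file into {kind: {id: {field: value}}}.

    Example result: {"operator": {"14": {"meta": "...", "style": "..."}}, ...}
    """
    records = {"tensor": {}, "operator": {}, "graph": {}}
    current = None
    with open(path) as f:
        for line in f:
            line = line.strip()
            if ":" not in line:
                continue  # skip blank lines and separators
            key, value = line.split(":", 1)
            kind, _, field = key.partition("_")  # "operator_meta" -> ("operator", "meta")
            value = value.strip().strip('"')
            if field == "id":
                current = records[kind].setdefault(value, {})  # start a new record
            elif current is not None:
                current[field] = value  # attach meta/style to the current record
    return records
```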

The whole idea of this metadata is for Netron to NOT decide anything. All the intelligence of computing the metadata or deciding the style happens outside of Netron; Netron only displays the content. The feature is as generic as possible ("display this plain text for this operator"). I don't think we should add intelligence in Netron to do any computation, because any type of metric is very specific to environments, targets, etc. Therefore Netron's responsibility is only to display things, nothing more.

I know that each model format has its own way of conveying metadata (e.g. TFLite has an object called Metadata), but the problem is that each has a specific structure and way to serialize/deserialize. Going with plain text is much more flexible and generic, I think.

Indeed, your point is correct: we might prefer to use a popular format for this (which I don't know of), so I am open to any suggestions. I really don't care whether the file is XML, YAML, JSON, CSV, etc., as long as I can easily generate it with a script.

Due to the technology fragmentation, each provider builds its own benchmarking environment with its own format for providing such metrics.

Exactly. There are three different approaches:

  1. Users ask for this to work out of the box, see Netscope example.
  2. There are many custom options. Similarly there are many different model formats. Instead of asking users to write custom conversion tools, most formats can be directly opened in Netron.
  3. A proprietary vendor has tools which are not made available to most users but prefers a uniform solution to minimize development costs. As long as the tools are proprietary, there is no way for open source contributors to test any solutions in this area.

What does a solution look like that allows for all three options but starts by adding value for users in 1) and 2)?

Any runtime has a memory planner which assigns an address to every tensor. We could see this by adding per tensor metadata:

Is this runtime information that would usually be stored as metadata in the .tflite or .onnx file? Is there an example of how offline memory address optimization is used by a known runtime?


This information is NOT present in the model file (either TFLite or ONNX).
There are actually two types of solutions for running neural network models:

  • NN compilers (such as TVM, XLA, Glow) that parse the model and perform several preprocessing steps offline before running on the target:
    • Operator optimizations
    • Operator scheduling (decide order of execution)
    • Memory planning (assign memory addresses to tensors)
    • Tiling (split the computation between multiple cores, either homogeneous or heterogeneous)
    • Code generation (generate source code or directly a binary for the target, using for example an LLVM backend)
  • NN runtimes (TFLite, TFMicro, OnnxRuntime) that do all of the above at runtime, while running on the target

In both cases there is a lot of extra information that is derived for running the model, and all of it lives outside the model file.

The model file only serves as a mathematical definition of the model that needs to be executed.
All the other implementation/runtime details (latency, memory consumption, power consumption, addresses, schedule, tiling) belong to the technology that executes the model. For these tools it would be useful to annotate the original graph with externally provided data about decisions made by the tool, or with performance measurements taken after running the model.
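
To make this concrete, a compiler's memory planner could dump its per-tensor assignments in the proposed format with a short script; a sketch (the plan mapping is hypothetical):

```python
# Hypothetical memory planner output: tensor index -> assigned base address.
plan = {56: 0x10000000, 57: 0x10019000, 58: 0x10032000}

with open("model.meta.txt", "w") as out:
    for tensor_id, address in sorted(plan.items()):
        out.write(f"tensor_id: {tensor_id}\n")
        out.write(f'tensor_meta: "Memory address: {address:#010x};"\n')
```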

What does a solution look like that allows for all three options but starts by adding value for users in 1) and 2)?

I understand where you're coming from. Again, I don't know of examples of 1) or 2). My interest in this feature right now is for developer tools (not end-user tools). It's just that this is an effort that has been made multiple times internally, by us and by other companies, so it's a waste not to upstream something that others can use off-the-shelf (by others I again mean developers, not end-users). It could also be customized/updated/changed in the future based on extra feedback.

At this point I think this discussion is more about politics than technology. If you're not interested in this then it's fine :)
But in my opinion, Netron also stands to gain extra popularity through such features, from point 3).

NN compilers (such as TVM, XLA, Glow) that parse the model and perform several preprocessing steps offline before running on the target

Is there an example or file? Where do these compilers store a pre-processed memory address for runtime access?

Duplicate of #1240 and #1241

@lutzroeder Are you willing to receive a contribution here on GitHub for this feature, exactly as described in this thread? If not, we will have to make a fork and keep it internal.

Yes, pull requests are welcome to understand the scenario. This should ideally evolve into 3 separate efforts. One is the UX, which seems generally useful for many scenarios (if done right). Another is the API or infrastructure for how such data can be injected in multiple ways. For specific data formats, there should be real usage samples, evidence that this is useful to many users, a way to try the scenario end-to-end, and a plan for how these will be maintained.