microsoft / microsoft-ui-xaml

Windows UI Library: the latest Windows 10 native controls and Fluent styles for your applications

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

TreeView with DataTemplateSelector not reliably using selected DataTemplate

Shad0wlife opened this issue · comments

Describe the bug
I have a tree view. I want to fill it with 2 types of data - one that can have children and one that can not. The template is selected with an DataTemplateSelector which checks the node.Content type for the selection.
The data is given in 2 lists that get iterated one after another. The root nodes are given based upon the selection of a combobox - a new selection clears the treeview and repopulates it.
After an Item is added, the task is delayed for a few milliseconds to avoid lagging gui because the layouting can't keep up when adding a hundred elements to the treeview. When starting to add the second type, the ItemTemplateSelector selects the correct template (I check that with Debug logging), but the TreeView still uses the template of the first item type, causing a crash with anInvalidCastException when attempting to cast the node.Content.

This does not happen when I let the TreeView fully populate, but only when I cancel it by clearing the RootNodes and repopulate them with new data.

Steps to reproduce the bug

  • clone https://github.com/Shad0wlife/TreeViewCrashSampleApp
  • Build it
  • Run it
  • Select an element in the Combobox
  • While you see the elements loading, select another element in the combobox
    • this clears the list and should reload it with the new data
  • you should crash with an InvalidCastException now sometimes
    • this does not happen reliably sadly
    • if the list loaded without error once, you will not manage to have it error again in that session, you'll have to restart the program

Expected behavior
The selected template of the DataTemplateSelector is respected for every node added.

Screenshots
The Error with log showing that the correct template was given by the selector:
grafik

The visual representation with non-crashing templates:
It can be seen that the nodes get the corresponding HasUnrealizedChildren value, but the Template used is the wrong one.
grafik

Version Info
Package version: 2.4.0-prerelease.200203002
OS Version: 1809
Project Target: 1903 (required by package)
Project Minimum: 1809

NuGet package version:
Microsoft.UI.Xaml
Package version: 2.4.0-prerelease.200203002

Windows 10 version Saw the problem?
Insider Build (xxxxx)
November 2019 Update (18363)
May 2019 Update (18362)
October 2018 Update (17763) Yes
April 2018 Update (17134)
Fall Creators Update (16299)
Creators Update (15063)
Device form factor Saw the problem?
Desktop Yes
Mobile
Xbox
Surface Hub
IoT

Additional context
This might also be an issue with how I implemented things, so If you see the issue somewhere in my code please tell me!

If you want to see the effect yourself without crashing, you can enable the debugging templates in MainPage.xaml instead of the full Templates. These templates are visually distinct but won't crash due to casts.

EDIT: I kept in the Attributes for my Database code that the full application uses in case they impact the result due to possible extra load or so. That said: the linked repo is obviously a sample app created from a larger project cut down to a standalone program to show the issue. I just wanted to keep all code affecting any of the process from data to layout as intact as possible.

@Shad0wlife I really appreciate the standalone repro. From what you are saying, this sounds like a bit of timing is involved to get the repro. I'm having trouble repro'ing the crash or the issue with the data template selector not picking the correct template unfortunately.

This is what I see
image

Can you share a crash dump ?

Ok I've got some dumps with different manifestations of the error. Turns out it can also happen without switching to a 2nd List. And with the cast happening the other way round if the tree was already loading the expander templates.
I now have

  • 2 dumps where a Point item got the List template after switching between selections
  • 2 dumps where a Point item got the List template without even switching selections
  • 1 dump where a List item got the Point template after switching between selections

The heap dumps are rather large though (~240mb each), I'll have to upload them overnight and I'll have to find a place to upload them to (Ideas appreciated).

I found that quickly scrolling to the newly added items in the treeview while they are loading causes the issue the most consistent. As soon as one List (any one of them) once finished loading it won't happen anymore, even if I load another one, cancel that and then load the third one...
This is a really elusive error.

If I can manage to download VS2019 on a quicker PC with my limited bandwidth, I can also test on 1909 with faster hardware and maybe screencapture the procedure later.

EDIT: Additional Info
I did some more testing in Debug and Release environments:

  • The aggressive scrolling while contents are being added really seems to be involved here. The switch between Combobox selections seems to worsen the effect but is not neccessary.
  • When scrolling back up while Checkpoints are still loading, Checklists that seemingly had loaded with the correct template may now have the checkpoint template. This still only happens if scrolling is done while contents are added. If done later, everything stays as it is.
  • Once the issues manifested (with the non-crashing templates in use), when switching Selections, the issue will persist for the other newly populated nodes - in the exact same spot, no matter the selection:
    grafik
    grafik
    (the selections are List1 and List2, both have 2 lists with the point template in the same spot)

Sorry for the large pictures...

The behaviour of the issue carrying over after clearing the RootNodes and repopulating them again suggests that the Treeview caches the template per index or so somehow? And that breaks under load while items are added while many items have to be drawn by the layout thread?

I have some very bad news...
This is an issue that I could reproduce with a ListView as well - so it's not actually your TreeView's fault.
What now? I guess I'll really have to see if it still happens on 1909, maybe it was already fixed in ListView since 1809?

(Updated my Repository with the ListView stuff as well)

@Shad0wlife If this is an issue with ListView, the fix will have to wait for WinUI3 to fix it unfortunately since ListView is not part of this repo.

Regarding the crash dumps, you can upload to watson, file a feedback hub item and share the feedback link here.

I think I backtracked it to the Virtualization, effectively disabling it (by wrapping the ListView in a Scrollviewer) seems to completely block the issue from appearing.

I found this documentation for Virtualization:
https://docs.microsoft.com/en-us/windows/uwp/debug-test-perf/optimize-gridview-and-listview#container-recycling-with-heterogeneous-collections
applying to my case (mostly the last paragraph) as this is effectively my use case.

I'll investigate further whether this approach (even if very inefficient) can be used as a workaround for now.

EDIT:
Ok. So the issue caused by scrolling actually IS the Virtualization. After disabling that, I don't crash anymore by scrolling around during loading.
But: If I switch to another dataset while loading and the new set has few enough items to fit on screen at once, I crash (or show the wrong template) again. This will only happen if I switch to the short dataset after the long dataset loaded past the displayed area.

I found a temporary workaround for my use case:
Instead of replacing the contents of the TreeViewer.RootNode, I replace the whole TreeViewer. I also have the TreeViewer in a ScrollViewer to disable Virtualization to avoid the bug while scrolling during loading.
For completeness' sake I added the Workaround as a separate Branch to my repo.

For now this issue is somewhat solved to me, but i'll still try to verify the issue on a newer Win10 build.

@Shad0wlife disabling virtualization is a reasonable workaround if you have limited number of items. How many items do you expect the list to have? If there are lots of items, layout can get very slow and your memory consumption will also go up quite a bit. Try out your worst case scenario before you go down that path.

Also, can you check if you hit the issue with x:Bind and Binding or just x:Bind ?

I don't see any Binding or x:Bind that would be relevant until after the bug manifested.
Filling the TreeView's RootNodes for example is done in code, the contents are not bound. And the DataTemplates as well as The DataTemplateSelector are Resources.
The bug in itself is that the wrong DataTemplate gets used by the TreeView (and ListView), proably due to some kind of Template cache or so in virtualization. The crash then only manifests because the Template's contents then cast to the expected type, but at that point the wrong DataTemplate was already chosen.

I think there really won't be more than maybe 200 to 250 items in there, but depending on the level of nesting in the TreeView, there might be more. The full implementation is backed by a Database and the user can add new elements by buttons or context menu. So depending on what the users do, it might grow differently. I'll ask around, but thanks for the heads up!

Terminal's running into this too. I suppose we could disable virtualization, but are there any other workarounds?

Terminal has a DataTemplateSelector and 3 DataTemplates.

Each DataTemplate contains a ListViewItem.

One of the ListViewItems has an automation property set through its x:Uid and a resource.

We must either:

  1. Disable virtualization
  2. Switch to ChoosingItemContainer and move to constructing the ListViewItem in the code behind
    • To do this, we need to manually set the automation property in the code-behind
    • This completely evaporates the point of us using data templates, selectors, or choosing item containers.

From my comment on microsoft/terminal#9487:

We're getting to the things I hate about XAML. It's like... we did all this work to get away from manually setting up specific values in the codebehind or in the xaml document for specific types of items, right? It was good and clean and smart: the DataTemplate provided everything we needed for the data type. And here, we're forced to add detection to ChoosingItemContainer (sp) to set up the container correctly based on the data type. Something the data template should be able to handle. All because of a silly issue.

Env: Win10, VS2019 + C#9.0 or VS2022preview + C#10.0

I encountered the same InvalidCastException problem when using a TreeView combined with a DataTemplateSelector with multiple DataTemplates with different x:DataType classes.

The exception is thrown out from XAML generated code e.g. MainPage.g.cs, and it involves COM code so conditional breakpoint is not working, and the exeption occurs only when there's a lot of items in the tree view, so effectively I have no way to break the running code just before exception occurs and run watch code to check stack data.

Eventually I wrote a MSBuild plugin to configurably modify generated MainPage.g.cs before it is compiled and inserted some condition check and debug output code, and I found some truth and had some guess on root cause of the exception.

Conclusion:

The exception is caused by virtualization mechanism, which recycles and reuses binding objects corresponding to off-screen(may be far from screen visible region) items, and rebind new data objects to them. If data type is provided by DataTemplateSelector with multiple DataTemplates with different x:DataType classes, xaml compiler will generate one binding class for each data type. but virtualization mechanism doesn't check type correspondance between binding object and data object during binding object reuse, on the contrary, it just forceably casts data object to hard coded data type of binding object. So there is chance types of binding object and data object being bound together are mismatch. if so, the exception occurs.

The solutions:

  1. As mentioned above, disable virtulization. Which is sometimes unpractional through.
  2. Avoid using different classes in multiple DataTemplates combined into one DataTemplateSelector. One can use these different classes's base class in all DataTemplates. Code needs to be adapted.
  3. Correct implementation of virtualization, eliminate binding object reuse
  4. Correct implementation of virtualization, discriminate type of bind objects and reuse them only for new data objects of corresponding type.

Condition that will trigger the bug: meet either of 2 conditions:

  1. Expand as many as a few hundreds of tree items, so that there are off-screen items whose binding objects are recycled and to be reused;
  2. expand and collapse tree nodes which has child nodes, so that some binding objects disappear, recycled and to be reused;

Either 1 or 2, there shall be items corresponding to different data types involved.

An example:

I wrote a explorer-like program that shows file tree using TreeView. There are 2 DataTemplates, one is StorageFolderItem, another is StorageFileItem. When I browse into a certain deep path, the InvalidCastException is bound to be thrown. The path is about 10 level deep, and about 20 folders and files (Important) mixed in each level.The deepest level has about 300 files.
Or, if I repeatedly expand and collapse arbitary folders at random order, while total shown items are less than above case, but with opening items increase, eventually the exception will occur.

Comment

I wish this problem be fixed by developers, hopefully using solution 4. because I see no reason data types of different DataTemplates cannot be different. It's so convenient and elegant.

Some key code:

//------------ MainPage.xaml -----------------
The key is x:DataType="local:StorageFolderItem" and x:DataType="local:StorageFileItem" of 2 DataTemplates are different class.

    <Page.Resources>
        <DataTemplate x:Key="StorageFolderTemplate" x:DataType="local:StorageFolderItem">
            <TreeViewItem ItemsSource="{x:Bind Children, Mode=OneWay}" IsExpanded="False" >  <!-- AutomationProperties.Name="{x:Bind Name}" --> <!--  -->
                <StackPanel Orientation="Horizontal">
                    <SymbolIcon Symbol="Folder"/>
                    <!--<Image Width="20" Source="../Assets/folder.png"/>-->
                    <TextBlock Margin="0,0,10,0"/>
                    <TextBlock Text="{x:Bind Name}" />
                </StackPanel>
            </TreeViewItem>
        </DataTemplate>

        <DataTemplate x:Key="StorageFileTemplate" x:DataType="local:StorageFileItem">
            <TreeViewItem AutomationProperties.Name="{x:Bind Name}" DoubleTapped="TreeViewItem_StorageFileItem_DoubleTapped">
                <StackPanel Orientation="Horizontal">
                    <SymbolIcon Symbol="Document"/>
                    <!--<Image Width="20" Source="../Assets/file.png"/>-->
                    <TextBlock Margin="0,0,10,0"/>
                    <TextBlock Text="{x:Bind Name}"/>
                </StackPanel>
            </TreeViewItem>
        </DataTemplate>

        <local:FileItemTemplateSelector x:Key="FileItemTemplateSelector"
            StorageFolderTemplate="{StaticResource StorageFolderTemplate}"
            StorageFileTemplate="{StaticResource StorageFileTemplate}"/>
    </Page.Resources>

//------------ MainPage.g.cs ------------------
This is generated code with inserted debug code. I printed binding object id(here below and code corresponding to DataTemplateSelector where binding object is newed) to make sure that some bind objects are created once and used multiple times, and sometimes in generated code this.dataRoot = (global::StoryBase.StorageFileItem)newDataRoot; the type of argument newDataRoot is not StorageFileItem but StorageFolderItem, so InvalidCastException occurs.
There are other data type cast scenarios in this file that may trigger InvalidCastException.

        private class MainPage_obj6_Bindings : 
            global::Windows.UI.Xaml.IDataTemplateExtension,
            global::Windows.UI.Xaml.Markup.IDataTemplateComponent, 
            global::Windows.UI.Xaml.Markup.IXamlBindScopeDiagnostics, 
            global::Windows.UI.Xaml.Markup.IComponentConnector, 
            IMainPage_Bindings
        {         
			......

            public bool SetDataRoot(global::System.Object newDataRoot) 
            {                                                          
                if (newDataRoot != null)                               
                {                                                      
				        long id = objids.GetId(this, out _);                                                                 // Modified
				        Debug.WriteLine($"MainPage_obj6_Bindings.SetDataRoot(): bindingId = {id}, DataRoot = {newDataRoot}"); // Modified
				        if (newDataRoot is not StorageFileItem) {                                                            // Modified
				            Debug.WriteLine($"Type cast error: {newDataRoot} is not StorageFileItem");                       // Modified
				        }                                                                                                    // Modified
                    this.dataRoot = (global::StoryBase.StorageFileItem)newDataRoot;
                    return true;                                                   
                }                                                                  
                return false;                                                      
            }    

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 5 days.

Still exists!

Any news on that ?
I'm facing the same problem using a treeview with multiples DataTemplate using a selector.
Yes Putting the treeview in a ScrollView to disable Virtualization is working, but it's also a MAJOR performance hit on the app (lots of 1st level items).

Thanks