dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.

Home Page: https://docs.microsoft.com/dotnet/core/

Should we add support to ignore cycles on serialization?

Jozkee opened this issue

Background and Motivation

Even though we have covered the blocking issue of not being able to (de)serialize reference loops (#30820, #29900) by adding ReferenceHandler.Preserve to S.T.Json, there are many asks for an option equivalent to Json.NET's ReferenceLoopHandling.Ignore.

Motivations for doing this are:

  • ReferenceHandler.Preserve may be too cumbersome and increases payload size.
  • ReferenceHandler.Preserve creates JSON incomprehensible by serializers other than S.T.Json and Json.NET.

Proposed API

namespace System.Text.Json.Serialization
{
    public abstract partial class ReferenceHandler
    {   
        public static ReferenceHandler Preserve { get; }
+        public static ReferenceHandler IgnoreCycle { get; }
    }
}

Usage Examples

class Node
{
    public string Description { get; set; }
    public Node Next { get; set; }
}

void Test()
{
    var node = new Node { Description = "Node 1" };
    node.Next = node;
    
    var opts = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.IgnoreCycle };
    
    string json = JsonSerializer.Serialize(node, opts);
    Console.WriteLine(json); // Prints: {"Description":"Node 1"}. 
    // Note property "Next" being ignored due to cycle detection.

}

Alternative Designs

This new API is being added to the ReferenceHandler class since it can be considered an alternative way of dealing with references, one that is more narrowly scoped to the circularity problem during serialization.

Comparison with Newtonsoft.Json

void Test()
{
    var node = new Node { Description = "Node 1" };
    node.Next = node;
    
    var settings = new JsonSerializerSettings { ReferenceLoopHandling = ReferenceLoopHandling.Ignore };
    
    string json = JsonConvert.SerializeObject(node, settings);
    Console.WriteLine(json); // Prints: {"Description":"Node 1"}.
}

Comparison with existing ReferenceHandler.Preserve setting in S.T.Json

void Test()
{
    var node = new Node { Description = "Node 1" };
    node.Next = node;
    
    var opts = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };
    
    string json = JsonSerializer.Serialize(node, opts);
    Console.WriteLine(json); // Prints: {"$id":"1","Description":"Node 1","Next":{"$ref":"1"}}. 
}

Risks

One downside is that users are unable to implement their own ReferenceHandler and cannot make their own "Ignore Cycle" handler. The distinction between Preserve and Ignore would be made with an internal flag.

Concerns of adding this feature (users must be aware of these problems when opting-in for it):

  • Silent loss of data given that the object where the loop is detected would be ignored from the JSON (see example: #40099 (comment)).
  • Unable to round-trip data: the JSON may differ depending on the order of serialization (e.g. object-property and dictionary-element enumeration is non-deterministic).

From #30820 (comment):

I have to agree with many others concerning Ignore - and I'd like to try to add some arguments. Unfortunately video is not optimal for me, so I might have missed good arguments in the videos (but I've read through this issue/comments at least).

My arguments are based on that Preserve is the only good alternative. My reasoning for that: [JsonIgnore] is not feasible whenever you have a two-way relationship and you sometimes need to fetch either type of object (and you want the related objects with it, but not the self-references). Except for that there doesn't seem to be a good workaround?

  • Like @diegogarber pointed out, it breaks backwards compatibility and can induce significant additional work (e.g. frontend wise)
  • It causes a significantly larger payload. The original issue's first two examples (Ignore vs Preserve) are pretty clear (~twice the number of lines). For Web API scenarios (which might be the most common scenario for serializing JSON), that's pretty horrendous (even if the added stuff is quite short).
  • The JSON structure is not obvious from a frontend perspective - which the Ignore structure is (imo). For scenarios with Ignore where you really need the references, you could easily identify that specific call (again, assuming a Web API) and do your ViewModel-layer instead. The Preserve structure on the other hand needs parsing - specifically turning lists (arrays) into collections (objects) will be quite confusing.

To sum up: breaking backwards compatibility, causing a significantly larger payload, and breaking the structure by introducing a middle layer (as in Subordinates not being a list/array of values) should be reason enough to consider having Ignore.

I believe the StackOverflow issue mentioned reflects the community's stand on this - the only reasonable way forward (unless for specific scenarios where you really need the C#/serialization performance) is to continue with Newtonsoft. To me, that should be seen as a sign that this is not a reasonable way forward. It's been a long wait between 3.0 -> 5.0 (we're still waiting after all), and still not having this resolved will definitely continue to stir up frustration.

From #30820 (comment)

The problem with Ignore is that you end up losing data; this will probably cause an issue even more severe than having a large payload or having to do extra parsing in the front-end.

e.g:

static void Main()
{
    var angela = new Employee { Name = "Angela" };
    var bob = new Employee { Name = "Bob", Manager = angela };

    angela.Subordinates = new List<Employee> { bob };

    var settings = new JsonSerializerSettings { ReferenceLoopHandling = ReferenceLoopHandling.Ignore };
    string json = JsonConvert.SerializeObject(bob, settings);
    Console.WriteLine(json);
    // {"Name":"Bob","Manager":{"Name":"Angela","Manager":null,"Subordinates":[]},"Subordinates":null}

    Employee bobCopy = JsonConvert.DeserializeObject<Employee>(json);
    Employee angelaCopy = bobCopy.Manager;

    Console.WriteLine("Number of subordinates for angela: " + angela.Subordinates.Count);
    // Number of subordinates for angela: 1
    Console.WriteLine("Number of subordinates for angelaCopy: " + angelaCopy.Subordinates.Count);
    // Number of subordinates for angelaCopy: 0
    Console.WriteLine(angelaCopy.Subordinates.Count == angela.Subordinates.Count);
    // False
}
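The Employee type is not shown in that comment; a minimal shape consistent with the output above (assumed, not from the original) would be:

class Employee
{
    public string Name { get; set; }
    public Employee Manager { get; set; }
    public List<Employee> Subordinates { get; set; }
}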

@cjblomqvist How is it that similar scenarios are not a concern for you or everyone else asking for Ignore?

From #30820 (comment)

@Jozkee thanks for your reply!

I believe a scenario which is probably quite common is that you have, let's say, 100 types, with various relationships between them. Some contain many, some not so many. Basically all are related to at least one other type one way or another, and you have a REST-like Web API to get them from a Web/Mobile application. To map out all relationships in each scenario and make sure you handle all self-referencing scenarios is quite cumbersome. So what happens is, the user is asking for a main type A, and then you ensure you also get all the related types that you believe the "user" might need, such as B, C, D, E, F, G (E, F, G might be sub-relationships). You do not filter this properly (i.e. you do not project out properties not needed) because of not wanting to "waste" time on it (as well as laziness). To map out this tree:

A
A -> B
A -> C
A -> D
A -> B -> E
A -> B -> E -> F
A -> D -> G

So far, all good. The problem is when you also have the following relationships:

C -> A
E -> A
D -> A
G -> A

It's cumbersome to properly keep track of them and filter them out (e.g. through JsonIgnore - which also has the problem of filtering out the relationships in all scenarios, not only for this particular call for A). It would also be cumbersome to handle it using Preserve due to A) the wasted space/size to transmit over the network (might be less of a concern in the above scenario since obviously we're not so picky with what we're including and not), and B) the data structure not mapping in a fully obvious way to JSON, but in a quite new-JSON-library-in-.NET specific way.

Then, to finally answer your question: The "user" (frontend consumer) simply does not care about the additional relationships [C, E, D, G] -> A - it only wants to know the original tree. The loss of data is irrelevant - because the relationship properties are not relevant.

I'm not saying that Ignore is perfect by any means, but it is darn convenient to simply add it to a project and then not have to think about it anymore. Something like Preserve could be very useful, but the current implementation is too cumbersome for most cases (with all the downsides listed above) - it doesn't fulfil that simple-and-convenient-albeit-not-100%-correct scenario. I do understand that there are reasons behind Preserve being the way it is, so I'm not saying that Preserve is wrong. What I'm saying is that Preserve is good for one thing (100% correctness while still handling self-referencing), while Ignore is good for another (simple/convenient, but not 100% correct in some cases).

I do believe a lot of people in the community feel the same.

Something in between Ignore and Preserve might be very useful, especially something that doesn't break the assumed data structure (to avoid the frontend parsing/unexpected data structure for lists/arrays) while still not losing data, but I understand that there are difficulties/complexities with this, so Ignore might be good enough to solve the relevant actual real-world scenarios.

@Jozkee is this required to ship 5.0 (per milestone set here)?

@danmosemsft I think it could disappoint many users if we don't ship this on 5.0.

@Jozkee maybe, but master is a 6.0 branch in less than 3 weeks: if we add features, we're implicitly deciding to leave bugs un-fixed.

That's a general statement - I don't have context on this issue or JSON in general so it would be a good discussion to have with @ericstj .

@Jozkee @ericstj FWIW when I heard that STJ would have loop handling in 5.0, I just assumed that would include something along the lines of Ignore. It would be disappointing to not have something along those lines in 5.0
Also FWIW I don't love the Newtonsoft options. Like another commenter mentioned above, it would be great to have an in-between option that didn't suffer the data loss of Ignore; one option that seems great space-wise would be to only emit reference tags where they are needed, although that might require backtracking/rewriting?

What's the verdict? Is this going to ship in .net 5?

As per the milestone this was set to 6.0.0. The feedback around the importance of this came in after feature complete for 5.0 and didn't meet the bar. We will aim to add support here in 6.0.

@ericstj - even though I can understand it's difficult to prioritize - everyone of course wants to have their piece in every release (which is not possible if wanting to meet the deadlines). I do want to set the record straight though, because unless I'm missing something

the feedback around the importance of this came in after feature complete for 5.0

doesn't ring very true for me.

First off, I'll assume everybody here knows about the significant outcry that came after releasing 3.0 a year ago about not having feature parity and easy compatibility on this critical feature (JSON handling). Note that it's not only about feature parity, but also about compatibility. As expressed later, it's simply not a complete answer to offer a different solution with significant drawbacks, one that causes a lot of rework for anyone with a web app sending JSON relying on EF with any self-referencing loops anywhere (which should be one of the most common kinds of application out there with .NET Core?). Ok, so it was quite clear from the beginning - but of course, feedback about the specific solution didn't come from the start (it didn't exist, so that wasn't possible).

According to your releases, you reached "feature complete" with Preview 8, released on Aug 25th (https://devblogs.microsoft.com/dotnet/announcing-net-5-0-preview-8/). You were not "feature complete" (almost, but not fully), with Preview 7 (see first section of https://devblogs.microsoft.com/dotnet/announcing-net-5-0-preview-7/) on Jul 21st.

Ok. Jozkee seems to have announced his plan/proposal on May 5th (#30820 (comment)). The first comment about Ignore came 6 days later on May 11th (#30820 (comment)). No reply. Then another person, on Jul 14th, quite strongly and imo clearly stated the issue (#30820 (comment)). No reply. Then I had a try at it with perhaps a more structured approach on Jul 27th (#30820 (comment)), which finally ended up with this issue right here after a quick back-and-forth with Jozkee.

So, unless I'm very much missing something, it's simply not true that

the feedback around the importance of this came in after feature complete for 5.0

To quote diegogarber (again, from Jul 14th)

@Jozkee can we PLEASE get the Ignore???
A simple scenario:
Let's say that you have your application that's using newtonsoft.json (as most people do) and you swap to system.text.json. What happens? well, you get loop errors. You swap it to preserve and now your front end does NOT WORK!!!!!

To me, that's pretty clear.

If you want to push it to 6.0.0 at this point in time I understand (since we're so close to release). I understand it's challenging to do such great work in an open environment with the community involved. But please, do not blame the lateness of the feedback from the community.

Sorry @cjblomqvist I didn't mean to blame anyone here. I was stating the facts about the decision made in this issue with its specific dates. On July 15 we branched dotnet/runtime and started stabilizing for Preview8.

Note that it's not like this aspect of the feature wasn't discussed in the design review. My understanding was that there was push back in the design review against the option of letting callers of the serialization API silently drop members from types which they don't own through the cycle-ignore option. @Jozkee calls this out at the top of this issue and it's something that will likely still need to be discussed when bringing this change in to 6.0. We are hearing the community that it's important to reconsider this decision and will get to it. We don't want to rush that decision. The new API needs to be designed and approved. Even once this is approved it will need testing to ensure it meets folks' needs, provides adequate performance, and meets security guarantees. We don't like to rush this sort of thing due to the high-compatibility bar we hold for components in dotnet/runtime, thus our early lock-downs.

I truly apologize that this new 5.0 feature is not meeting your needs, and I hope we can improve it in 6.0.

@ericstj thanks for keeping the dialog open.
I'm not sure of a good way to have the following considered during the design discussions @Jozkee, but I'd appreciate considering offering two new options:

  1. option equivalent to Newtonsoft 'ignore', for compat/upgrade scenarios.
  2. a new option (PreserveSlim? PreserveMinimal?) that only emits reference tags where they are needed, vs Preserve that emits them everywhere. I understand that this might be slower, but would result in a lot less going over the wire and also a friendlier format for human consumption. It seems like a reasonable option between Ignore and Preserve to allow people to choose.

Thanks!!

Thanks for your reply @ericstj - communication is definitely important and something that easily gets wrong in written form over the Internet.

I do understand that consideration needs to be taken, and it's good you added insight about the branching point for Preview 8 on July 15th. I guess the overall negativity on this stems from the overall disappointment of not being able to migrate to this for 2 years (3.0 -> 6.0), for what looks like (probably wrongly) a quite small feature.

Isn't it ironic that one of the major hurdles for getting this implemented is the implications of needing to keep backwards compatibility on this, when basically the whole issue stems from exactly backwards compatibility with Newtonsoft.... :)

Quote from #30820 (comment)

Regarding ReferenceHandling.Ignore....

I don't see why this wouldn't be useful. Sure, on deserialisation it could throw up some odd results, but I think for serialisation it's actually quite helpful. The example use case in the very first comment is a perfect example of why it would be useful. I think it would be ideal to have something like this, because attributes like [JsonIgnore] don't really help, as that would just stop that property from serialising ever.

Currently people are suggesting actively dumping System.text.Json, as can be seen in this question on Stack, where 3 of the 4 answers suggest using Newtonsoft.Json, which has the ignore functionality built in. So people are actively gimping the performance of their application to get a fairly basic feature working. Newtonsoft is slow, but at least it works.

EDIT: For reference, I've been using preview 3, I'm yet to try preview 4 but if preserve achieves essentially the same thing without throwing the cycling exception then that would probably be good.

I just want to throw in my two cents on this one. I decided to go down the route of porting my application from Newtonsoft.Json for the sake of performance. I mean, after all, that is one of the huge selling points of System.Text.Json. I invested hours converting everything over only to find out at the end, when I was thoroughly testing everything, that I couldn't serialize objects that were pulled by EF Core. Talk about a surprise to me!! I even tried the ReferenceHandler.Preserve functionality of 5.0 rc1 and serialization still bombs out due to the circular references.

Anyways, the solution I had to come up with was to use System.Text.Json for the input formatter for MVC because it can handle data coming in just fine, but stick with the slower Newtonsoft.Json as the output formatter, just so I can serialize data from EF core. NOT the ideal solution, but hopefully I will at least get performance benefits on POST/PUT methods.

In short, come on, pushing ReferenceHandler.Ignore to 6.0? I feel this is a pretty basic need and something that, from what I have read, many others just assumed would be available, for the sake of feature parity. Especially since EF core and System.Text.Json are both Microsoft projects, it would seem reasonable that the two could easily be used together.

Please consider adding this to a minor update to 5.0 instead of making us all wait another year to be able to use this new JSON serializer!! Please...

As per the milestone this was set to 6.0.0. The feedback around the importance of this came in after feature complete for 5.0 and didn't meet the bar. We will aim to add support here in 6.0.

I'm sorry but I just straight up disagree with this. @Jozkee highlighted in his initial post on #30820 a perfectly valid use case for .Ignore (which for some reason went unnoticed apparently) back in September 2019, over a year ago!

Discussion ensued in that issue for a while, and for some reason at some point it was decided that Ignore wasn't required, despite many community contributions specifically requesting it. Eventually I too commented (#30820 (comment)) requesting it, and that was in May. Many of the comments above are responses which came after then.

There was a clear issue with the lack of Ignore for months, and multiple people in the community have been unable to experience the improvements that come with STJ because of this catastrophically broken part of the "mostly drop-in replacement" for Newtonsoft, which requires introducing a load of additional parsing (if using Preserve and all the extra stuff that comes with it). I've been asked numerous times what I think about STJ and I've had to say that although overall it is a significant improvement over Newtonsoft, if there's any reliance on Newtonsoft's Ignore then it's totally unusable, so the only real option is to nerf the applications which would otherwise use it and use Newtonsoft instead.

@cjblomqvist's comment from July (which was already copied above) is an excellent rationale for why we needed this.

Realistically, if .NET 5 is "the future", shooting everyone who uses ignore in the foot is a terrible starting position. It should have been worked in months ago (well, it should have been there from the start), and putting it off until 6 is absolutely ridiculous. @hairlesshobo is just one of many who have been frustrated by this.

Ranting aside, I understand the rationale for not having it, as maybe it shouldn't have been there in the first place. Unfortunately that error was made many years ago, and in the interests of backwards compatibility the code which enables some "bad" practices should be in place.

That said, in what is probably most instances where this is a problem (where people are serialising EF objects), the "lost" data isn't relevant, as it's usually a child referring to its parent anyway, which referred to its child in the first place.

A bit more on schedule so folks understand where we are at now. We switched gears from feature development to bug-fixing around mid-July when we branched for Preview8. At that point it was all-hands on deck to get bugs fixed and bring 5.0 up to ship quality. We haven't been doing feature work since then, and new APIs and design changes are feature work (more on that below). We're now pretty much done with bug-fixing for 5.0 and have a chance to start looking at features again.

one of the major hurdles for getting this implemented is the implications of needing to keep backwards compatibility on this, when basically the whole issue stems from exactly backwards compatibility with Newtonsoft

System.Text.Json does not aim to be compatible with Newtonsoft.Json. We're specifically trying to be different to provide a different value proposition. Those differences are actually what drives folks to consider using System.Text.Json instead. We are trying to make System.Text.Json work for as many folks as possible, but ease of adoption is different than compatibility. I don't see a backwards compatibility problem with adding this feature, it should be behind an opt-in flag. I think the main hurdle for this feature was/is design principles.

(which for some reason went unnoticed apparently) back in September 2019, over a year ago!

I respectfully disagree. I see plenty of discussion from @Jozkee @ahsonkhan @steveharter and @JamesNK that acknowledges the scenario. It was also discussed in the API review. There was a good understanding of ReferenceLoopHandling.Ignore and what its purpose was. There's no conspiracy to suppress this feature. It was an intentional design decision. Sometimes those are even harder to undo, since you need to revisit a decision that was already made, but that is exactly what we're doing here. Please help us make the right call this time around.

If it were me trying to present the case for this feature I'd share things like:

  1. Cost or in-viability of the JsonIgnore workaround. (I already see some of this, thank you @cjblomqvist)
  2. Usage numbers of the Newtonsoft Ignore feature.
  3. Precedent in other serializers (other than Newtonsoft.Json) that support this or some similar behavior. Can be .NET, browsers, etc. Anything to establish precedent for this functionality.
  4. Precedent in the existing System.Text.Json API that permits a caller to modify the contract of a type which they don't own (EG: you can use a custom converter to do this today)
  5. Precedent in the existing System.Text.Json where an option will prevent round-tripping or silent lossy serialization.

I bet this type of data would help the API reviewers better understand the tradeoffs when making this decision. I am not asking the community to provide all this, but just sharing some suggestions for how this issue should proceed.

Next steps are an API proposal and a review. Assuming that goes through, the feature should be implemented in the main branch codebase (6.0). Once that's done, we can talk about porting it to the servicing branch consumable via the package, once we see what the final design looks like and understand the risk. I can make no promises as we typically do not permit API additions in servicing, but since this also ships as a NuGet package we can technically make it work. I suspect the viability of a servicing release here depends on risk.

Thanks everyone.

Thanks Eric for your input.

I just want to clarify that I wasn't just railing against the project/maintainers/contributors, but was trying to establish the view for me (and those who I've spoke to in favour of Ignore) about the situation.

I personally am not even a fan of the current implementation that exists in Json.NET. While I do make use of it, and agree with @JamesNK that it definitely shouldn't be the default because it could in theory cause data to be lost, I also think that there is some way of implementing behaviour similar to what the other implementation did, while maybe doing it more "safely".

I wouldn't want to be the one putting forward the case for the API myself because although I have a few years under my belt, I think things like this might be a bit above me.

I think the core issue with the [JsonIgnore] workaround, as @cjblomqvist has already shown, is that it's not quite the desired result in (what I believe to be) the main "selling point" of the feature, which is the ability to serialise EntityFramework objects (with navigation properties), as it ruins the ability to serialise that property in any scenario, like he said.

There are elements of the Preserve that I like, although I could see incompatibilities between producers and consumers if they aren't aware of the same metadata properties and what they mean. However in instances where parsing is the same on either end, I do like how it's fully reference aware.

There are a couple of rough ideas off the top of my head for approaches to this.

  1. Add the ignore in ReferenceHandling - Restores compatibility with people porting from Newtonsoft, so it's entirely transparent to consumers. Does allow data to be "lost" though.
  2. Add some way of enabling reference loops to be ignored in a "one time" type way - This option would restore compatibility with consumers on calls which could have reference loops, acknowledged by the need to explicitly allow, on each call to serialise, that references may be dropped from the payload. I personally think this would be the best for users (although porting might be a pain on the producer side), but it would most likely be a more breaking and/or more major change to the current APIs than the other options.
  3. Do nothing - While this works, as Preserve already seems to do the job for users who currently do this with a metadata-aware JSON deserialiser on each end, it may disappoint some users who use the JSON data for cross compatibility, as consumers on other platforms may be spooked by the metadata.

I think 2 is the best option because while I think that there is usually no reason to have something like this, for the example @cjblomqvist had, where it's for display purposes only and the data isn't expected to be saved back anywhere after deserialisation, it's not the worst plan. I think the user can understand that in the ignore example on the original #30820 (comment), Angela would obviously be a subordinate to her manager.

I suppose as always, it depends, as in that scenario it's clear from context. I think Ignore should definitely exist, but maybe more as an "exception to the rule" type deal, and maybe not something that can be set globally like it can in Json.NET.

Hope that all makes sense. :)

The main scenario as I understand:

  • EF is being used to obtain entities that need to be serialized to JSON.
  • The entities are serialized to JSON which is used by a client or UI that doesn't want to repeat the duplicated information.
  • The JSON that is missing is not important, and it's OK to have silent-data-loss.

The main question I have is whether the POCO types are general-purpose, meaning they are used both for the scenario above and for other data-interchange scenarios.

I assume the POCOs are general-purpose based on the feedback that [JsonIgnore] can't be used on navigation properties since sometimes the POCOs should be serialized with references (probably using "Preserve" feature). For this to work, I assume there would be a JsonSerializerOptions instance specific for the client\UI scenario above that would use the new "Ignore" feature and another JsonSerializerOptions instance used for data-interchange scenarios that would probably use the "Preserve" option.
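As a rough sketch of that split (assuming the IgnoreCycle option proposed at the top of this issue; it is not an existing API), the two options instances might look like:

// Sketch only: IgnoreCycle is the API proposed in this issue, not something that exists today.
var uiOptions = new JsonSerializerOptions
{
    // UI/client scenario: cycles are dropped and silent data loss is accepted.
    ReferenceHandler = ReferenceHandler.IgnoreCycle
};

var interchangeOptions = new JsonSerializerOptions
{
    // Data-interchange scenario: full fidelity via $id/$ref metadata.
    ReferenceHandler = ReferenceHandler.Preserve
};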

So as a potential work-around, would using DTOs specific for this scenario (that would not have the properties that should be ignored) work? One could argue that using DTOs in this manner is a best practice. The counter-argument is that this is cumbersome and\or the existing Newtonsoft semantics are perfectly fine.

Assuming DTOs are not the answer, for 6.0 I believe we can tweak the design of ReferenceResolver. The reference handling feature was designed to be extensible by creating a class deriving from ReferenceResolver and specifying that instance on JsonSerializerOptions. An updated design that understands "ignoring" would make it possible to implement a resolver that does exactly what Newtonsoft does, or some other implementation such as a deterministic design that looks for a new custom attribute like [IgnoreForUI] on properties plus a way to set that UI mode on the options.
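For context, the existing extension point looks roughly like the sketch below (a minimal preserve-style resolver; note that the current abstract methods can only hand out reference ids, so a custom resolver has no way to say "skip this cyclic member", which is why an updated design would be needed):

// using System.Collections.Generic; using System.Text.Json; using System.Text.Json.Serialization;
class MyReferenceResolver : ReferenceResolver
{
    private uint _count;
    private readonly Dictionary<object, string> _objectToId =
        new Dictionary<object, string>(ReferenceEqualityComparer.Instance);
    private readonly Dictionary<string, object> _idToObject =
        new Dictionary<string, object>();

    // Called during deserialization when a $id is read.
    public override void AddReference(string referenceId, object value)
        => _idToObject[referenceId] = value;

    // Called during serialization; must always return an id.
    public override string GetReference(object value, out bool alreadyExists)
    {
        if (_objectToId.TryGetValue(value, out string id))
        {
            alreadyExists = true;
            return id;
        }

        id = (++_count).ToString();
        _objectToId[value] = id;
        alreadyExists = false;
        return id;
    }

    // Called during deserialization when a $ref is read.
    public override object ResolveReference(string referenceId)
        => _idToObject[referenceId];
}

class MyReferenceHandler : ReferenceHandler
{
    public override ReferenceResolver CreateResolver() => new MyReferenceResolver();
}

// Usage: new JsonSerializerOptions { ReferenceHandler = new MyReferenceHandler() }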

Also, for 6.0, STJ should consider addressing the deterministic issues that have a workaround in Newtonsoft. See #1085.

@ericstj I hope you didn't read too much into my note about backwards compatibility. It was intended as some kind of (funny) sarcasm - from my experience, one of the major hurdles with introducing additional features without thinking them fully through is that you'll then have to live with them due to the issues of breaking changes later, e.g. the type of outcry that the lack of Ignore has caused :)

Anyway! What's done is done. I fully understand, and most rational people should as well, that getting it as part of 5.0.0 is not reasonable at this point. So, to add to the future.

If the web crowd doing standard EF applications is a targeted user base (or any other affected user base - can't speak for that though), then I believe this needs to be fixed as soon as possible and preferably backported. From my experience the web crowd does not live well with having to use a workaround for such a base scenario (serializing to JSON), in particular not when that solution is to use an old library which is actively being replaced (at least for most scenarios).

As for 2-5, I can't add much unfortunately.

As for @steveharter 's suggestion with DTOs. I assume with general purpose POCOs you're referring to base/EF-mapped entities? To answer your question: Yes, I believe you can do custom DTOs at full depth for all scenarios to get away from self-referencing scenarios. But it will be super cumbersome. Quick example:

Base entities
A -> B, C
B -> C
C -> B

->

A1 -> B1, C1
B1 -> C2
C1 -> B2
B2 -> -
C2 -> -

In this scenario you'll have to create 5(!) DTOs in order to transfer A with an additional 2 layers. That's a whole lot of boilerplate for something very simple. It will quickly escalate in more complex scenarios. The cost of this will be very significant.
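Spelled out as code (a sketch, assuming trivial entity shapes that are not given in the thread), those five DTOs would look something like:

// Hypothetical DTOs for transferring A two levels deep without self-references.
class ADto  { public BDto B { get; set; }  public CDto C { get; set; } }
class BDto  { public C2Dto C { get; set; } }
class CDto  { public B2Dto B { get; set; } }
class B2Dto { /* leaf: the B <-> C cycle is cut here */ }
class C2Dto { /* leaf: the B <-> C cycle is cut here */ }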

There's already a solution to this (as implemented in #30820) - but with some drawbacks (as outlined previously).

There are a variety of ways to add the ignore cycles functionality to the STJ library. Aside from figuring out the how and design the appropriate solution with an attempt to reduce the downsides, the most important thing that would be great to pin-down is the why. From what I gather, that is the primary purpose of this issue: get community feedback and understand specific user scenarios to motivate the feature.

What is your scenario/reason to use the ignore cycles option and why is the existing preserve option not enough?

Reading through the thread, there are a bunch of comments around "this feature should exist" and its timeline (with ideas on how to solve it), which are definitely interesting, but I am having trouble understanding why folks would actually want the ignore behavior (other than that Newtonsoft.Json had it, and people are used to it).

This is especially important so that the benefits can be weighed against the drawbacks that @Jozkee listed in the original post.

Are all the incoming requests for this feature a ref count for these stated benefits, or are there other reasons?

  • ReferenceHandler.Preserve may be too cumbersome and increases payload size.
  • ReferenceHandler.Preserve creates JSON incomprehensible by serializers other than S.T.Json and Json.NET.

What use case is blocked by not having ignore? In most EF/cycle scenarios, especially when it comes to round-tripping objects using JSON, Preserve should be able to solve the issues that folks brought up and you won't risk losing data accidentally, or failing to round-trip.

Why is it that with the current usages of Newtonsoft.Json's Ignore feature, folks don't mind the potential data loss or the fact that payloads can't accurately be round-tripped with full fidelity after serialization?

From my experience the web crowd does not live well with having to use a workaround for such a base scenario (serializing to JSON)

Similarly, @cjblomqvist, why not use preserve?

Quoting @cjblomqvist from earlier to confirm, is the primary purpose/value of having ignore (I am asking because if convenience is the main motivator, then why is this a pressing/time sensitive requirement, and how come folks aren't willing to accept the downsides of preserve as a workaround):

it is darn convenient to simply add it to a project and then not having to think about it anymore

And why is it fine that:

it doesn't fulfil that simple-and-convenient-albeit-not-100%-correct scenario

My reasoning for that: [JsonIgnore] is not feasible whenever you have a two-way relationship and you sometimes need to fetch either type of object (and you want the related objects with it, but not the self-references). Except for that there doesn't seem to be a good workaround?

Also, @cjblomqvist, can you please provide some code snippet/sample with the object model and desired behavior that showcases why the [JsonIgnore] attribute won't work? I would like to understand this clearly and am having some trouble. Appreciate it.

Regarding this:

There are elements of the Preserve that I like, although I could see incompatibilities between producers and consumers if they aren't aware of the same metadata properties and what they mean.

@butler1233, if you don't mind, can you dig a little deeper. Do you think that when producer/consumer have different serialization technologies, that it is OK to have "lossy" translation of objects with cycles into JSON (with cycles being ignored) and in some cases actually desired, rather than an oversight?

I couldn't serialize objects that were pulled by EF core. Talk about a surprise to me!! I even tried the ReferenceHandler.Preserve functionality of 5.0 rc1 and serialization still bombs out due to the circular references.

@hairlesshobo - can you please share the object you were trying to serialize. How come Preserve failed for you and what was the error? That might be a bug, unrelated to Ignore, or it may be that the object graph contained some unsupported type. Getting a repro to see the error would be really helpful to investigate the underlying issue.

It was an intentional design decision. Sometimes those are even harder to undo, since you need to revisit a decision that was already made, but that is exactly what we're doing here. Please help us make the right call this time around.

For reference, here's the API review discussion on ReferenceHandling and the Ignore feature that @ericstj is referring to:
#30820 (comment)

Video

We seem to lean towards not having Ignore -- it results in payloads that nobody can reason about. The right fix for those scenarios is to modify the C# types to exclude back pointers from serialization.

And https://www.youtube.com/watch?v=H9zrbztep4M&feature=youtu.be&t=5890&ab_channel=.NETFoundation

@hairlesshobo - can you please share the object you were trying to serialize. How come Preserve failed for you and what was the error? That might be a bug, unrelated to Ignore, or it may be that the object graph contained some unsupported type. Getting a repro to see the error would be really helpful to investigate the underlying issue.

@ahsonkhan I will do some more in-depth testing with "Preserve" in RC1 and get back to you. Off the top of my head, I do know that I use IPAddress types in my models and have to use a custom converter to handle (de)serialization; perhaps that is the cause? I will see if I can pin down the issue and try to put up a test repo. From what I recall, the error was, to paraphrase, "the maximum depth has been reached during serialization. Either increase the maximum depth or use Preserve", but I was already using Preserve, so I was really confused that it seemed to have no effect on serialization. I wouldn't be surprised if this was user error on my part.

@butler1233, if you don't mind, can you dig a little deeper. Do you think that when producer/consumer have different serialization technologies, that it is OK to have "lossy" translation of objects with cycles into JSON (with cycles being ignored) and in some cases actually desired, rather than an oversight?

Well, I've actually just run into an issue trying out Preserve in a new project and seeing if I can use it. I have a service currently running on .net 5 rc2 which is serialising some EF objects on a request to a model. The service is consumed by another application (via RestSharp) running on .net core 2.2, using Newtonsoft (12) as its deserialiser. It looks like I've already discovered incompatibilities:

The collection services contains 10 objects, which all have 3 navigation properties (some of which will navigate to the same record), a bool and a Guid. All 10 records have all 3 nav properties filled out in the database, and on the server side they're all correct. However, there is a clear issue with how it's being parsed on the other end (which might be down to different implementations of preserve between Newtonsoft and STJ):

  • [0] appears to be fine, with all of its properties filled correctly, and most of the children of them are also filled correctly.
  • [2] has the Service and Warehouse properties filled correctly, but does not have the Tier property filled, which it should do.
  • Going further down, [7] has none of its navigation properties filled out correctly.

I think it may be down to creating the instances of the objects which appear multiple times, as all of the individual PostalService records, DespatchWarehouse records and ServiceTier records all appear somewhere in the list, but not in more than one place.

Regardless of whether this is an issue with Json.NET, STJ, or something else, I think that having this glaring incompatibility between what is essentially different versions of the same stack (as STJ is the successor because it's the default JSON lib from .net 3, and before that it was Json.NET) is going to cause major problems for users who expect preserve to do as it's meant to.

Honestly, I went in to this hoping that Preserve would render some of my previous arguments valid and I'd be able to just use STJ in my new .net5 project that talks to a .net core 2.2 project.

Returning to the actual question you asked though - I think lossy cycling might be okay as long as it's understood that it will be happening - something you should be aware of when consciously enabling ReferenceHandling.Ignore - and where exactly it will happen. In the above instance, Ignore would have made it so that every entry in the list had its navigation properties filled, but the objects inside the navigation properties likely wouldn't have the ServiceWarehouseTierEntrys that referred to them in the first place. However when using Preserve, at least with this incompatibility, there are instances where I've lost crucial information which I would have expected to have otherwise.

I think that the way to approach this might be that if we don't want to introduce Ignore again, that Preserve is implemented in a way which is compatible with how Json.NET parses it.

EDIT: So it turns out that apparently Json.NET doesn't have a Preserve option or equivalent. It has the following options (which do as described when asked to serialise the same data server side):

  • Serialize - Tries to continue cycling, ends up throwing a StackOverflowException and crashes the service.
  • Error - Unsurprisingly, errors: JsonSerializationException: Self referencing loop detected with type 'LabelService.Entity.ServiceWarehouseTierEntry'. Path '[0].warehouse.servicesTiers'.
  • Ignore - Serialised the EF data mostly correctly, and will follow the tree down as far as it can, ignoring anything it's already done on its way down. Notably this will allow the same object to be serialised multiple times within the entire payload, as long as it doesn't cause a cycle. While this can bloat the payload slightly, it's about as close as we can get to a correct (at least in terms of values) serialisation.

It may be that if you're happy for everything that consumes the service to require STJ, then preserve works fine, but for anything that can't handle the extra metadata to build the json back up into a proper object, Preserve won't work. That would seriously diminish the range of things that can consume the API.

I think as it's been addressed in this thread and in #30820, there probably isn't a perfect solution. Sure, Preserve works if support for the metadata it produces becomes widely supported, but I think at least for the transition period there still needs to be a solution that works for consumers which don't support it.

Further cheeky edit to say that I'm not entirely sure what else the Microsoft.AspNetCore.Mvc.NewtonsoftJson package adds aside from the ability to use the Newtonsoft serialisation (which I'm not sure why anyone would want to aside from some of the odd legacy features like Ignore) but I think the fact that it has (at the time of writing) 28.5M downloads from nuget is definitely something worth considering. I'm not really sure if there's any way of measuring the usage of Ignore specifically though.


Additionally, to contribute to the [JsonIgnore] argument, I understand where he's coming from about that not working. In the above, if I started with ServiceWarehouseTierEntry and wanted to get the Warehouse and Service properties, I would use JsonIgnore to stop it from cycling with the children of the warehouse and service. However, if I started with a Warehouse and wanted to get the children, the existence of the [JsonIgnore] blocks that.

It works one way but doesn't work the other way, and as I could (and do) need to go both ways depending on the request, it wouldn't work out without building new DTOs for every request and having to map everything across.
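A minimal sketch of the shape being described, using simplified hypothetical types rather than the real entities from the linked repro:

// using System; using System.Collections.Generic; using System.Text.Json.Serialization;
// Assumed, simplified versions of the entities being described.
class Warehouse
{
    public string Name { get; set; }

    // To break the Entry -> Warehouse -> Entries -> Entry cycle, the attribute
    // has to live somewhere on the loop, e.g. here:
    [JsonIgnore]
    public List<ServiceWarehouseTierEntry> Entries { get; set; }
}

class ServiceWarehouseTierEntry
{
    public Guid Id { get; set; }
    public Warehouse Warehouse { get; set; }
}

// Starting from an entry works: the warehouse is included and the cycle is cut.
// Starting from a warehouse now silently drops its entries in every call,
// including the calls where the entries were the whole point of the request.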

@butler1233 FWIW, JSON.NET does have it - you may be missing setting the appropriate PreserveReferencesHandling enum value along with ReferenceLoopHandling = Serialize.

@butler1233 FWIW, JSON.NET does have it - you may be missing setting the appropriate PreserveReferencesHandling enum value along with ReferenceLoopHandling = Serialize.

Ooh thanks for the heads up. I'll give that a try in the morning and report back with findings. I'm hoping to find a way to work without using Json.NET in this project.

To add to @ericsampson, Newtonsoft.Json has three enums around this space. So, it may be that you also need to leverage the MetadataPropertyHandling enum when deserializing the payload that STJ wrote with metadata properties, especially if there are some ordering issues of the properties (possibly using ReadAhead).
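For illustration, the combination being suggested on the Newtonsoft side might look like this (a sketch based on the settings named above, not code from the thread):

var settings = new JsonSerializerSettings
{
    // Emit $id/$ref metadata so shared references survive serialization.
    PreserveReferencesHandling = PreserveReferencesHandling.All,
    ReferenceLoopHandling = ReferenceLoopHandling.Serialize,
    // Honor $id/$ref on deserialization even when the metadata property
    // is not the first property in an object.
    MetadataPropertyHandling = MetadataPropertyHandling.ReadAhead
};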

@butler1233, can you please provide a simplified repro (anonymized if necessary) along with the actual/expected JSON produced on 3.1 on your POCO/model being serialized with STJ and the subsequent failure from Newtonsoft.Json on nc 2.2? Showing a code sample that folks can test and observe the behavior you are seeing, for themselves, would help make things clear. This will be really helpful to pin down the exact issues, because it is quite possible that the seemingly incompatible behavior is just a settings change that needs to be updated. And if that's the case, the STJ docs and porting guide can certainly be improved to ensure folks don't hit the pit of failure you are coming across.

For context, avoiding such complexity (several enums with lots of permutations) is one of the reasons why we insisted on a simplified and single entry point when it comes to reference handling in STJ, to make sure the behavior is consistent, exact scenario driven, and any user can reason about it.

Here's the matrix of features from Newtonsoft with the three enums provided, and how the semantics map to the STJ approach:
https://github.com/dotnet/runtime/blob/master/src/libraries/System.Text.Json/docs/ReferenceHandling_spec.md#compatibility

Additionally, to contribute to the [JsonIgnore] argument, I understand where he's coming from about that not working. In the above, if I started with ServiceWarehouseTierEntry and wanted to get the Warehouse and Service properties, I would use JsonIgnore to stop it from cycling with the children of the warehouse and service. However, if I started with a Warehouse and wanted to get the children, the existence of the [JsonIgnore] blocks that.

It works one way but doesn't work the other way, and as I could (and do) need to go both ways depending on the request, it wouldn't work out without building new DTOs for every request and having to map everything across.

And does Newtonsoft.Json's Ignore semantics not suffer from the same issue? Wouldn't the order in which the object graph is traversed (based on the entry point), change the behavior and the JSON output with the ignore cycle feature too because it would depend on when/where the cycle was detected?

@ahsonkhan I'm on cell phone so a little bit difficult to reply in detail. Anyway, it seems you need to read through my comments again. I believe I've quite clearly stated why Preserve doesn't suffice. You can't use Ignore because of the scenario outlined by @butler1233. As you can see from his scenario you're actually losing more data when using Preserve than Ignore (not the attribute). If you don't see that, please try and map out the scenarios yourself first. You can do it here and I can point out any mistakes.

As for my comment about the "simple-and-convenient-albeit-not-100%-correct scenario" - you're reading it out of context. I mean that Preserve doesn't fit that scenario. It seems you're looking at this issue from exactly the opposite perspective - the 100% correct way (even if that way doesn't work well for some critical scenarios).

As for scenarios, I've outlined some in previous comments in this or the previous ("original") thread/issue, including why Preserve isn't reasonable as a workaround (payload, incompatibility). Please check those to understand the issue better.

Fundamentally, Preserve adds data as well as changes the intuitive way you'd assume JSON to represent the data (as well as the JS structure). This causes issues. The alternatives to Preserve don't work because of the fundamental issues with the Ignore attribute as outlined by @butler1233, unless you write a lot of extra code (as I've outlined previously with an example). Ignore has its downside (silent data loss), but it turns out that that's actually not a big problem for a lot of cases.

@ahsonkhan I've made a sort of simplified repro. It's not really simplified, as it was fairly simple to start with (you can ignore a load of the other stuff that's going on), and actually replicating the issues is going to be a bit of a faff.

I've put everything in a repo here: https://github.com/butler1233/LabelServiceRepro

There's an info file in the Outputs folder with some information on reproducing the issues.

I've created a client which runs on 2.2 (ReproClient) which outputs the stuff prefixed Console in the outputs folder, along with the JSON payloads that the service spits out depending on what settings are used.

The key findings though are as follows:

  • Setting Json.NET up with PreserveReferencesHandling = All in conjunction with ReferenceLoopHandling = Serialize creates byte-for-byte identical JSON output to using STJ with the Preserve option.
  • Fiddling around with a variety of options on the Newtonsoft client (including setting PreserveReferencesHandling = All just in case, as well as MetadataPropertyHandling = ReadAhead), I can't get the client to deserialise the JSON into the object properly.
  • Although Ignore appears to have the desired result on the client (being that all instances have all of the properties populated), it is clear that the JSON payload is bloated - in this instance 166x! Obviously this would vary depending on what level of recurrence there is within the JSON, but I can definitely see why this isn't ideal either.
  • From what I've tried, there doesn't appear to be any way to get the object to be infinitely traversable on the client like it would be on the server.

That last point I think is the real gold standard, and I think if both the server and client were using STJ and the Preserve loop handling (although I haven't tested it for deserialising) then we would be absolutely sorted.

I think it may be that the Json.NET library is meant to be able to preserve the references, but something is blocking it from working. However, if that library can be sorted to reproduce a fully traversible object from the output of the Preserved JSON, I think we'd have a viable candidate for how to actually solve the issue instead of using lossy workarounds.

And does Newtonsoft.Json's Ignore semantics not suffer from the same issue? Wouldn't the order in which the object graph is traversed (based on the entry point), change the behavior and the JSON output with the ignore cycle feature too because it would depend on when/where the cycle was detected?

The issue with [JsonIgnore] vs RLH.Ignore is that RLH will always go at least one level down, regardless of where in the graph you are (unless you have an object which refers to itself directly as a property). However with [JsonIgnore], you will only go as far down as where you hit that attribute. Sounds like it's not the end of the world, but if the JsonIgnore attribute was used on the first type in the graph, you're instantly losing crucial data. Ignore is lossy in that it will lose the data in some instances, although it will be in the payload somewhere. However JsonIgnore could result in irretrievably losing data, as that instance may never be serialised at all.

I hope that makes sense. If not just let me know, but I think that everything is there. I've never tried to do something like this with the different permutations so I've done my best, but it's all a bit wild.

@butler1233 I'm seeing the following issues when building your projects:

ReproClient:

C:\Program Files (x86)\Microsoft Visual Studio\2019\Preview\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets(2128,5): warning MSB3245: Could not resolve this reference. Could not locate the assembly "LabelService.Connector". Check to make sure the assembly exists on disk. If this reference is required by your code, you may get compilation errors.
C:\Users\dacantu\Desktop\LabelServiceRepro-master\ReproClient\Program.cs(2,7,2,19): error CS0246: The type or namespace name 'LabelService' could not be found (are you missing a using directive or an assembly reference?)

LabelService:

C:\Users\dacantu\Desktop\LabelServiceRepro-master\LabelService\LabelService.csproj : error NU1101: Unable to find package FrameworkSharp. No packages exist with this id in source(s): Microsoft Visual Studio Offline Packages, nuget.org

Is FrameworkSharp a public nuget package?

Ah yes that was my mistake, that's from our private feed. I've updated the repo so it shouldn't have any external private dependencies now.

@butler1233 Ok, now facing more issues:
When pinging https://localhost:44398/register/serviceoptions:

A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)

Couldn't you just share the static JSON payload that is causing you troubles, along with the models/classes that you are using to deserialize and then we can try out deserialization in one single console app for net core 2.2?

Couldn't you just share the static JSON payload that is causing you troubles, along with the models/classes that you are using to deserialize and then we can try out deserialization in one single console app for net core 2.2?

@Jozkee I did share the Json payloads - they're included in the outputs folder of the repo. The reproclient project doesn't use them directly as I was testing out various scenarios and didn't get around to testing it directly with the payload in a file.

The project was originally backed by a SQL Server database (using the connection string in appsettings.json), which you would need if you wanted to fully replicate the environment, but in the interests of convenience the JSON payloads that the service spits out in the different (server) scenarios are included. As I mentioned above though, the STJ ReferenceHandler.Preserve payload and the Json.NET PreserveReferencesHandling.All & ReferenceLoopHandling.Serialize payload are exactly the same, so it's just a case of trying to get the (smaller) payloads to deserialise using the metadata properly.

@butler1233 I did a simple repro using "Json - Service using STJ with ReferenceHandler.Preserve.json" and none of the navigation props were left as null. I just want to verify that there is no issue here, at least on the JSON side.

static void Main(string[] args)
{
    string json = File.ReadAllText(@"C:\Users\dacantu\Desktop\LabelServiceRepro-master\Outputs\Json - Service using STJ with ReferenceHandler.Preserve.json");
    List<ServiceWarehouseTierEntry> services = JsonConvert.DeserializeObject<List<ServiceWarehouseTierEntry>>(json);

    foreach (ServiceWarehouseTierEntry swtEntry in services)
    {
        Debug.Assert(swtEntry.Service != null);
        Debug.Assert(swtEntry.Warehouse != null);
        Debug.Assert(swtEntry.Tier != null);
    }

    var settings = new JsonSerializerSettings { PreserveReferencesHandling = PreserveReferencesHandling.All };
    string newJson = JsonConvert.SerializeObject(services, settings);
    Debug.Assert(json == newJson);

    var options = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };
    newJson = System.Text.Json.JsonSerializer.Serialize(services, options);
    Debug.Assert(json == newJson);
}

dotnet version: netcoreapp2.1
Newtonsoft.Json version: 12.0.3
System.Text.Json version: 5.0.0-rc.2.20475.5

@Jozkee That's very odd. I've tried the same code above (except with the added output that I used to see all the results quickly) and it fails just as it did previously. My code is below and I'm using the versions specified underneath.

Highlights are that the assertions in the foreach sometimes fail (similarly, the output has missing values), and the re-serialise-and-compare-to-the-old-JSON assertion fails too. I really don't know what's different between them.

static void Main(string[] args)
{
    string json = File.ReadAllText(@"C:\GIT C\LabelServiceRepro\Outputs\Json - Service using STJ with ReferenceHandler.Preserve.json");
    List<ServiceWarehouseTierEntry> services = JsonConvert.DeserializeObject<List<ServiceWarehouseTierEntry>>(json);

    Console.WriteLine($"All entries below should have all 3 values. Some of the values should appear on multiple entries.: ");
    Console.WriteLine($"Received (in the following format): ");
    Console.WriteLine($"[Index]: Service.Name | Tier.TierName | Warehouse.Name ");

    for (int i = 0; i < services.Count; i++)
    {
        var line = services[i];
        Console.WriteLine($"[{i}]: {line.Service?.Name} | {line.Tier?.TierName} | {line.Warehouse?.Name}");
    }

    foreach (ServiceWarehouseTierEntry swtEntry in services)
    {
        Debug.Assert(swtEntry.Service != null);
        Debug.Assert(swtEntry.Warehouse != null);
        Debug.Assert(swtEntry.Tier != null);
    }

    var settings = new JsonSerializerSettings { PreserveReferencesHandling = PreserveReferencesHandling.All };
    string newJson = JsonConvert.SerializeObject(services, settings);
    Debug.Assert(json == newJson);
}

dotnet version: 2.1, 2.2, 3.0, 3.1
Newtonsoft.Json version: 12.0.1 (on 2.1, 2.2), 12.0.3 (on 2.1, 3.0, 3.1)

@Jozkee, what are the current thoughts on this issue? It's gotten so long.
For me, what might be more useful than an equivalent to Newtonsoft's ReferenceLoopHandling.Ignore would be a new option that only writes references where they're needed, unlike the current behavior of STJ's ReferenceHandler.Preserve.

@ericsampson can you provide some sample payloads that you would expect to get by using the "new option" that you mention? I assume you are saying that it is fine to write the metadata ($id, $ref, and $values) as long as it is only written for the objects where a cycle is detected.

I think that approach is not possible for System.Text.Json: the writer is forward-only, so by the time a cycle is detected, the object that needs an $id has already been written without one and the serializer cannot go back to insert it. It either writes $id on every object (as Preserve does) or on none.
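
To make that concrete, here is a small illustration (a sketch I put together, not output quoted from this thread; the Item type is made up for the example) of how Preserve has to stamp an $id on every object up front, even when only one of them is ever referenced back:

using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical type, used only for this illustration.
class Item
{
    public string Name { get; set; }
    public Item Next { get; set; }
}

class Program
{
    static void Main()
    {
        var first = new Item { Name = "first" };
        var second = new Item { Name = "second", Next = first };
        first.Next = second; // the only back-reference in the graph

        var options = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };
        Console.WriteLine(JsonSerializer.Serialize(first, options));
        // {"$id":"1","Name":"first","Next":{"$id":"2","Name":"second","Next":{"$ref":"1"}}}
        // "$id":"2" is written even though no "$ref" ever points at it, because a
        // forward-only writer cannot know, while writing 'second', whether a later
        // back-reference will need that id.
    }
}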

@Jozkee yes you understand what I was suggesting. I was guessing that it might not be possible due to a forward-only design, but wanted to mention it anyway.

Instead, I'll try to provide a motivation for the use of Ignore, because that seems like we haven't heard that much here. So we have a central structured logging service where users can send arbitrary C# objects, and then our service writes them to several targets (Elasticsearch, APM, etc) and these targets often want the payload to be JSON. So we need to be able to transform arbitrary user-created objects whose structure is out of our control into JSON. It is not uncommon at all for these objects to contain circular references. In order to prevent exceptions when serializing to JSON, we use Ignore with Newtonsoft and hence have not been able to switch these services to STJ.

In this use case, the inability of Ignore to round-trip (which has been mentioned upthread as a reason to not offer this behavior) is a complete non-issue. We just need to be able to serialize the object into something human-readable in all cases without the serialization blowing up, even if there is some information loss - because the end consumer of the JSON is human eyeballs in Elastic/APM/etc. It's better for us to get some information to the user in the cases where they passed circular objects, than having to drop the user's records entirely.
Does that help?

@ericsampson yes that helps. I envision that in order to close this issue we will provide an Ignore option equivalent to Newtonsoft's given the popular demand.

Thanks David - there was just so much heat around the 'why is this useful' question earlier that I wanted to try and give some different use cases :)

In an ideal world I'd prefer to have an 'emit only needed references' option and accept the performance hit, but I can see how that might take a lot of work given the way the codebase is structured. I guess I could write some sort of post-processing step on top of the current 'preserve' output, though that would be a decent amount of work.
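
For reference, here is a rough sketch of what such a post-processing step could look like, using System.Text.Json.Nodes (.NET 6+). The PreservePostProcessor name and the whole approach are made up for illustration - this is not an existing or proposed API, and it deliberately leaves the $values wrappers that Preserve writes around collections untouched:

using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Nodes;

// Hypothetical helper: takes ReferenceHandler.Preserve output and removes every "$id"
// that no "$ref" in the document points at, leaving metadata only where it is actually
// needed to break a cycle or duplicate.
// Known gap: the {"$id":...,"$values":[...]} wrappers around collections keep their
// "$values" property, so the result is for human consumption, not round-tripping.
static class PreservePostProcessor
{
    public static string RemoveUnusedIds(string preserveJson)
    {
        JsonNode? root = JsonNode.Parse(preserveJson);
        var referencedIds = new HashSet<string>();
        CollectRefs(root, referencedIds);
        StripUnusedIds(root, referencedIds);
        return root!.ToJsonString(new JsonSerializerOptions { WriteIndented = true });
    }

    // First pass: record every id that some "$ref" points at.
    private static void CollectRefs(JsonNode? node, HashSet<string> ids)
    {
        switch (node)
        {
            case JsonObject obj:
                if (obj.TryGetPropertyValue("$ref", out JsonNode? target) && target is JsonValue value)
                    ids.Add(value.GetValue<string>());
                foreach (var property in obj)
                    CollectRefs(property.Value, ids);
                break;
            case JsonArray array:
                foreach (JsonNode? item in array)
                    CollectRefs(item, ids);
                break;
        }
    }

    // Second pass: drop "$id" properties whose id is never referenced.
    private static void StripUnusedIds(JsonNode? node, HashSet<string> ids)
    {
        switch (node)
        {
            case JsonObject obj:
                if (obj.TryGetPropertyValue("$id", out JsonNode? idNode) &&
                    idNode is JsonValue idValue &&
                    !ids.Contains(idValue.GetValue<string>()))
                {
                    obj.Remove("$id");
                }
                foreach (var property in obj)
                    StripUnusedIds(property.Value, ids);
                break;
            case JsonArray array:
                foreach (JsonNode? item in array)
                    StripUnusedIds(item, ids);
                break;
        }
    }
}

Usage would be along the lines of string trimmed = PreservePostProcessor.RemoveUnusedIds(preserveJson), run over the output of a Preserve serialization.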

Thanks for continuing the dialogue!
Cheers

Somewhere I've read that feedback is desired:
Just another customer using ReferenceLoopHandling.Ignore. It seems like a must-have feature - common sense, given the aim is to provide the best serialization experience?

Also, please correct the documentation. There is no equivalent for ReferenceLoopHandling in System.Text.Json, so why mislead?
https://docs.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-migrate-from-newtonsoft-how-to?pivots=dotnet-5-0#preserve-object-references-and-handle-loops

@stevozilik can you point/explain to me what's the misleading part?

@Jozkee happy to

can you point/explain to me what's the misleading part?

The docs suggest that an equivalent of the ReferenceLoopHandling config option is available in System.Text.Json:
ReferenceLoopHandling global setting | ✔️ ReferenceHandling global setting

class CircularRecord
{
    public CircularRecord Reference { get; set; }
}
var recA = new CircularRecord();
var recB = new CircularRecord();
recA.Reference = recB;
recB.Reference = recA;

Newtonsoft using ReferenceLoopHandling.Ignore
{"Reference":{}}

System.Text.Json using ReferenceHandler.Preserve (or is there another config option that I have missed?)
{"$id":"1","Reference":{"$id":"2","Reference":{"$ref":"1"}}}

Completely different output.

If the guidance is that I can code it myself by implementing ReferenceHandler, that's like going to a shop that advertises orange juice when all they actually have is oranges - fair, or misleading?

I guess #30820 (comment) is quite a funny comment, and it probably also highlights just how natural Ignore is (considering the MS EF documentation suggests it as the number one solution). EF is quite a relevant library with a lot of use cases, so that should guide the relevancy discussion a little 😊

@cjblomqvist it was meant to be funny and also to show a bit of frustration about all the (IMHO) over-thinking in this discussion, which led to postponing Ignore to .NET 6, one year from now, despite it being obviously needed in some use cases. 😄

Video

  • The concern would be to ensure that the output is deterministic regardless of how we're traversing the object graph. Is this guaranteed? If that's the case, this feature seems reasonable.
  • When a list contains a cycle, we should write out a null value instead of just omitting the element, as that's consistent with the ignore-null behavior and doesn't change indices (see the sketch after the API shape below).
namespace System.Text.Json.Serialization
{
    public partial class ReferenceHandler
    {   
        // Existing:
        // public static ReferenceHandler Preserve { get; }
        public static ReferenceHandler IgnoreCycle { get; }
    }
}
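
As a small sketch of the array behavior from the second bullet above (using the IgnoreCycle name from the proposal, so this is expected rather than observed output), a list that contains itself would keep its element count, with the cyclic element written as null:

using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

class Program
{
    static void Main()
    {
        var list = new List<object> { "first" };
        list.Add(list); // the list contains itself

        var options = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.IgnoreCycle };
        Console.WriteLine(JsonSerializer.Serialize(list, options));
        // Expected under the proposed behavior: ["first",null]
        // The cyclic element becomes a JSON null rather than being omitted,
        // so the indices of the remaining elements don't shift.
    }
}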

@terrajobst FWIW, here's our use case for this functionality, since there were a number of questions about this on the video:

"So we have a central structured logging service where users can send arbitrary C# objects, and then our service writes them to several targets (Elasticsearch, APM, etc) and these targets often want the payload to be JSON. So we need to be able to transform arbitrary user-created objects whose structure is out of our control into JSON. It is not uncommon at all for these objects to contain circular references (e.g. due to Entity Framework). In order to prevent exceptions when serializing these arbitrary user-created objects to JSON, we use Ignore with Newtonsoft and hence have not been able to switch these services to STJ.

In this use case, the inability of Ignore to round-trip (which has been mentioned upthread as a reason to not offer this behavior) is a complete non-issue. We just need to be able to serialize the object into something human-readable in all cases without the serialization blowing up, even if there is some information loss - because the end consumer of the JSON is human eyeballs in Elastic/APM/etc. It's better for us to get some information to the user in the cases where they passed circular objects, than having to drop the user's records entirely."

This feature seems specific to a handful of related but somewhat different scenarios. It may be safer and more flexible to improve the existing extensibility model to support ignore and provide various samples around that.
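
For context on what the existing extensibility model allows today, here is a sketch of a custom handler/resolver pair (the MyReferenceHandler and MyReferenceResolver names are made up for the example). Note the limitation it illustrates: GetReference can only report alreadyExists, which makes the serializer write a $ref, so a custom resolver has no way to request null or to skip the value - which is why an ignore behavior cannot be built on this model as it stands:

using System.Collections.Generic;
using System.Text.Json.Serialization;

// Illustrative names; this mirrors what Preserve does internally rather than adding anything new.
class MyReferenceHandler : ReferenceHandler
{
    public override ReferenceResolver CreateResolver() => new MyReferenceResolver();
}

class MyReferenceResolver : ReferenceResolver
{
    private readonly Dictionary<object, string> _objectToId = new(ReferenceEqualityComparer.Instance);
    private readonly Dictionary<string, object> _idToObject = new();
    private uint _count;

    public override string GetReference(object value, out bool alreadyExists)
    {
        if (_objectToId.TryGetValue(value, out string? id))
        {
            alreadyExists = true;   // the serializer will write {"$ref": id}
            return id;
        }

        id = (++_count).ToString();
        _objectToId.Add(value, id); // the serializer will write "$id": id
        alreadyExists = false;
        return id;
    }

    public override void AddReference(string referenceId, object value)
        => _idToObject[referenceId] = value;   // used during deserialization

    public override object ResolveReference(string referenceId)
        => _idToObject[referenceId];           // used during deserialization
}

It could be plugged in via new JsonSerializerOptions { ReferenceHandler = new MyReferenceHandler() }, but the output would still be preserve-style metadata.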

The concern would be to ensure that the output is deterministic regardless of how we're traversing the object graph. Is this guaranteed? If that's the case, this feature seems reasonable.

I was under the assumption that the behavior could be non-deterministic in some cases, but that doesn't appear to be the case at least with Newtonsoft. Ignore only applies to the exact moment a cycle is detected. Thus:

  • A given object can be serialized more than once.
  • Cycles are broken by detecting when a child object is a reference to a parent (which is already being serialized).

with the result being that a cyclic graph is converted into an acyclic graph (given a fixed root, of course).

Here's a Newtonsoft test with A, B, and C nodes that all reference each other, with A being the root.

    class Program
    {
        static void Main(string[] args)
        {
            // Uncommenting the line below changes reflection order.
            // typeof(Node).GetProperty("Ref2");

            var a = new Node { Name = "a" };
            var b = new Node { Name = "b" };
            var c = new Node { Name = "c" };

            a.Ref1 = b;
            a.Ref2 = c;

            b.Ref1 = a;
            b.Ref2 = c;

            c.Ref1 = a;
            c.Ref2 = b;

            string json = JsonConvert.SerializeObject(a, Formatting.Indented, new JsonSerializerSettings
            {
                ReferenceLoopHandling = ReferenceLoopHandling.Ignore,
            });

            Console.WriteLine(json);
        }
    }

    public class Node
    {
        public string Name { get; set; }
        public Node Ref1 { get; set; }
        public Node Ref2 { get; set; }
    }

Output:

{
  "Name": "a",
  "Ref1": {
    "Name": "b",
    "Ref2": {
      "Name": "c"
    }
  },
  "Ref2": {
    "Name": "c",
    "Ref2": {
      "Name": "b"
    }
  }
}

B and C are serialized twice, but the root A is only serialized once. So B->A and C->A are ignored since in those cases the "child" node A was already serialized as a parent.

If the reflection order changes (uncomment the line that mentions this), the data is still the same, just in a different order:

{
  "Ref2": {
    "Ref2": {
      "Name": "b"
    },
    "Name": "c"
  },
  "Name": "a",
  "Ref1": {
    "Ref2": {
      "Name": "c"
    },
    "Name": "b"
  }
}

I assume the design will cover:

  • Value type semantics (do we use Equals?)
  • Reference type semantics on whether Equals() vs. ReferenceEquals() is used.
    • I assume we want ReferenceEquals() since that is what we use for preserve reference handling.
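
To make the Equals vs. ReferenceEquals distinction in the last bullet concrete, here is a small hypothetical illustration (the ValueNode type is made up for the example): two distinct instances that happen to compare equal by value must not be mistaken for a cycle; only an actual self-reference should be.

using System;

// Hypothetical type made up for this illustration: value equality by Name.
class ValueNode : IEquatable<ValueNode>
{
    public string Name { get; set; }
    public ValueNode Next { get; set; }

    public bool Equals(ValueNode other) => other != null && Name == other.Name;
    public override bool Equals(object obj) => Equals(obj as ValueNode);
    public override int GetHashCode() => Name?.GetHashCode() ?? 0;
}

class Program
{
    static void Main()
    {
        var a = new ValueNode { Name = "same" };
        var b = new ValueNode { Name = "same" }; // equal by value, but a different instance
        a.Next = b;

        // Equals-based tracking would wrongly flag 'b' as a cycle and drop it or write null;
        // ReferenceEquals-based tracking, as Preserve already uses, keeps it.
        Console.WriteLine(a.Equals(a.Next));           // True  - value equality
        Console.WriteLine(ReferenceEquals(a, a.Next)); // False - not actually a cycle

        a.Next = a; // a genuine self-reference
        Console.WriteLine(ReferenceEquals(a, a.Next)); // True  - this is the real cycle
    }
}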

Sending back to api-ready-for-review in order to discuss a more suitable name for the API:

1. IgnoreCycle: Original proposal.
2. IgnoreCycles: From #46101 (comment) one could read Preserve as PreserveDuplicates, which would be plural too.
3. BreakCycles: Given that we are breaking cycles by emitting the JSON null token and to be clear about the behavior diverging from Newtonsoft.
4. EmitNullForCycles: From #40099 (comment).

What about something like

EmitNullForCycles

@ericsampson sounds like a reasonable name that exactly describes what the API does.

Changed to plural via email:

namespace System.Text.Json.Serialization
{
    public partial class ReferenceHandler
    {   
        // Existing:
        // public static ReferenceHandler Preserve { get; }
        public static ReferenceHandler IgnoreCycles { get; }
    }
}
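
For completeness, a usage sketch with the final name, assuming the null-emitting behavior from option 3 above (expected rather than observed output; the Employee type is made up for the example):

using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical type made up for this example.
class Employee
{
    public string Name { get; set; }
    public Employee Manager { get; set; }
}

class Program
{
    static void Main()
    {
        var boss = new Employee { Name = "Angela" };
        boss.Manager = boss; // self-reference

        var options = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.IgnoreCycles };
        Console.WriteLine(JsonSerializer.Serialize(boss, options));
        // Expected, given the null-emitting behavior described above:
        // {"Name":"Angela","Manager":null}
    }
}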