Remove BinaryFormatter from the shared framework in .NET 5

Question

Remove BinaryFormatter from the shared framework in .NET 5

GrabYourPitchforks opened this issue 5 years ago · comments

We've known for a long while that BinaryFormatter is a dangerous type and that it's responsible for many of the security vulnerabilities real-world .NET applications have seen in recent years. In .NET Core 1.x, the type was removed from the product entirely, and it was brought back in .NET Core 2.0 under the guise of "app compat".

Though as of the time of this writing there are no known types in .NET Core which can be leveraged by BinaryFormatter to achieve RCE, it is impractical to make this guarantee going forward. As new types are introduced (or existing types from the full Framework are re-introduced) I have no doubt that one of them will contain such a vector that we weren't able to catch during development due to the complexity of this system.

Some applications use a custom SerializationBinder to mitigate attacks both on .NET Core and in full Framework. The engineering team's current understanding is that a properly-written SerializationBinder is sufficient to block RCE attacks. However, this is not a guarantee, and even so, a custom SerializationBinder cannot mitigate other attacks like DoS within the BinaryFormatter infrastructure itself.

All of this said, the conclusion is that BinaryFormatter is not an appropriate serialization format for application developers to be using in a cloud-connected world. It was a mistake for us to bring it back in .NET Core 2.0, as us re-introducing it into the shared framework gives the type an undeserved air of respectability, like we're nodding and implicitly approving its usage by modern applications.

To address this, and to eventually migrate applications off of BinaryFormatter, I propose the following steps. These steps will only affect customers who target .NET 5 or above for their applications.

Remove BinaryFormatter from the shared framework and into its own standalone package that must be manually retrieved and installed into the application.
Mark the BinaryFormatter type itself with [Obsolete], or include in the package a code analyzer which warns on the type's usage. Force developers to explicitly acknowledge that they're calling an extremely dangerous API and that they must be certain the data is coming from a trusted location.
Document the BinaryFormatter code base as legacy only. Clearly communicate that Microsoft is making no further investments in this code base, and all bug reports or feature requests will be immediately closed.
Create a separate "safe" (type-limiting, bounds-checking, etc.) serializer whose payload format is BinaryFormatter-compatible and which can be used to deserialize existing payloads in a safe fashion, even if those payloads originate from an untrusted source. The API surface will look different than the traditional BinaryFormatter.Deserialze call due to needing to configure the serializer on a per-call basis, but at minimum it gives applications a way to read existing payloads. This allows them to migrate their data without leaving them hanging. This serializer could be in the same package that BinaryFormatter is in.

These steps would serve to modernize the runtime and SDK by eliminating a type which has no place in today's development cycle, it serves as a forcing function to get application developers to fix potential vulnerabilities in their code, and it provides on an opt-in basis a way for applications to continue to work if they really must continue using the type going forward.

Barry Dorrans · Answer 1 · Sat Jun 22 2019 03:04:11 GMT+0800 (China Standard Time)

Barry Dorrans commented 5 years ago

Symbai · Answer 2 · Sat Jun 22 2019 03:08:03 GMT+0800 (China Standard Time)

BinaryFormatter has the advantage that it's incredibly fast. Faster than any other formatter, by a lot. If you consider to remove it, please consider to provide a reasonable alternative, which is fast and secure. If you do, then I don't see any reason to not agree with that, regardless whether my personal opinion counts or not.

Nick Craver · Answer 3 · Sat Jun 22 2019 03:15:54 GMT+0800 (China Standard Time)

@Symbai I think a very important thing to keep in mind here is that BinarySerializer is not fast and secure. It is fast largely because it is not secure.

Levi Broderick · Answer 4 · Sat Jun 22 2019 03:17:00 GMT+0800 (China Standard Time)

@Symbai If you're looking for something where the payload format is human-readable, check out the work taking place in this repo in the System.Text.Json namespace. Or if you want pure performance over human readability, you may want to check out Google's ProtoBuf project, which has C# APIs.

In the proposal I have in this issue, I do suggest that we should create a BinaryFormatter-compatible "safe" deserializer, but performance wouldn't be a driving factor of such a serializer. (Though we wouldn't do anything to intentionally tank performance.) One of the other formats might be more appropriate for your needs given this.

Chris McKee · Answer 5 · Sat Jun 22 2019 03:21:00 GMT+0800 (China Standard Time)

Pushing it into its own package seems a reasonable compromise.

TeBeCo · Answer 6 · Sat Jun 22 2019 06:14:06 GMT+0800 (China Standard Time)

is it a good idea to refer to benaadams/System.Ben#107
for "fast and secure" binary formatter ?

Too soon ?

samsosa · Answer 7 · Sat Jun 22 2019 18:58:16 GMT+0800 (China Standard Time)

Or if you want pure performance over human readability, you may want to check out Google's ProtoBuf project, which has C# APIs.

Than maybe replace the BinaryFormatter implementation under the hood with ProtoBuf?

Viktor Hofer · Answer 8 · Mon Jun 24 2019 02:31:31 GMT+0800 (China Standard Time)

I'm sure you are considering partners here which are relying on BinaryFormatter in .NET Core. Keep in mind to loop them in whenever you are starting this effort.

Marc Gravell · Answer 9 · Tue Jun 25 2019 02:16:57 GMT+0800 (China Standard Time)

@samsosa just to be explicit re:

Than maybe replace the BinaryFormatter implementation under the hood with ProtoBuf?

That isn't a realistic option. Protobuf is a powerful tool, but it has a different feature-set to BinaryFormatter, and there is no good way of making one pretend to be the other, in either direction. Just like XmlSerializer, Json.NET etc are all fundamentally serializers, but don't have he exact same feature set. If people want to use protobuf, that's fine, but they'd need to do that knowingly and deliberately.

Matt Connew · Answer 10 · Tue Jun 25 2019 02:21:40 GMT+0800 (China Standard Time)

I would like to try an put a stop to the trend I'm seeing in multiple issues of ProtoBuf being the solution to all things serialization. It is not a good serialization format, and it was never designed to be one. It is designed to be a compact way to represent a limited set of features for sending an object graph to a gRPC endpoint. It has very severe limitations when trying to be used for general purpose serialization.

It doesn't have support for references so if an object appears in an object graph multiple times, it will be completely serialized multiple times. We've had customers get tripped up by this before when they overrode Equals or GetHashCode incorrectly and ended up with two separate objects on deserialization which caused application logic problems.

The lack of references means it can't handle cycles in your object model which will cause infinite recursion on serialization. It simply cannot handle that scenario at all.

There have been attempts to come up with a way to support references to allow for these two scenarios. The one that protobuf.net does needs explicit opt-in for the types/members which need this support. This requires understanding your entire object model and know where cycles might exist and when having duplicates will be a problem. This could be a LOT of work depending on how complex your data model is. It greatly complicates your on-wire representation and none of the benchmarks I've seen have been against an OM using these features. Any benchmarks on size and speed of serialization need to take this into account. My understanding is that the Google implementation of gRPC serialization doesn't support this feature and expects you to support this mechanism at your OM level and not at the serializer level.

As the response from the ProtoBuf team has been that they aren't going to build this capability into the protocol and go implement your own solution, you lose the interoperability that ProtoBuf is supposed to bring as there will always be competing ways to do this by different ProtoBuf libraries.

Protobuf is a bad format for use with an existing set of classes. It's best use is when designing a new OM with these limitation in mind. E.g. I'm creating a new gRPC endpoint and so define my data model using IDL and then convert that to classes. Going the reverse direction is going to be problematic for many people as Protobuf is not a general purpose serialization format.

Incidentally, I read a blog post comparing MessagePack, ProtoBuf, JSON.NET and DataContractSerializer comparing size of output and speed of serialization and deserialization where it showed DataContractSerializer to be the worst on both fronts. I cloned their code and added a variation for DataContractSerializer using a binary XmlDictionaryReader/Writer and the size was about 10% bigger than ProtoBuf, and the time to serialize and deserialize was about 10% slower than ProtoBuf. If you were to have multiple duplicate objects and you turned on the reference support in DCS, you could easily create a case where DCS is both faster and smaller than ProtoBuf.

Levi Broderick · Answer 11 · Tue Jun 25 2019 03:51:31 GMT+0800 (China Standard Time)

@mconnew Thank you for the excellent analysis! You're right - I should have been more careful with my wording in my earlier comment. I didn't intend to suggest that ProtoBuf is The One True Replacement(tm) going forward, just that it's one of many options available to developers depending on their scenarios.

I think part of the solution is that we need to convince the ecosystem that a general purpose serializer that can handle arbitrary object graphs is not a sustainable concept. Serializers like DataContractSerializer handle this to a limited extent, but they also place restrictions on the shape the graph must take and limits on what deserialization looks like. Even support for cyclic object graphs is inherently unsafe unless the application explicitly expects to deserialize such cycles, as DataContractSerializer allows on an opt-in basis.

Whether this is a documentation issue that gives best practices for creating DTOs which can be safely serialized, whether we recommend switching to a safe but restricted-capability serializer like DataContractSerializer, or whether we take some other course of action are all up in the air. (It's also possible that we do nothing, but clearly I'm opposed to that idea since I opened this issue. 😊)

Another option - though one that might be more difficult to swallow since it'll require significant work within the ecosystem - would be to create a set of educational guidelines that explains in detail the concepts behind serialization, trust boundaries, and how the two interact. This would make individual developers responsible for analyzing their specific scenarios and choosing appropriate data models and serialization formats which are appropriate for their needs, while also giving them the knowledge to make trade-offs when necessary. I'm not a fan of this solution by itself because it presents a steep learning curve for our audience and goes against .NET's general premise of creating usable frameworks. But perhaps these educational guidelines could be useful in conjunction with another more approachable solution.

Matt Connew · Answer 12 · Tue Jun 25 2019 03:59:24 GMT+0800 (China Standard Time)

@GrabYourPitchforks, my comment was in response to my seeing multiple places where people were suggesting using ProtoBuf for serialization. It's NOT a serialization format and I just wanted to make sure I put my thoughts out there somewhere contextually relevant.
I believe that DataContractSerializer does about the best job it's possible to do with serializing. We could do with adding some helpers to enable the use of the binary XmlDicationaryReader/Writer as the dictionary needs to be transmitted OOB otherwise you pay the high cost of the Text output.

Marc Gravell · Answer 13 · Tue Jun 25 2019 14:36:30 GMT+0800 (China Standard Time)

@mconnew I don't want to distract from the conversation about BinaryFormatter, so I don't suggest "here", but I'd love to have an offline conversation with you about what does and doesn't qualify as a "serialization format" and why; it seems we have some different ideas there, and I'd love to hear a new perspective. I'm easy to get hold of if you're willing :)

Marco Rossignoli · Answer 14 · Tue Jun 25 2019 15:39:26 GMT+0800 (China Standard Time)

but I'd love to have an offline conversation with you about what does and doesn't qualify as a "serialization format" and why;

IMO would be useful for all devs have an "open" discussion/listening on it...all devs have to serialize something sometimes in career and the hide issues and choices are not so trivial.
Maybe a linked "discussion" to this issue.
Not mandatory though.

Marc Gravell · Answer 15 · Tue Jun 25 2019 19:55:35 GMT+0800 (China Standard Time)

@MarcoRossignoli great idea; I've dropped a discussion here; I'd love your input there if you have the time, @mconnew

Tobias Käs · Answer 16 · Tue Jul 02 2019 04:56:13 GMT+0800 (China Standard Time)

BinaryFormatter is an established format to transfer data through the clipboard and during drag'n'drop operations and likely to be in use in existing Desktop Framework applications. If you rip that out of the .NET Core framework without a replacement that can produce/consume the same binary data then .NET Core 5 applications will be unable to interact with Desktop Framework applications through drag'n'drop/clipboard when custom formats are in use.

Levi Broderick · Answer 17 · Mon Jul 15 2019 00:27:08 GMT+0800 (China Standard Time)

Thanks @weltkante for your feedback! Your point that there needs to be some replacement is well-taken, and I think the proposal outlined here addresses the issue. Let me elaborate.

The proposal here is to remove BinaryFormatter from the shared framework specifically. The consequence is that any component that is part of the shared Framework (such as WinForms or WPF) or any component that references only the shared Framework without any additional external NuGet packages will not be able to perform arbitrary object deserialization via BinaryFormatter. In fact, both WinForms and WPF have already made strides in this regard (e.g., dotnet/wpf#1132) to disallow arbitrary object deserialization while still retaining support for deserializing very specific allowed types.

In the scenario you describe, even though the Framework itself won't have default built-in support for deserializing arbitrary objects on clipboard / drag-and-drop operations, the application can continue to support the legacy behavior via hooking the clipboard / drag-and-drop handlers. For example, if the application hooks the "paste" handler and determines that the incoming data is a BinaryFormatter-formatted payload, the application's custom handler can call into the deserializer directly. I conceptualized a "safe" BinaryFormatter-payload compatible deserializer earlier in this issue, and I'd encourage application developers to use that deserializer in place of the legacy deserializer whenever possible. But in the end, even though we may discourage it, nobody will prevent application code from invoking the old unsafe arbitrary object deserialization routine. The difference going forward is that the Framework itself shall not call into the unsafe deserializer on the application's behalf; the app developer explicitly needs to write that code.

This should provide a viable outline for applications which absolutely need to retain the existing Full Framework behaviors to be ported to .NET Core.

Tobias Käs · Answer 18 · Mon Jul 15 2019 01:20:19 GMT+0800 (China Standard Time)

Ok, I see where you are going with that, moving it into a separate component may work. However note that just deserialization won't be enough for all compat scenarios, interacting with Desktop Applications through clipboard or drag'n'drop may also require serialization (e.g. if you want to put something into the clipboard that the existing Desktop Application should consume, or drag something onto a Desktop Application Window).

Harald Mühlhoff · Answer 19 · Mon Oct 28 2019 23:56:38 GMT+0800 (China Standard Time)

Just a quick reminder that new client / server applications that cost millions to develop and will have to be maintained for more than a decade are currently being built using BinaryFormatter. Even in times of "assume breach" using BinaryFormatter over a secured internal network connection has it's merits and the stability of the API surface is crucial for the credibility of the .NET platform.

Moving the BinaryFormatter to a separate Nuget package in .NET 5 maybe an option but

Document the BinaryFormatter code base as legacy only. Clearly communicate that Microsoft is making no further investments in this code base, and all bug reports or feature requests will be immediately closed.

goes too far from my point of view. Especially as it's not completely clear that a viable (=feature-complete; drop-in - with some configuration) more secure replacement that will handle serialization AND deserialization will be available.

Currently we're using .NET Framework on the client (MvvmCross, WPF) and .NET Core 2.2 (ASP.NET Core) on the server planning to migrate both to .NET Core 3. Having to live with the fear (=project risk) that Microsoft may "drop" / declare obsolete APIs just added to their newest platform is not a great outlook.

Still I appreciate that this discussion is happening in the open and customers can participate and hopefully be heard ;)

Levi Broderick · Answer 20 · Tue Oct 29 2019 02:12:03 GMT+0800 (China Standard Time)

@HaraldMuehlhoffCC Thank you for your feedback. Your comments hit on a few key points which I think are prime examples of why I had proposed this work item to begin with.

You are correct in that new applications (in 2019!) are being built with BinaryFormatter dependencies. IMO this is in large part due to the fact that Microsoft as a whole has not presented a unified message saying "do not use this feature going forward." Yes, there's the occasional blog post or security conference where we mention the problems, and the official docs discourage its use. But out of our entire developer ecosystem only a small fraction are tuned into those channels. The proposal here is intentionally disruptive, forcing developers who are using this type (or who want to use it) to take explicit action to continue using it in future versions. The acknowledgement process doesn't need to be difficult. Perhaps the process is as easy as pulling down another nuget package into the application - that's only a few button clicks or a single NuGet Package Manager command. Or perhaps it's adding a particular line to the .csproj file.

Even in times of "assume breach" using BinaryFormatter over a secured internal network connection has it's merits

As a security professional, I strongly disagree with this statement. This might be an appropriate assumption for your application today, as perhaps you control both the client and the server in tightly restricted environments. But as the application evolves, it's almost certain that this assumption might not hold going forward. Want to move your service to the cloud? Want to deploy the client on customers' workstations? Want to add the ability for additional less-authenticated users to access the application (perhaps in a read-only mode)? These processes are all now significantly more difficult because the application protocol is based on the insecure BinaryFormatter format.

Another way to think about this is to ask the following questions: "Would the server be comfortable allowing the client to upload an arbitrary .exe to the server and run it as part of request processing? Would the client be comfortable allowing the server to transmit an arbitrary .exe and run it as part of response processing? Is it acceptable for this behavior to be a permanent part of the application's request/response protocol?"

I would wager that vanishingly few of our developer audience would answer affirmatively to all three questions. But because our messaging on BinaryFormatter has been somewhat weak, our developers by and large don't realize that this is what they're signing up for. In my mind that's far more perilous for the ecosystem than any API adjustment would be.

Harald Mühlhoff · Answer 21 · Tue Oct 29 2019 03:03:06 GMT+0800 (China Standard Time)

@GrabYourPitchforks Thanks for your comment. We have indeed control both of the client and the server and it's running in a tightly restricted environment (electric utility companies). And this won't change for the client I'm talking about. Our web client is of course using a different protocol (JSON).

I don't quite get your running arbitrary code analogy. In a sense all protocols use some binary encoding (in the JSON & XML cases it's just UTF-8 ...). Just because the BinaryFormatter isn't using a human readable format doesn't mean that it's creating some sort of executable. Or is there some dark magic inside the implementation generating and executing IL? But still then it would be possible to add sanity / security checks ... this was the whole point about IL.
Also we're not dependant on the exact details of the binary format as we don't store it. So if there is something wrong with the implementation details of the current encoding we'd be happy to switch to another Formatter AS LONG AS it would be feature-compatible and not much slower.

Finally my primary concern with your proposal was

Document the BinaryFormatter code base as legacy only. Clearly communicate that Microsoft is making no further investments in this code base, and all bug reports or feature requests will be immediately closed.

How does this make the systems that will still be relying on BinaryFormatter any safer? Especially the "all bug reports [...] will be immediately closed" part ...

Barry Dorrans · Answer 22 · Tue Oct 29 2019 03:07:44 GMT+0800 (China Standard Time)

Just because the BinaryFormatter isn't using a human readable format doesn't mean that it's creating some sort of executable.

I'm afraid you're incorrect. It's not a matter of human readable versus binary, it's how the serialization works. BinaryFormatter puts the class information inside the serialized data. When it's deserialized that information is used to create the objects embedded in the file. This is dangerous, as there are classes within .NET which can be abused to deliver code execution. In fact json.net has the same option, but it's off by default. In BinaryFormatter we can't turn it off at all.

There have been numerous presentations about this at various hacker conferences, the seminal work being Are you my type? and the best recent one is Friday the 13th JSON Attacks.

Marc Gravell · Answer 23 · Tue Oct 29 2019 04:00:56 GMT+0800 (China Standard Time)

we'd be happy to switch to another Formatter AS LONG AS it would be feature-compatible and not much slower.

It is hard to comment on "feature-compatible" here without talking about what features you're using, as several of BinaryFormatter's "features" are really anti-features, but: you might find it relatively easy and performant to try protobuf-net (I'm hugely biased, disclosure blah blah). I should note that protobuf-net *also* had a similar optional / opt-in "bake your type meta into the data" feature, but I'm killing it hard in V3 for multiple reasons including the security aspect and the fact that a lot of types have moved around in .NET Core. Fortunately protobuf has a related "any" concept, which I'll be implemented very soon.

Barry Dorrans · Answer 24 · Tue Oct 29 2019 04:23:25 GMT+0800 (China Standard Time)

And if you want to see examples of binary formatter abuse in action; https://youtu.be/kz7wmRV9xsU?t=1273

Harald Mühlhoff · Answer 25 · Tue Oct 29 2019 04:51:59 GMT+0800 (China Standard Time)

Thank you, @blowdart and @mgravell! I get that security is not free & I've been actively following quite a few security folks on Twitter. Nevertheless for our very large object model I wouldn't necessarily want to add as much metadata as protobuf-net seems to require (even though we generate part of our model via a custom ORM and could generate attributes for some of the classes programmatically).

We also do some clever stuff serializing LINQ-Queries using custom serializable Expression trees; even serializing Types (via their names & structural info in some cases). But we make sure on the server that the resulting queries are safe.

I would prefer some sort of whitelisting the classes over all this verbosity. Actually shouldn't I be able to do exactly this in a custom ISurrogateSelector implementation (or custom SerializationBinder)?!

Also we control the server, the client, the network and secure the communication channel; so I'm still not completely sold that the security implications are so pressing in our case.

And as other formatters can be made unsafe by adding types / classes (see https://www.newtonsoft.com/json/help/html/SerializationSettings.htm#TypeNameHandling) BinaryFormatter can be made safe(er) by controlling the (de)serialized types.
So from my point of view there's nothing inherently "evil" with regards to BinaryFormatter.

Barry Dorrans · Answer 26 · Tue Oct 29 2019 04:56:43 GMT+0800 (China Standard Time)

As soon as we start adding limitations it's no longer compatible. We can't break the entire world like that. And surrogates have their own problems with DoS and OOM attacks.

Harald Mühlhoff · Answer 27 · Tue Oct 29 2019 05:11:54 GMT+0800 (China Standard Time)

@blowdart "With great power comes great responsibility" ;)

The currently available API should make it possible to add whitelisting types if required; see public virtual ISerializationSurrogate GetSurrogate(Type type, StreamingContext context, out ISurrogateSelector selector) in ISurrogateSelector (or SerializationBinder - whichever works best).

I'm with you that it's a good idea to make using the BinaryFormatter a conscious choice; even maybe adding whitelisting as a (changeable) default in .NET 5. I'm not seeing how this would be no longer compatible... (changes would have to be made anyway to reference an external Nuget package).

But forcing your customers to roll their own solutions won't make the world safer either.

Thanks everyone for the enlightening discussion; I have to get a few hours sleep as I have to get up quite early over here in Germany.

Barry Dorrans · Answer 28 · Tue Oct 29 2019 05:14:40 GMT+0800 (China Standard Time)

Moving it to a separate package doesn't mean you roll your own, you just have to opt-in to using it, with suitable dire warnings.

And then point dotnet/corefx#4 in Levi's plan would get you a safer one anyway.

Harald Mühlhoff · Answer 29 · Tue Oct 29 2019 05:28:43 GMT+0800 (China Standard Time)

@blowdart Then just keep it supported (or at the very least hand it to the community like WCF) ... otherwise customers would still be required to "roll their own" or use some other formatter - that wouldn't necessarily be safer - once critical (security) bugs were discovered. (I'm not talking about the architectural shortcomings you've written about)

Document the BinaryFormatter code base as legacy only. Clearly communicate that Microsoft is making no further investments in this code base, and all bug reports or feature requests will be immediately closed.

I will look into adding whitelisting for our BinaryFormatter usage utilizing the existing APIs. So hopefully something good will come from this discussion! ;)

Levi Broderick · Answer 30 · Tue Oct 29 2019 05:37:28 GMT+0800 (China Standard Time)

I will look into adding whitelisting for our BinaryFormatter usage utilizing the existing APIs. So hopefully something good will come from this discussion! ;)

Per @blowdart's earlier comment (and my original description in the issue), surrogates have their own problems that won't always protect you.

The simple and unfortunate fact is that the BinaryFormatter type is unfixably insecure. Full stop.

That fact is what led to this issue being opened in the first place. Most customers aren't aware that the above sentence explains the state of the world. Hence my desire for somewhat drastic measures to make them aware of this, to migrate existing code away from BinaryFormatter, and to institute within their organizations a blanket ban on using the type in all new code.

Tobias Käs · Answer 31 · Tue Oct 29 2019 05:49:03 GMT+0800 (China Standard Time)

Blanket bans aren't going to fix interoperability with existing clipboard/drag'n'drop usage. Customers are demanding that they can copy data/interact between applications, as far as I'm concerned you can move it into a separate package but BinaryFormatter can't go away as long as classic Desktop Framework WinForms/WPF client applications are around and popular.

Also I've yet to see a suggestion for replacing BinaryFormatter in new applications clipboard/dragdrop scenarios. WinForms/WPF repos don't even have a API plan for .NET 5 encouraging developers to move away from this and to some JSON or whatever else format, BinaryFormatter is still the default in .NET Core here, so all new .NET Core 3.0/3.1 applications still will be based on it. Note in particular that 3.1 is LTS and will have no alternative for BinaryFormatter.

Harald Mühlhoff · Answer 32 · Tue Oct 29 2019 06:17:46 GMT+0800 (China Standard Time)

@GrabYourPitchforks FTP in itself is also "unfixably" insecure with it's cleartext password but can be made more secure by using additional measures like https:.

As we have trusted clients, trusted servers, secured communication channels (only trusted clients can connect to the servers) and secured networks it's almost as if I'm locally passing data between classes. So in our case even passing code might be ok - though it's not planned. I understand that in other scenarios this is often not the case.

I will continue to follow this thread to see how things will unravel ;)

Justin Adler · Answer 33 · Tue Oct 29 2019 06:59:54 GMT+0800 (China Standard Time)

@HaraldMuehlhoffCC You're really really _really _ hell bent on using BinaryFormatter, aren't you? The discussion includes a few really knowledgeable people in this field and they all have given you numerous reasons why BF should not be used.(If anything, you should be chuffed you even got some responses from 1, let alone a few, of those busy folks!)

If you want to go ahead and use - by all means, enjoy. You've been informed so you can't complain, now.

But I'm sorry if you don't like/agree with the reasons why it's not a good product to use in current day applications. The discussion from Levi suggests changes to the future .. for what many people think are for the best. Sometimes, moving forward (for the better) means loosing some backward compat. Moving onto newer frameworks and versions might mean changing code. Make a call -> change (which incurs time + money vs features/bugs fixed) or not. MS do heaps of effort in to minimising breaking changes but sometimes it's for the better.

TL;DR;

you like using BF.
We've all been told to stop using it and use other things.
Either provide some data that refutes their analysis or quietly make a personal call and use BF or not.

Levi Broderick · Answer 34 · Tue Oct 29 2019 07:04:54 GMT+0800 (China Standard Time)

Customers are demanding that they can copy data/interact between applications, as far as I'm concerned you can move it into a separate package but BinaryFormatter can't go away as long as classic Desktop Framework WinForms/WPF client applications are around and popular.

I'm aware of this. There has also been research in abusing drag-and-drop and clipboard functionality in WinForms and WPF applications in order to gain code execution permissions. See for instance this blog post from around a year ago. That's why this issue also proposes a "safe" serializer whose payload is compatible with BinaryFormatter, which could be used by applications which for whatever reason need to interoperate safely with existing payload formats. It would of course have restricted functionality compared to a normal BinaryFormatter instance, otherwise we'd just change BinaryFormatter itself and be done with it.

From reading the responses I think there's a fear that the BinaryFormatter code will be deleted and that customers who are ok with the existing insecure functionality will be stranded. That is not what this issue is about. This issue only proposes removing it from the shared libraries, but it would still exist as a standalone package for customers who want the full functionality of the existing BinaryFormatter. If you're ok with accepting the risk of using this type in your own code, you'll still be free to bring in the package which contains it and use the existing APIs. If you want full [insecure] clipboard functionality as it exists today, you're free to pull the BinaryFormatter package into your application and wire WinForms / WPF to it. I'd strongly discourage such a thing, but at the same time I don't foresee anybody stopping an interested party from doing this.

Justin Adler · Answer 35 · Tue Oct 29 2019 07:05:13 GMT+0800 (China Standard Time)

Edit: look what popped up into my socials, literally minutes after I posted this:

https://pulsesecurity.co.nz/advisories/untitled-goose-game-deserialization

Untitled Goose Game was vulnerable to a code execution vulnerability due to unsafe deserialization in the save game loader.

talk about lol @ timings .....

Ninja edit: [secret hidden thought] I couldn't pass the opportunity to bring Untitled Goose Game into a kewl github discussion. Achievement unlocked. 🥇

Harald Mühlhoff · Answer 36 · Tue Oct 29 2019 13:42:20 GMT+0800 (China Standard Time)

@PureKrome :) with regards to your Ninja edit!

What I'm trying to understand is how use cases of BinaryFormatter differ in their security implications. There surely must be some cases worse than other, aren't there? (There are almost almost some shades of grey)

From my understanding persistently storing some BinaryFormatter output unsigned (and unencrypted) and naively reading it should be worse than exchanging it over an encrypted channel between trusted parties. Am I wrong?

For scenarios with large and complex object graphs being interchanged (yes, I would have prefered a thin web client but the customer wanted a WPF application) I still see that BinaryFormatter has some merits over other formatters. Building this systems is already hard without changing critical components on the fly. And I owe it to my customers that I really understand why such changes would have to be made.

Also I've stated from the beginning that I'm with @GrabYourPitchforks in his intentions to educate the users of BinaryFormatter about the severe security implications in many use cases and move BinaryFormatter to a separate Nuget package. I'm just not convinced that it really needs to be marked as Obsolete and be left unsupported.

I have nothing but respect for the individuals in this thread - some of them I follow on Twitter - and can only hope that my tenacity isn't interpreted to the contrary.
As someone with over 30 years experience in the field, currently a software architect and a longtime Microsoft Partner I`ve been building complex enterprise systems using & recommending Microsoft technologies and especially .NET for a while. I'm always trying to do more security-wise than has been expected of me… so I see myself as an ally and appreciate your efforts to make the platform more secure.

Justin Adler · Answer 37 · Tue Oct 29 2019 14:00:03 GMT+0800 (China Standard Time)

What I'm trying to understand is how use cases of BinaryFormatter differ in their security implications.

When Levi / Barry/ Nick / Mark (and others) are saying to stop using BF, why are you still spending energy trying to "understand use cases"? To me, this reads as: "trying to find scenario's to justify using a BF" ?

I would spend time either:

swapping out BF for something else, if time and money are ok.
or
do nothing and accept the risks with full knowledge of the problem.

I feel like the rest of the reply are attempts at trying to justify using BF, still? If you want to use it - go for it. Just remember everyone has told you the risks. If you feel like those are ok for whatever reason, then no probs .. enjoy.

I'm just not convinced that it really needs to be marked as Obsolete and be left unsupported.

Yep, we hear ya. The suggested reasons for marking it as Obsolete far outweigh your opinion, in this case? If MS are trying to put this to bed once and for all [which I feel is the intent of Levi's issue], this is the best way to communicate to everyone: "please stop using it and use something better and more modern". This is also pretty standard practice -> obsolete it (which gives people warning AND time to update, if they wish) and later on nuke it.

Sometimes change is tough but this is another case of tough-love, for the benefit of the large majority of users.

And it's not like you'll be left out in the cold or even worse, with busted systems. If they move the code to a separate package (as suggested) then you're covered. If you want to stay all modern and up-to-date, you're going to have to come along for the ride, including all changes that occur (usually for the better).

Barry Dorrans · Answer 38 · Wed Oct 30 2019 00:50:11 GMT+0800 (China Standard Time)

I'm just not convinced that it really needs to be marked as Obsolete and be left unsupported.

What support would you expect? It simply cannot be secured, we cannot touch it without breaking compatibility. It's unsupported now, it's just marking it as obsolete would make that more public.

Personal opinion time

I want to get .net to the point where you specifically have to opt-in to using insecure code patterns and insecure classes, and, eventually, remove them all together in a future major version because of their dangers. We're not doing the industry any favours by keeping easily abused code in a framework, we have to move on and help our customers find safer alternatives, or let them make the choice to stay with older versions of .net, and the lack of support that entails. Ruby does it. Swift does it. Why not .NET? Back compat is a security albatross around our necks.

Tobias Käs · Answer 39 · Wed Oct 30 2019 01:49:21 GMT+0800 (China Standard Time)

If back compat is so bad why would anyone even bother to be compatible with Desktop Framework in the first place? Right, because it isn't bad, because it is important. Its the only way to win over the devs which have investments on Desktop Framework.

UWP tried exactly that, dropping compatibility in favor of a shiny new model designed for "security". It didn't work out.

Now BinaryFormatter is just a tiny part of the big picture, and there are already a lot of things dropping compatibility on the road to .NET 5. Will it make the difference between success and failure? Probably not. But it will contribute.

Barry Dorrans · Answer 40 · Wed Oct 30 2019 01:59:37 GMT+0800 (China Standard Time)

So we shouldn't ever improve?

Harald Mühlhoff · Answer 41 · Wed Oct 30 2019 02:28:18 GMT+0800 (China Standard Time)

Why has the comment by @weltkante been marked as off-topic? If even the well-meant comments of collaborators aren't welcome here how can this be an inclusive platform for discourse?
How can Microsoft hope to better understand and serve the needs of it's community & customers if even this civilized discussion between Microsoft-friendly folks makes some "grab their pitchforks" (excuse the pun)?

When Levi / Barry/ Nick / Mark (and others) are saying to stop using BF, why are you still spending energy trying to "understand use cases"? To me, this reads as: "trying to find scenario's to justify using a BF"?

My customers don't necessarily know Levi / Barry / Nick / Mark ... So I have to understand use cases ... and so - in my humble opinion - should everyone that wants to advance security.
Listening to the (very mildy) critical voices shouldn't be that hard.

So we shouldn't ever improve?

Discuss (in a friendly way) & deeply understand the multiple different use cases in detail and work out clear migration pathes for each of them. This is hard but will ultimately lead to the best results security-wise & in customer satisfaction. Just my 5ct.

Barry Dorrans · Answer 42 · Wed Oct 30 2019 02:31:33 GMT+0800 (China Standard Time)

Because it's not really off topic, we would like to understand why people would don't want to go through an extra gesture to keep their insecure code going.

Stephen Toub · Answer 43 · Wed Oct 30 2019 02:39:25 GMT+0800 (China Standard Time)

@blowdart, somehow @weltkante's comment was hidden as off-topic, which is I believe what @HaraldMuehlhoffCC was asking about. I unhid it (and don't know how to see who hid it or why).

Tobias Käs · Answer 44 · Wed Oct 30 2019 02:39:46 GMT+0800 (China Standard Time)

Why has the comment by @weltkante been marked as off-topic?

Because I marked it as such, I collapse my comments if they aren't directly on topic but still worth posting. I agree this was a border case here.

we would like to understand why people would don't want to go through an extra gesture to keep their insecure code going.

doesn't parse for me, but I don't mind if its moved in a separate package or marked obsolete, as long as its still supported (meaning the annotations and class layouts in Core can't change because they are required for BF to work, unless you introduce a compatibility layer). I think it is very premature to declare getting rid of it entirely eventually while still having no plan for a replacement. .NET 3.1 LTS will ship with UI frameworks using BinaryFormatter as default for clipboard/drag'n'drop and you need to have technical knowledge and use some tricks to even store a different serialization format if you are programming new applications instead of porting existing ones.

.NET 5 should have full support, sure you can start marking obsolete if you provide an alternative, but IMHO it should still be in the framework, considering that 3.1 LTS has it as the default. This whole movement to getting rid of it is very much rushed.

Barry Dorrans · Answer 45 · Wed Oct 30 2019 02:40:28 GMT+0800 (China Standard Time)

Ah I didn’t notice that. Wasn’t me either.

…

________________________________ From: Stephen Toub <notifications@github.com> Sent: Tuesday, October 29, 2019 11:39:33 AM To: dotnet/corefx <corefx@noreply.github.com> Cc: Barry Dorrans <Barry.Dorrans@microsoft.com>; Mention <mention@noreply.github.com> Subject: Re: [dotnet/corefx] Remove BinaryFormatter from the shared framework in .NET 5 (#38760) @blowdart<https://github.com/blowdart>, somehow @weltkante<https://github.com/weltkante>'s comment was hidden as off-topic, which is I believe what @HaraldMuehlhoffCC<https://github.com/HaraldMuehlhoffCC> was asking about. I unhid it (and don't know how to see who hid it or why). — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub<https://github.com/dotnet/corefx/issues/38760?email_source=notifications&email_token=AAGCNCUXWIEFSEBPBY5YFALQRB7OLA5CNFSM4H2UDJJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECRUIEI#issuecomment-547570705>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AAGCNCQKQE5GGIMBPX4GE23QRB7OLANCNFSM4H2UDJJA>.

Harald Mühlhoff · Answer 46 · Wed Oct 30 2019 02:44:30 GMT+0800 (China Standard Time)

Good to know that @weltkante hid his comment himself. Peace! ;)

Tobias Käs · Answer 47 · Wed Oct 30 2019 03:12:32 GMT+0800 (China Standard Time)

Also I think the whole discussion about BinaryFormatter on the clipboard making a security breach possible is ill placed. The clipboard isn't a security boundary, it is a communication channel.

If a browser (or other application) is allowing placement of untrusted data of arbitrary formats on the clipboard its a browser security issue, just the same as if the browser were executing untrusted data directly.

Making it impossible for applications to communicate in the name of "security", regardless of it being the clipboard or something else, isn't providing any value to the developer or end user, its just adding a false sense of security by dumbing down the applications you can run. Eventually the developers will implement other ways to solve their communication problem between applications, be it named pipes or files in a data exchange folder or whatever else people come up with, and you have the same problem all over again (but probably worse because implemented in nonstandard ways). Removing functionality does not introduce security, it just pretends to and pushes responsibility to the one providing the replacement. Taken to the extreme, if Windows would not exist any longer it wouldn't be their security problem anymore, but the one who provides the replacement. I don't think thats a solution.

So, yes, I believe removing BinaryFormatter doesn't fix any security problems, it just shifts the blame. People who used it in the wrong context will also use the 3rd party replacement in the wrong context. Removing it will not educate them.

Barry Dorrans · Answer 48 · Wed Oct 30 2019 03:16:31 GMT+0800 (China Standard Time)

Removing BinaryFormatter doesn't stop applications communicating, it just means they need to find another way to do it, be it json or bson or anything else that is a constrained, safe, serializer. You are painting BinaryFormatter as the only way to do something and that simply isn't true.

Tobias Käs · Answer 49 · Wed Oct 30 2019 03:18:04 GMT+0800 (China Standard Time)

You are painting BinaryFormatter as the only way to do something and that simply isn't true.

It is the only builtin and comfortably usable way when using the clipboard. We need new applications to communicate with legacy applications through the clipboard/dnd. As I said before, provide a replacement, give it time to get adopted, THEN you can remove BinaryFormatter from clipboard/dragndrop.

Right now nothing of this happened, there is a discussion about false security, the whole thing looks premature and rushed. Obsoleting it now, splitting it into a separate package in .NET 5 and removing it entirely in say, .NET 6 or something (by dropping the required attributes/interfaces from Core classes), just will lead to soemeone else reimplementing it with the same "security problems". Thats what I tried to describe above.

Barry Dorrans · Answer 50 · Wed Oct 30 2019 03:35:11 GMT+0800 (China Standard Time)

Given that you can paste data to the clipboard as json with a couple of lines of code, and json is now built in, and re-hydrate to a known object type from json calling binary formatter the only built in way seems quite disingenuous.

Tobias Käs · Answer 51 · Wed Oct 30 2019 05:21:04 GMT+0800 (China Standard Time)

@blowdart either you are I are very mistaken, strings get serialized via BinaryFormatter and you can't change that because that would break compatibility with existing .NET applications (including Core 3.1 LTS if things are going the way they are, but compatibility with Desktop is my main concern at this point)

Note that I'm talking of custom formats, not the default TEXT format, as any professional application will put a user-readable string into the TEXT format, not some serialized garbage, because when the user pastes/drags into notepad or some textbox you want a readable representation. Data content goes into custom formats so receivers which understand the format can filter for it.

Its not impossible to work around it, but you need indepth knowledge of how WinForms/WPF implement clipboard/drag'n'drop. Streams get special treatment and serialize their binary content, so if you serialize to json text, then serialize the text into binary data MemoryStream, then put the stream object into the clipboard you get the treatment you want.

Note that this is entirely non-discoverable and not at all default behavior. Default behavior is using BinaryFormatter. People who are just putting their objects into a DataObject to implement copy/paste or drag'n'drop will not think a second thought about how to serialize in a custom format because it just works once you slap a [Serializable] attribute on it. I've seen that done all the time without questioning what this actually does, people just do it to get it working because thats what the exception says is the problem. Once you have published such an application any application (or future version) wanting to interact with it has to use the same format.

Developer experience:

you need a custom format to exchange complex data structures during copy/paste or drag'n'drop
the API says you can put any object into, so you put a strongly typed data structure into it
it works just fine within the same process because nothing ever gets serialized
when you try to use it cross-process you get an exception saying its not serializable, so you slap [Serializable] on it to fix the problem and support copy/paste or drag'n'drop between multiple instances of the application

at no point you are encouraged to use JSON or something else, the default for any practical purpose is BinaryFormatter

Patrick Westerhoff · Answer 52 · Wed Oct 30 2019 07:09:44 GMT+0800 (China Standard Time)

I think it is very premature to declare getting rid of it entirely eventually while still having no plan for a replacement.

This whole movement to getting rid of it is very much rushed.

As I said before, provide a replacement, give it time to get adopted, THEN you can remove BinaryFormatter from clipboard/dragndrop.

Right now nothing of this happened, there is a discussion about false security, the whole thing looks premature and rushed.

Comments like these makes me wonder if you actually read the very first post of this issue. This is a suggested plan for .NET 5 where BinaryFormatter does not ship by default but will require an additional and separate dependency to use the exact same BinaryFormatter. In addition, there is the idea of a new serializer which is similar to BinaryFormatter but safe and secure which also ships separately.

So of course, nothing happened yet. As such, there isn’t anything that is rushed at all. .NET 3.1 isn’t even here yet, so planning something like this for .NET 5 is actually very realistic. And if we do this for .NET 5, then there is really no problem with your existing applications that rely on this functionality at all since the *real BinaryFormatter will be still available; just as an opt-in functionality.

So yes, having a good and safe replacement eventually will certainly make things better but that replacement will be breaking anyway. So you will have to actively migrate to a different serializer at that point. Until then, you can continue to use BinaryFormatter if you want to accept the security implications.

C. Augusto Proiete · Answer 53 · Wed Oct 30 2019 07:57:23 GMT+0800 (China Standard Time)

@weltkante If I understand correctly, @GrabYourPitchforks's proposed plan is to move BinaryFormatter away from .NET Core. The code you linked to is from the Windows Forms codebase - a consumer of BinaryFormatter - that will be affected by this just like you and me on our desktop apps, and we'll all have to make a decision: Find/implement a better solution or accept the risk by using this separate package... The Windows Forms team included.

It doesn't seem like you appreciate how serious this problem is, though.

After watching five minutes of @blowdart's video linked above (starting at 25:53), I'd expect any developer to start looking for ways to completely remove BinaryFormatter from their codebase asap, even before Microsoft can offer a better solution - And when that comes, then great, maybe you can replace whatever solution you put in place before, with the new/better way that MS (or somebody else) comes up with...

Tobias Käs · Answer 54 · Wed Oct 30 2019 08:44:06 GMT+0800 (China Standard Time)

Ok you changed my mind, I won't bother arguing for compatibility any more. Not that it affects me and my argument, I still believe clipboard and drag'n'drop aren't a security boundary for your average UI application as the attacker must be in the same security context to access the clipboard or initiate/receive a drag'n'drop operation - but yeah, I see the problem and the risk of other people who do work on a security boundary getting it wrong. IMHO the main problem is that the security implications of using BinaryFormatter are not well known, so its probably better to take the tools away as soon as possible so people don't hurt themselves (or their users) by chosing the wrong tool (and I do mean that, not being ironic here). It's easier and has more effect than any other option.

We'll have to see if they can get it out of the default WPF and WinForms behavior in time for .NET 5.0, they'll have to provide an entirely alternate way of operating the clipboard and drag'n'drop, otherwise if you are an UI application you will automatically pull in BinaryFormatter, defeating the purpose of moving it into a separate package. Well, I guess they could pull it in without adding a reference so the API doesn't become visible, but it would still be present and implicitely used in any UI application.

Levi Broderick · Answer 55 · Wed Oct 30 2019 09:19:51 GMT+0800 (China Standard Time)

I appreciate the difficulties this might impose on WinForms applications. Check out my comments at https://github.com/dotnet/corefx/issues/38760#issuecomment-511216505 and https://github.com/dotnet/corefx/issues/38760#issuecomment-547183707 where I describe a possible way to address this. Those comments propose that though there would be no in-box support for using BinaryFormatter, there would still be extensibility hooks for hooking up custom deserializers (including the out-of-band BinaryFormatter). You could imagine that there might exist a shim / compatibility package which can be installed into a WinForms application if you require the existing behavior.

The developer flow would be that you upgrade the application and notice that the clipboard functionality doesn't work as expected for rich object graphs. (Primitives should still work just fine.) At this point you'd have a few options: use a different serialization technique if you're able, use the proposed "safe" serializer if you require payload compatibility with the existing format, or install the compatibility package if you're willing to accept the risks. Those options are very roughly ranked from safest (though taking the most work) to least safe (though taking the least work).

Jay Krell · Answer 56 · Wed Oct 30 2019 18:04:39 GMT+0800 (China Standard Time)

Can the format underlying RPC (DCE?) be used instead?

Levi Broderick · Answer 57 · Tue Nov 05 2019 10:23:17 GMT+0800 (China Standard Time)

Another proposal - if it can't be ripped out but applications want to protect themselves against improper usage of this type - would be to introduce a static API that would permanently disable the type for the lifetime of the application. For example:

public partial class BinaryFormatter
{
    // NEW API
    public static void Disable();
}

If this method is called, the BinaryFormatter type would be permanently disabled for the lifetime of the application. There would be no functionality to re-enable the type without restarting the application. Calling the BinaryFormatter APIs would throw a runtime exception.

The scenario for such a switch would be web applications who pull in DLLs which reference the BinaryFormatter type. In a nutshell, these applications don't intend on BinaryFormatter code paths being executed in the normal course of request processing, but due to external dependencies they can't remove those code paths entirely. So they'd be able to opt in to this switch to block those code paths from executing in the event a request is able to activate them unexpectedly.

Kevin Jones · Answer 58 · Tue Nov 05 2019 11:25:52 GMT+0800 (China Standard Time)

@GrabYourPitchforks how about an AppContext switch? That lets it be set via registry and environment variable so it's disabled as early as possible in the application.

EDIT: hmm. I don't think those are immutable.

Levi Broderick · Answer 59 · Tue Nov 05 2019 14:45:44 GMT+0800 (China Standard Time)

AppContext / environment variable / registry / etc. would also work. The value would be locked in the first time we read it.

Günther Foidl · Answer 60 · Tue Nov 05 2019 17:04:27 GMT+0800 (China Standard Time)

Or the other way round, to enable it explicitely.
Default is disabled, but a user has to think about it when he wants to enable it.
The runtime exception should point to a document describing why this behavior is so.

Tobias Käs · Answer 61 · Tue Nov 05 2019 17:58:31 GMT+0800 (China Standard Time)

@gfoidl then it'll just get enabled as a side effect of a random 3rd party library which thinks it needs it

Günther Foidl · Answer 62 · Tue Nov 05 2019 18:08:21 GMT+0800 (China Standard Time)

@weltkante why?

AppContext / environment variable / registry

I wouldn't use a 3rd party lib that manipulates these values.

How this is achieved depends on the implementation, but first we should find a consensus on the what.

Tobias Käs · Answer 63 · Tue Nov 05 2019 18:18:31 GMT+0800 (China Standard Time)

@weltkante why?

because thats what people do when they have a feature they need to get working.

I wouldn't use a 3rd party lib that manipulates these values.

You don't always notice immediately (or ever) what horrible workarounds or hacks they have in their codebase

I'm just saying an opt-in can get triggered by 3rd party code without you ever making the decision. It has always been the case in the past with process wide Windows API opt-ins, which are usually reserved for the application (exe) to make the decision but I have seen enough libraries which apply process wide settings anyways (and you usually don't notice until you run into problems because of it)

Eirik Tsarpalis · Answer 64 · Wed Feb 05 2020 07:35:26 GMT+0800 (China Standard Time)

I generally agree with the proposal to move BF out of shared framework. That being said, there's a counterpoint to be made about applications where remote code execution is a desired feature. One example is closure serialization, which is something used to drive distributed compute in frameworks like spark in the case of the JVM.

To my knowledge, BF is the only serializer in the shared framework that is capable of serializing closures.

Jose Perez Rodriguez · Answer 65 · Wed Jul 08 2020 02:30:48 GMT+0800 (China Standard Time)

@GrabYourPitchforks is this one still planned for 5.0 or should we adjust the milestone?

Stephen Toub · Answer 66 · Wed Jul 08 2020 02:39:32 GMT+0800 (China Standard Time)

I've moved it to future.

Levi Broderick · Answer 67 · Wed Jul 08 2020 05:07:39 GMT+0800 (China Standard Time)

Thanks for moving it. I'm finishing up the new plan and should have something to publish in the next few weeks. That should supersede this issue, at which point this can be closed.

Rachel Crowther · Answer 68 · Fri Jul 10 2020 22:44:36 GMT+0800 (China Standard Time)

sigh Having moved to use Immutable* in a few places, once again I've been hit by this same issue in .Net Core 3.1 vs. Framework - and I thought I'd managed to get away from it when I wrote my own Random ("because we won't add [Serializable] to it" and stopped using delegates. Nope.

Please can I request a more useful response than "Don't use BinaryFormatter"? What tool do I use to serialize/deserialize object graphs of classes where:

I have arbitrary graphs,
speed is important,
versioning is irrelevant,
and a third party cannot arbitrarily prevent me from serializing objects of their classes because they don't like my use case?

Levi Broderick · Answer 69 · Sat Jul 11 2020 02:26:48 GMT+0800 (China Standard Time)

and a third party cannot arbitrarily prevent me from serializing objects of their classes because they don't like my use case?

Can you clarify this? I don't quite follow. Your object graph contains instances of some arbitrary type Namespace.Foo, the type author didn't intend for that type to be serializable and didn't do the work to make it serializable, and you're looking for a tech that can correctly serialize these instances? I don't quite see how this scenario can ever work in any reliable fashion without support from the type author.

Rachel Crowther · Answer 70 · Sat Jul 11 2020 03:21:55 GMT+0800 (China Standard Time)

Can you clarify this? I don't quite follow.

Let's see!

Your object graph contains instances of some arbitrary type Namespace.Foo,

Yes.

the type author didn't intend for that type to be serializable

Don't care. Shouldn't be the type author's concern unless they're trying to ensure something special like serialization being compatible between versions, implementations, or architectures.

and didn't do the work to make it serializable,

But also didn't necessarily do any work to make it not serializable. System.Random is a good example of that; for a given version on a given architecture, it serializes just fine.

and you're looking for a tech that can correctly serialize these instances?

No. I'm looking for a tech that allows me to test - for a particular version of all code and libraries, accepting that new versions of any code may change my results - whether a particular type happens to be serializable to within my acceptable limits, and to use those test results if I find out that the type happens to work. I would strongly prefer it if the tech had warnings plastered all over it that that's all it does, that it's insecure, and that you should keep your eyes peeled when using it. Caveat user.

I don't quite see how this scenario can ever work in any reliable fashion without support from the type author.

Same as any other black / white box is made reliable in use: by eyeballing the code, testing, and by deducing that you can't do it if the tests don't pass. If I can't get reliability that's guaranteed by the author - which is quite a high burden for the author - then I want to be able to examine the type and judge reliability for myself, in my own use case.

What I don't want is for someone who doesn't know my use case to be able to prevent me from examining and testing their type to determine whether it's sufficiently usable for my use case.

Does that help?

Tobias Käs · Answer 71 · Sat Jul 11 2020 03:33:12 GMT+0800 (China Standard Time)

Your problem isn't solvable in general purpose but maybe thats not your goal

Things like interop handles and pointers cannot be serialized without cooperation of the type author. e.g. trying to forcefully serialize classes from a UI framework will not be doable without the frameworks cooperation since there is a lot of interop with the underlying platform.

Even outside of UI framework, even simple things like weak references that are tied with the runtime will not be serializable without cooperation.

Also any class can have numeric indices into static caches/arrays, so unless you serialize the whole program state of static variables those indices will be wrong upon deserialization as well.

Levi Broderick · Answer 72 · Sat Jul 11 2020 03:34:49 GMT+0800 (China Standard Time)

Sounds like you want a "damn the torpedoes" style serializer. Something that recursively iterates over all the fields in a type and bitblts them into a byte buffer somewhere, with no regard for any deserialization-related attributes or API conventions. That's an interesting use case I've never heard of before. I'm not aware of any in-box serializer that follows this logic. But in theory it should be straightforward enough to write one, assuming the serializer isn't intending to special-case pointers / handles / reflection types / etc, and just relies on the caller not passing it object graphs which contain those fields?

Rachel Crowther · Answer 73 · Sat Jul 11 2020 04:10:52 GMT+0800 (China Standard Time)

@weltkante - agree entirely that this isn't solvable in general, and you're also absolutely correct that that isn't my goal. I agree with all of your points about ways of stopping this from working, and I've no doubt there are many more. I've got considerable experience at VM and image level with Smalltalk, so the whole special classes / weak references / oddnesses thing isn't new to me. Many Smalltalks can save image segments, which are effectively portions of the Smalltalk object graph; support for reloading those and, in particular, re-linking them with what's outside the segment is distinctly variable.

@GrabYourPitchforks - Almost. It's definitely a "Doctor, it hurts when I pass the serializer something that breaks it / Well, don't do that!" serializer. I want something that respects serialization information provided by type authors when it is present, but does not conflate a lack of explicit serialization provision ("I haven't said you can...") with an explicit lack of serialization provision ("... therefore you can't"). That's more complex than a straight blitter. It's actually very close to BinaryFormatter as it currently stands in Framework; except for the combination of BinaryFormatter assuming it's locked out of a type unless it sees SerializableAttribute, and the base library authors only having SerializableAttribute to mark information for multiple different use cases and being very conservative with the types they choose to mark.

I wouldn't even mind having to define explicitly when instantiating the serializer whether it should serialize types that don't provide any guarantees on architecture or version. It's perfectly reasonable to provide barriers to this behaviour such that folks who wish to damn the torpedos have to opt into said damning, and make it very explicit that you're opting in to behaviour that is not guaranteed by the type authors.

Eirik Tsarpalis · Answer 74 · Sat Jul 11 2020 04:25:12 GMT+0800 (China Standard Time)

@Ozzard I can attest that nuget contains many serialisers that have the capabilities you require, though like you said, caveat user. So I guess the question is, what can we do from the perspective of the framework to better support such libraries?

Levi Broderick · Answer 75 · Sat Jul 11 2020 04:53:05 GMT+0800 (China Standard Time)

folks who wish to damn the torpedos have to opt into said damning, and make it very explicit that you're opting in to behaviour that is not guaranteed by the type authors.

This is already achievable to a large extent through opt-in via a surrogate selector. It allows you to take full control of the serialization pipeline for arbitrary types, including types that aren't otherwise serializable. But as you mentioned, caveat user.

using System;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Person p = new Person("John", "Doe");

            var ms = new MemoryStream();
            var bf = new BinaryFormatter();
            // bf.Serialize(ms, p); // this line would throw

            // !! WARNING !!
            // This is not a good idea, and it's not officially supported. Attempting to BinaryFormatter serialize
            // an instance of a type that is not marked as serializable can lead to all sorts of problems, and
            // even a seemingly innocuous servicing release could bust your application. But if you wish to
            // serialize them anyway, this sample shows how to do so.
            SurrogateSelector selector = new SurrogateSelector();
            selector.AddSurrogate(typeof(Person), new StreamingContext(StreamingContextStates.All), new SimpleReflectionSurrogate());
            bf.SurrogateSelector = selector;

            bf.Serialize(ms, p);
            ms.Position = 0; // rewind stream
            var deserializedObj = (Person)bf.Deserialize(ms);

            Console.WriteLine("Deserialized: " + deserializedObj);
        }
    }

    // n.b. Not marked [Serializable]!
    class Person
    {
        // parameterful ctor
        public Person(string firstName, string lastName)
        {
            FirstName = firstName;
            LastName = lastName;
        }

        public string FirstName { get; }
        public string LastName { get; }

        public override string ToString() => $"{LastName}, {FirstName}";
    }

    class SimpleReflectionSurrogate : ISerializationSurrogate
    {
        public void GetObjectData(object obj, SerializationInfo info, StreamingContext context)
        {
            // walk type hierarchy if necessary
            var fieldInfos = obj.GetType().GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
            foreach (var fieldInfo in fieldInfos)
            {
                info.AddValue(fieldInfo.Name, fieldInfo.GetValue(obj));
            }
        }

        public object SetObjectData(object obj, SerializationInfo info, StreamingContext context, ISurrogateSelector selector)
        {
            // walk type hierarchy if necessary
            var fieldInfos = obj.GetType().GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
            foreach (var fieldInfo in fieldInfos)
            {
                fieldInfo.SetValue(obj, info.GetValue(fieldInfo.Name, fieldInfo.FieldType));
            }
            return obj;
        }
    }
}

Marc Gravell · Answer 76 · Sat Jul 11 2020 05:02:26 GMT+0800 (China Standard Time)

As someone who as spent much of their career looking at serialization, I'm horrified by trivialising it; the type author should absolutely have a say in how it is serialized. If you want to serialize someone else's hostile type: stop - create your own type, and serialize that instead! As for your wishlist: I could offer you everything there except "graphs" - I'm actively trying to kill the reference-tracking support I already have in my serialization lib.

…

On Fri, 10 Jul 2020, 21:53 Levi Broderick, ***@***.***> wrote: folks who wish to damn the torpedos have to opt into said damning, and make it very explicit that you're opting in to behaviour that is not guaranteed by the type authors. This is already achievable to a large extent through opt-in via a surrogate selector. It allows you to take full control of the serialization pipeline for arbitrary types, including types that aren't otherwise serializable. But as you mentioned, *caveat user*. using System;using System.IO;using System.Reflection;using System.Runtime.Serialization;using System.Runtime.Serialization.Formatters.Binary; namespace Demo { class Program { static void Main(string[] args) { Person p = new Person("John", "Doe"); var ms = new MemoryStream(); var bf = new BinaryFormatter(); // bf.Serialize(ms, p); // this line would throw SurrogateSelector selector = new SurrogateSelector(); selector.AddSurrogate(typeof(Person), new StreamingContext(StreamingContextStates.All), new SimpleReflectionSurrogate()); bf.SurrogateSelector = selector; bf.Serialize(ms, p); ms.Position = 0; // rewind stream var deserializedObj = (Person)bf.Deserialize(ms); Console.WriteLine("Deserialized: " + deserializedObj); } } // n.b. Not marked [Serializable]! class Person { // parameterful ctor public Person(string firstName, string lastName) { FirstName = firstName; LastName = lastName; } public string FirstName { get; } public string LastName { get; } public override string ToString() => $"{LastName}, {FirstName}"; } class SimpleReflectionSurrogate : ISerializationSurrogate { public void GetObjectData(object obj, SerializationInfo info, StreamingContext context) { // walk type hierarchy if necessary var fieldInfos = obj.GetType().GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance); foreach (var fieldInfo in fieldInfos) { info.AddValue(fieldInfo.Name, fieldInfo.GetValue(obj)); } } public object SetObjectData(object obj, SerializationInfo info, StreamingContext context, ISurrogateSelector selector) { // walk type hierarchy if necessary var fieldInfos = obj.GetType().GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance); foreach (var fieldInfo in fieldInfos) { fieldInfo.SetValue(obj, info.GetValue(fieldInfo.Name, fieldInfo.FieldType)); } return obj; } } } — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#29976 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAEHMGY5COHZKLWMWTXB2DR255T7ANCNFSM4KQAONBQ> .

Rachel Crowther · Answer 77 · Mon Jul 13 2020 18:46:17 GMT+0800 (China Standard Time)

@eiriktsarpalis - well, you asked :-):

Within the framework, for as many types as possible, provide a standard, known-long-term-supported way to round-trip their state, including support that allows serializers to serialize and reconstruct directed (potentially cyclic) graphs of objects.
In particular, this support should include all the common types that people use for data structures, including Immutable*.
Ideally include support for other stateful classes, such as System.Random, where the state is purely managed data.
Be clear within the framework where serialization is guaranteed to be portable between framework versions or processor architectures, so that users can accept the limitation otherwise that deserialization will fail without identical versions and architectures.
Ideally provide a serializer to make use of these capabilities, though indicating suitable third-party serializers might also work (in the same way that Newtonsoft.Json is sometimes indicated).

@GrabYourPitchforks - many thanks, Levi! I'm very happy with the warnings in the code. I expect to trip over a significant number of gotchas in use that will need ironing out; but at least it's possible to iron them out now.

@mgravell - I know the feeling. I started looking at general object serialization for C in the late '80s using (of all things) Sun's XDR libraries, as part of a project to ship object graphs between processes across TCP, and have done quite a bit of work on it since, though by no means most of my career!

I don't like the idea of taking over serialization of types without the co-operation of the type author, but consider my options here:

I apparently cannot rely on the .Net Core team supporting serialization of any type in the future; there is repeated talk of these approaches being deprecated.
The scope of the supported .Net Core types for serialization is already reduced from .Net Framework (which has already caused me problems), so there's evidence that this preference for reducing scope is being acted on.
I'm a one-person dev house. It may well be cheaper for me to change platforms to something that supports serialization than to write and maintain my own copies of things like Dictionary and ensure they remain performant.
These are not "hostile" types in the sense that someone can inject arbitrary types into the object graph or bytes into the stream for deserialization; they are known types, of known versions, whose structure and behaviour I can understand. There are clearly situations where that wouldn't be the case. I wouldn't use this approach in those situations.

As with almost all technology development, my job is to maximise the total benefit to my client of a solution I create versus the effort involved in its creation. "Benefit" is quite broadly drawn here, including things like reliability, security, and performance as well as pure functionality; and note that different projects have different trade-offs there. Given the options of a) not doing it, b) duplicating and maintaining my own versions of many already-implemented types, or c) bypassing safeguards on a few types in a known way and analysing and testing the heck out of it, my rational option is C. I'd prefer d) already having serialization support in the types I use, because that would improve my ability to get benefit for my clients out of .Net Core, but it would appear that that the .Net Core team won't give me that as an option; so let's go with what's possible.

Levi Broderick · Answer 78 · Wed Jul 15 2020 00:38:52 GMT+0800 (China Standard Time)

This plan is now out-of-date and I am closing the issue. Thanks all for the feedback!

Dan Moseley · Answer 79 · Wed Jul 15 2020 00:50:41 GMT+0800 (China Standard Time)

Linking dotnet/designs#141

dje-dev · Answer 80 · Wed Jul 15 2020 02:42:17 GMT+0800 (China Standard Time)

I have used .NET professionally for more than 15 years and my company has built a multi-million-line codebase which supports our business operations. Binary serialization is a key tool we leverage extensively internally. There is no security concern because (among other reasons) all the development is internal within a small team. The ability to serialize arbitrarily complex graphs with high performance allows us to store and then retrieve intermediate results of our complex models, facilitating validation and incremental work. Please do not remove, this would be very negatively impactful to remove an existing key functionality that we have built upon for 15 years. Feel free to reach out to me off-line for more background.

Dan Moseley · Answer 81 · Wed Jul 15 2020 03:42:23 GMT+0800 (China Standard Time)

@dje-dev the design linked above might be a place to share input.

Rachel Crowther · Answer 82 · Wed Jul 15 2020 04:13:42 GMT+0800 (China Standard Time)

@danmosemsft - given that the team appears to have ignored all other input in this area that demonstrates that people are using this functionality in real products, is there any chance of @dje-dev being listened to or are they wasting their effort?

Dan Moseley · Answer 83 · Wed Jul 15 2020 04:51:56 GMT+0800 (China Standard Time)

@Ozzard I think it is very likely that we will implement the linked proposal in some form, because of the long and continuing history of highly impactful security issues generated by BinaryFormatter use - even though we appreciate there are contexts where all data is trusted. This one is only the most recent.

Feedback on the process, timeframe and approach are certainly welcome. I know that we need to provide more guidance on other serialization options and @GrabYourPitchforks is working on that.

Dan Moseley · Answer 84 · Wed Jul 15 2020 04:53:17 GMT+0800 (China Standard Time)

It's also worth noting that the proposal relates to .NET Core 5+ (really, 6+). .NET Framework applications are not affected and .NET Framework will be supported a very long time.

https://github.com/dotnet/designs/pull/141/files#diff-9324781e4d7a839a358bc384f65a8327R7

Francesco Abbruzzese · Answer 85 · Mon Aug 24 2020 16:01:38 GMT+0800 (China Standard Time)

@danmosemsft ,

I think there is a confusion between serialization for communication and for saving/restoring the state of an object graph.

The security problem is caused by BinaryFormatter supporting Polymorphism, which is a necessary requirement for saving/restoring the state of object graphs.

While it is right banning polymorphic serializers from communication protocols, since both parts must agree on the objects to exchange, by banning them in general, we de facto prevent saving/restoring an object graph that represents the state of an ongoing computation, which is unacceptable!

Applications, of saving/restoring the state of an on-going computation, include but are not limited to:

Saving the state of a complex application (video-games, design applications, and so on).
Virtualizing stateful microservices instances, by removing them from RAM when they are not used.

While I can accept the removal of the BinaryFormatter, I can't accept the general principle of banning polymorphic serializers. While reading, the long term plan for BinaryFormatter removal it appears that the whole infrastructure will be removed including attributes, interfaces,....to prevent anyone else from providing an alternative.

Is possible to ask, at least, that the attribute infrastructure is not removed? This way people that are using BinaryFormatter for saving/restore can provide a custom alternative WITHOUT losing tonnes of code decorated with those attributes?

Tobias Käs · Answer 86 · Mon Aug 24 2020 19:36:13 GMT+0800 (China Standard Time)

Is possible to ask, at least, that the attribute infrastructure is not removed? This way people that are using BinaryFormatter for saving/restore can provide a custom alternative WITHOUT losing tonnes of code decorated with those attributes?

If its not removed but not supported either you have high risk that it erodes over time. Newly added classes won't be annotated correctly. Previously annotated classes will be changed in incompatible ways. If you keep the annotations you need to keep testing and supporting them for their purpose.

(Just stating this as a fact, not as an argument for or against removing things.)

Francesco Abbruzzese · Answer 87 · Mon Aug 24 2020 21:34:29 GMT+0800 (China Standard Time)

@weltkante,
Also possible to remove BinaryFormatter but still to keep support for the infrastructure, that is, for custom implementations of IFormatter.