Alternative triple-keying

Question

annevk opened this issue 5 years ago · comments

Did you consider always using the second-level frame as the third key? https://bugzilla.mozilla.org/show_bug.cgi?id=1558932#c3 outlines rationale for doing it that way.

Josh Karlin · Answer 1 · Sat Jan 25 2020 04:01:47 GMT+0800 (China Standard Time)

I don't understand the bug. How does A get to attack B and C's caches? A only gets to attack AA, not AB or AC as they're different partitions.

Anne van Kesteren · Answer 2 · Thu Jan 30 2020 15:06:42 GMT+0800 (China Standard Time)

A can create another frame and load B in there (which will use A+B) and time navigations.

Josh Karlin · Answer 3 · Thu Jan 30 2020 20:35:28 GMT+0800 (China Standard Time)

Sorry, I'm still not following. Can you describe the attack you're considering in more detail?

Anne van Kesteren · Answer 4 · Thu Jan 30 2020 21:00:58 GMT+0800 (China Standard Time)

https://a.example.com/ embeds https://b.example.com/ where the user logs in. https://b.example.com/ loads some specific images based on the user's logged in state. https://a.example.com/ creates a second frame where it times these images.

Josh Karlin · Answer 5 · Thu Jan 30 2020 21:18:38 GMT+0800 (China Standard Time)

I'm assuming we're partitioning by origin for the sake of this discussion.

An image that B loads will be keyed as (a.example.com, b.example.com, b.example.com/img.png).

If A creates a same-origin frame and loads b.example.com/img.png, it will be keyed as (a.example.com, a.example.com, b.example.com/img.png) and will result in a cache miss.

If A creates another B frame, and the B frame loads the resource again, it will be a cache hit. But that's not something that A can observe.
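
The keying walked through in this comment can be sketched with a toy cache keyed on a tuple. This is only an illustration under the comment's own assumption (partitioning by origin, with the requesting frame as the second key); `PartitionedCache` and `fetch` are hypothetical names, not any browser's API.

```python
from typing import Dict, Tuple

# Hypothetical key: (top-frame origin, requesting-frame origin, resource URL).
CacheKey = Tuple[str, str, str]


class PartitionedCache:
    """Toy triple-keyed cache; a real HTTP cache is far more involved."""

    def __init__(self) -> None:
        self._entries: Dict[CacheKey, bytes] = {}

    def fetch(self, top: str, frame: str, url: str) -> str:
        """Return 'hit' if the keyed entry exists; otherwise store it and return 'miss'."""
        key = (top, frame, url)
        if key in self._entries:
            return "hit"
        self._entries[key] = b"<response body>"
        return "miss"


cache = PartitionedCache()
A = "https://a.example.com"
B = "https://b.example.com"
IMG = "https://b.example.com/img.png"

first_load_by_b = cache.fetch(A, B, IMG)  # B's frame loads its own image
probe_from_a = cache.fetch(A, A, IMG)     # A requests the same URL from an A frame
second_b_frame = cache.fetch(A, B, IMG)   # another B frame under A repeats the load
```

Here the probe from A lands in a different partition and misses, while the second B frame hits, which matches the two cases described above.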

Anne van Kesteren · Answer 6 · Thu Jan 30 2020 21:24:12 GMT+0800 (China Standard Time)

I'm talking about A navigating the frame to the image. I guess that could still use A's cache, but not sure if that is what happens in practice (and there might well be issues with embed/object there).

Josh Karlin · Answer 7 · Thu Jan 30 2020 22:36:22 GMT+0800 (China Standard Time)

Ah, I see. Yes, good point. IMHO onload/onerror should not be allowed for x-origin navigations by default.

The alternative proposed in the bug doesn't mitigate the problem, it just changes the circumstances in which it can happen.

Frankly, this is a good argument for including the entire ancestor chain in the key.

Josh Karlin · Answer 8 · Fri Jan 31 2020 00:51:07 GMT+0800 (China Standard Time)

Hmm, even if we had the entire ancestor chain in the key, a frame could still spy on its child frame by loading resources that it wants to inspect in new frames. So frankly, I think the problem isn't the key here but the fact that we allow x-origin onload/onerror.

Anne van Kesteren · Answer 9 · Sat Feb 01 2020 22:31:11 GMT+0800 (China Standard Time)

It could attack children, but at least not siblings or parents. (There's only load for frames and that leaks in a number of ways so I don't think we can do away with that.)

Josh Karlin · Answer 10 · Mon Feb 03 2020 22:44:02 GMT+0800 (China Standard Time)

I want to take a closer look at deprecating load for x-origin frames and adding a random delay before calling it. Maybe years down the road we could phase it out entirely.

"(There's only load for frames and that leaks in a number of ways so I don't think we can do away with that.)"

What other leaks are you aware of?

Anne van Kesteren · Answer 11 · Tue Feb 04 2020 17:14:18 GMT+0800 (China Standard Time)

Enumerating frame children in various ways, observing identity changes to the cross-origin Location object (this might be fixable by keeping a single proxy of sorts). Also, I think they currently delay the load event of the parent, so you'd have to change that which might break a lot of scripts in the wild.

Josh Karlin · Answer 12 · Tue Feb 04 2020 22:13:54 GMT+0800 (China Standard Time)

Thanks!

My take is that we want to isolate all origins in the page, which is what we get by appending the requesting frame to the key.

Chrome intends to use the same partitioning logic for its entire network stack (e.g., dns, socket pools, session identifiers, etc.). I don't want to compromise the third key for the cache's sake to the detriment of the rest of the stack. Especially since the compromise is due to x-origin leaks that really should be fixed anyway.

Anne van Kesteren · Answer 13 · Thu Feb 20 2020 21:00:37 GMT+0800 (China Standard Time)

Whenever I talk about cache I don't mean the HTTP cache exclusively, to be clear. Also, I don't think you really addressed #2 (comment) thus far.

Because of that we're considering using the entire chain as cache key.

Matt Menke · Answer 14 · Wed Jul 22 2020 01:14:45 GMT+0800 (China Standard Time)

I don't think using top frame and highest child frame helps much. It means second-level frames can attack third-level and below, and vice versa, no?

Using the entire chain doesn't seem to help protect child frames from their parents checking if they've loaded a resource. If a.example iframes b.example, it can just iframe b.example/subresource to see if b.example has loaded that subresource, right? It does provide protections the other way around, though.

Anne van Kesteren · Answer 15 · Wed Jul 22 2020 01:30:09 GMT+0800 (China Standard Time)

Perhaps it would help if you explained the attacks you are interested in addressing. It's hard to evaluate what helps otherwise.

Matt Menke · Answer 16 · Wed Jul 22 2020 01:33:46 GMT+0800 (China Standard Time)

I just did? It's the exact same attack you describe in comment #3. Use an iframe and navigation timing to figure out if a resource is in the cache - I assume you were thinking subresources loaded by another iframe.

Matt Menke · Answer 17 · Wed Jul 22 2020 01:36:20 GMT+0800 (China Standard Time)

And as I said - "If a.example iframes b.example, it can just iframe b.example/subresource to see if b.example has loaded that subresource, right?" Load timing would presumably look different if it were in the cache vs on disk.

Matt Menke · Answer 18 · Wed Jul 22 2020 01:46:18 GMT+0800 (China Standard Time)

More generally, any frame can poke at subresources its direct descendants load, since it can create a frame using the same key at will, as long as top-level resource path is not part of the key, and keys are not unique per-frame. (edit: Oops, by "top-level" I meant frame's src location)

The choices of key being discussed only affect other things - whether/when iframes can poke at the cache of their siblings, parents, uncles, etc. They can always attack their direct child frames, though not necessarily other descendants.
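
A minimal sketch of why the parent can always reproduce its child's key, assuming a frame navigation is keyed by the origin being navigated to (an assumption for illustration; the helper names are hypothetical and nothing here models real timing):

```python
from typing import Set, Tuple
from urllib.parse import urlsplit

Key = Tuple[str, str, str]


def origin_of(url: str) -> str:
    """Scheme plus host(:port) of a URL, e.g. 'https://b.example.com'."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def subresource_key(top: str, document: str, url: str) -> Key:
    """Key for a subresource requested by a document inside `top`."""
    return (top, document, url)


def frame_navigation_key(top: str, url: str) -> Key:
    # Assumption: the navigated frame's origin becomes the second key.
    return (top, origin_of(url), url)


A = "https://a.example.com"
B = "https://b.example.com"
IMG = "https://b.example.com/img.png"

# B, framed by A, has loaded the image as a subresource.
cached: Set[Key] = {subresource_key(A, B, IMG)}

# A now points a fresh iframe straight at the image URL.
probe = frame_navigation_key(A, IMG)
probe_hits_cache = probe in cached
```

Under these assumptions the probe key equals the key B's subresource was stored under, so the navigation is served from the same entry and A only needs a timing signal such as the frame's load event.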

Anne van Kesteren · Answer 19 · Wed Jul 22 2020 18:13:07 GMT+0800 (China Standard Time)

I meant clarifying the benefits of top + current (which I believe you are proposing) over top + highest.

Matt Menke · Answer 20 · Wed Jul 22 2020 21:20:21 GMT+0800 (China Standard Time)

Ah, sorry for not being clear. I wasn't actually advocating any particular solution, just pointing out what attacks are still possible. A parent can always iframe-probe its children, and with the alternative proposed here (top-level + highest-level iframe), children of iframes can still iframe-probe their siblings and their parents. The only things it gets us are that the top-level frame cannot be iframe-probed by its descendants, and the top-level frame's iframes cannot probe each other (and their descendants can't probe the top-level frame's other iframes, though they can probe the one they appear in). I'm not a fan of making the top two levels of frames magic, unless sites can detect whether or not they're a top-level iframe - and even then, sites would only get protection if they actually detected that they're in a lower-level iframe, and guarded against that case, and they'd have to be very careful about who they iframe when they're in an iframe themselves. That having been said, I'm not advocating for a specific alternative.

The thing that top-level frame site+innermost iframe site gets us is that iframes can't probe cross-site iframes directly (with XHRs, image tags, etc) - instead, they can only use iframe timing. I'm not sufficiently familiar with CORS protections to know how much that gets us. But it does indeed provide no protection at all against iframe+timing-style attacks.
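
The three keying choices compared in this thread can be put side by side with a small sketch. A frame is represented by the list of sites from the top-level frame down to itself; the function names are hypothetical, and treating the top-level frame as its own second key under "top + highest" is an assumption.

```python
from typing import Sequence, Tuple


def top_plus_current(path: Sequence[str]) -> Tuple[str, ...]:
    """Top-level frame plus the requesting (innermost) frame."""
    return (path[0], path[-1])


def top_plus_second_level(path: Sequence[str]) -> Tuple[str, ...]:
    """Top-level frame plus its direct child on the path (the alternative in this issue)."""
    return (path[0], path[1] if len(path) > 1 else path[0])


def full_chain(path: Sequence[str]) -> Tuple[str, ...]:
    """Every ancestor on the path, in order."""
    return tuple(path)


b_under_a = ["a.example", "b.example"]               # A frames B
c_under_b = ["a.example", "b.example", "c.example"]  # that B frames C
b_under_c = ["a.example", "c.example", "b.example"]  # A frames C, which frames B

# Top + second-level: B and its child C share one partition.
shared_second_level = top_plus_second_level(b_under_a) == top_plus_second_level(c_under_b)

# Top + current: the two B frames share one partition despite different parents.
shared_current = top_plus_current(b_under_a) == top_plus_current(b_under_c)

# Full chain: all three frames get distinct partitions.
distinct_chain = len({full_chain(p) for p in (b_under_a, c_under_b, b_under_c)}) == 3
```

None of the three schemes prevents a parent from creating a new child frame with the same path as an existing one, which is the iframe-probe case described above.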

Matt Menke · Answer 21 · Thu Jul 23 2020 03:09:12 GMT+0800 (China Standard Time)

Here's an idea: Use top-frame + innermost iframe as the key, but for iframe root resources themselves, use a special frame-only key (can just add a bool to the key or something, to make it unique). Iframes could still be used to probe redirects of an iframe's main resource, but they couldn't be used to probe subresources. Could alternatively use the parent frame's key for iframes, but that seems like a bad idea to me, even if I can't find a particular reason to avoid that approach.

Link rel headers received from the iframe root resource would need to be sure to use the key used by subresources.

The big downside is that this requires yet more round trips to open an iframe, if we isolate the socket pools as well. There are mitigations against this, though. Could either preconnect using the subframe key, too, or only respect the additional bool at the cache layer, and not the socket layer, though the latter still allows other frames to probe for live connections to origins that might have been used to request subresources.

Either suggestion here does still allow all frames to probe for cross-site top-level iframe URLs loaded by any other frame - the only real advantage it offers is protecting subresources.

Anyhow, just tossing out an option to consider and knock down as not viable.

Matt Menke · Answer 22 · Tue Nov 03 2020 01:56:45 GMT+0800 (China Standard Time)

@annevk: Is there a more public place to discuss the choice of key? We've decided to go with separating out frame responses from the disk cache, specifically to address the cache-timing issue, though everything else (including sockets) is reused between frame requests and subresource requests.

We're also starting to think we might want to use the entire ordered path to the root frame as the key (merging SameSite children with their parents), which would allow us to describe SameSite:Lax cookies in terms of the Network Partition Key, and give us a consistent (if a bit complicated) story with respect to how partitioned cookies work (both SameSite:Lax and Strict), since it sounds like that's the direction we're going in. We're planning to start the investigation by measuring frame depths, to get a rough idea of how expensive the extra keys will be - probably also a good idea to try to figure out how many more of them there will be.

This would also let FirstPartySets have a single story for how it works with cookies and everything else.
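
One possible reading of "the entire ordered path to the root frame, merging SameSite children with their parents" is to collapse adjacent same-site entries on the path. The sketch below assumes that reading; `chain_key` is a hypothetical helper and the sites are supplied as plain strings rather than derived from URLs.

```python
from typing import List, Sequence, Tuple


def chain_key(sites_from_root: Sequence[str]) -> Tuple[str, ...]:
    """Ordered ancestor path with runs of the same site collapsed to one entry."""
    key: List[str] = []
    for site in sites_from_root:
        # Assumption: a child on the same site as its parent adds nothing to the key.
        if not key or key[-1] != site:
            key.append(site)
    return tuple(key)


path = [
    "https://a.example",  # top-level frame
    "https://a.example",  # same-site child, merged with its parent
    "https://b.example",
    "https://b.example",  # same-site child, merged with its parent
    "https://a.example",  # A again, but below B, so it stays a separate entry
]
key = chain_key(path)
```

The key length grows with the number of site changes along the path rather than with raw frame depth, which is one reason measuring frame depths first gives only an upper bound on the cost.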

Anne van Kesteren · Answer 23 · Tue Nov 03 2020 16:36:11 GMT+0800 (China Standard Time)

I think https://github.com/privacycg/storage-partitioning/issues would be the best place in terms of getting input from all the relevant stakeholders. Fetch is probably still the place where we want to standardize it.

I suspect a write-up would help since there are probably a fair number of people coming into it fresh.

Note that I'm not really convinced the current story for cookies is actually sound: httpwg/http-extensions#1288.

Matt Menke · Answer 24 · Wed Nov 04 2020 01:34:41 GMT+0800 (China Standard Time)

I think a write-up of problems and options is a really good idea. I'll put one together when I have time, and file a storage partitioning issue with the privacy CG with a link.

Matt Menke · Answer 25 · Sat Nov 21 2020 00:54:22 GMT+0800 (China Standard Time)

@annevk: Sorry to go back on what I said, but looks like I'm not going to have time to invest in this, as my team's focus is sites tracking users across first party contexts, while any keys beyond top frame site are strictly security/privacy issues related to sites attacking each other within a single first party context.

Anne van Kesteren · Answer 26 · Mon Nov 30 2020 21:57:03 GMT+0800 (China Standard Time)

Thanks for the update! Does that mean Chrome will only start using a single key for now? Should we update Fetch?

Matt Menke · Answer 27 · Mon Nov 30 2020 22:03:32 GMT+0800 (China Standard Time)

I think we'll probably just stick with the current scheme for now, since it does provide some protections from snooping. I'd really like to see a consensus reached before we start applying First Party Sets logic to Network Partition Keys, assuming FPS makes it that far, so may push for allocating resources on this if we reach that point.

Josh Karlin · Answer 28 · Mon Nov 30 2020 22:03:46 GMT+0800 (China Standard Time)

Note that Chrome is still in the midst of launching its cache partitioning. But the key that we're using is <top-frame-origin, document-origin, is-iframe-navigation, url>.

The new part is the is-iframe-navigation boolean, which isolates iframe navigation resources from subresources. It's still possible to snoop on the documents loaded by other frames via the load event, but at least this prevents one from snooping on subresources.

Edit: Note that we're only planning on using this extra boolean in the key in the cache, not the rest of the network stack.

Shivani Sharma · Answer 29 · Mon Nov 30 2020 22:07:23 GMT+0800 (China Standard Time)

"Note that Chrome is still in the midst of launching its cache partitioning. But the key that we're using is <top-frame-origin, document-origin, is-iframe-navigation, url>."

<top-frame-site, document-site, is-iframe-navigation, url> and site is scheme://etld+1

Josh Karlin · Answer 30 · Mon Nov 30 2020 22:09:26 GMT+0800 (China Standard Time)

Ah, thank you Shivani! Yes, schemeful site, not origin.
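
A minimal sketch of the key as corrected above, with the boolean applied only at the cache layer. The class names are hypothetical, and the sites are written out by hand because computing scheme://eTLD+1 needs the Public Suffix List, which this sketch does not model.

```python
from typing import NamedTuple


class PartitionKey(NamedTuple):
    """Hypothetical key shared by the rest of the network stack (no boolean)."""
    top_frame_site: str
    document_site: str


class HttpCacheKey(NamedTuple):
    """Hypothetical HTTP cache key: <top-frame-site, document-site, is-iframe-navigation, url>."""
    top_frame_site: str
    document_site: str
    is_iframe_navigation: bool
    url: str


A_SITE = "https://a.example"
B_SITE = "https://b.example"
IMG = "https://sub.b.example/img.png"

# B's document, framed by A, loads the image as a subresource.
stored = HttpCacheKey(A_SITE, B_SITE, False, IMG)

# A navigates a new iframe directly to the same URL to time it.
probe = HttpCacheKey(A_SITE, B_SITE, True, IMG)

cache_probe_hits = probe == stored

# Without the boolean, both requests map to the same partition outside the cache.
same_socket_partition = PartitionKey(A_SITE, B_SITE) == PartitionKey(*probe[:2])
```

In this sketch the navigation probe no longer matches the subresource entry in the cache, while the keys used for sockets and DNS remain equal, so a warm connection could still be observable.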

Anne van Kesteren · Answer 31 · Mon Dec 07 2020 18:15:36 GMT+0800 (China Standard Time)

So I'm a bit confused, it sounds like you still plan to use additional keys for at least the HTTP cache. I'm guessing that's mainly for security. It seems worthwhile discussing the details of that further, no?

Josh Karlin · Answer 32 · Mon Dec 07 2020 22:27:00 GMT+0800 (China Standard Time)

Yes, the plan is to use the top-frame site and document site for the http cache, blink's memory cache, and all of the networking stack. As you say the document site gives us security wins as opposed to privacy. If we run into performance issues we may still revert back to just top-frame site.

Definitely worth discussing the details further. Our intent is still to protect frames from each other if possible. One loophole to that scheme was the load event of subframe navigations, as you pointed out. We narrowed that attack by further partitioning subframe navigation resources from subresource requests in the disk cache partitioning scheme only. This means it's still possible to detect if a warm socket or dns cache exists for the request, but at least the most revealing attack (the http cache) is limited in scope.

Gregory Magarshak · Answer 33 · Mon Nov 21 2022 23:21:35 GMT+0800 (China Standard Time)

"https://a.example.com/ embeds https://b.example.com/ where the user logs in. https://b.example.com/ loads some specific images based on the user's logged in state. https://a.example.com/ creates a second frame where it times these images."

If two websites A and B want to collude to transfer information between each other, are you proposing that this can somehow be stopped by triple partitioning? It seems that there are multiple ways for them to achieve their objective:

Just agree to use postMessage between each other
Use HTTP URLs for passing information on how to set up a back-channel, and use a back-channel between their servers to send information to and from them

Josh Karlin · Answer 34 · Mon Nov 21 2022 23:28:10 GMT+0800 (China Standard Time)

@EGreg If two origins on a page wish to exchange data, that's totally fine. We're not trying to prevent that. We're trying to prevent unauthorized/unwanted data leakage between origins on a page, which is a security concern.