shivanigithub / http-cache-partitioning

Alternative triple-keying

annevk opened this issue · comments

Did you consider always using the second-level frame as the third key? https://bugzilla.mozilla.org/show_bug.cgi?id=1558932#c3 outlines rationale for doing it that way.

I don't understand the bug. How does A get to attack B and C's caches? A only gets to attack AA, not AB or AC as they're different partitions.

A can create another frame and load B in there (which will use A+B) and time navigations.

Sorry, I'm still not following. Can you describe the attack you're considering in more detail?

https://a.example.com/ embeds https://b.example.com/ where the user logs in. https://b.example.com/ loads some specific images based on the user's logged in state. https://a.example.com/ creates a second frame where it times these images.

I'm assuming we're partitioning by origin for the sake of this discussion.

An image that B loads will be keyed as (a.example.com, b.example.com, b.example.com/img.png).

If A creates a same-origin frame and loads b.example.com/img.png, it will be keyed as (a.example.com, a.example.com, b.example.com/img.png) and will result in a cache miss.

If A creates another B frame, and the B frame loads the resource again, it will be a cache hit. But that's not something that A can observe.
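The keying in the last three comments can be made concrete with a toy model. This is only a sketch: `PartitionedCache` and its methods are made-up names for illustration, not a browser API, and it assumes partitioning by origin as stated above.

```python
from typing import Dict, Tuple

# Hypothetical key shape for this discussion:
# (top-frame origin, frame origin, resource URL).
CacheKey = Tuple[str, str, str]


class PartitionedCache:
    """Toy cache partitioned by a triple key; not a real browser API."""

    def __init__(self) -> None:
        self._entries: Dict[CacheKey, bytes] = {}

    def store(self, top: str, frame: str, url: str, body: bytes) -> None:
        self._entries[(top, frame, url)] = body

    def is_hit(self, top: str, frame: str, url: str) -> bool:
        return (top, frame, url) in self._entries


A = "https://a.example.com"
B = "https://b.example.com"
IMG = "https://b.example.com/img.png"

cache = PartitionedCache()
# B, embedded in A, loads its image.
cache.store(A, B, IMG, b"image bytes")

# A (or a same-origin frame of A) requests the image: different partition.
a_probe_hit = cache.is_hit(A, A, IMG)
# A second B frame under A requests the image: same partition.
b_frame_hit = cache.is_hit(A, B, IMG)

print(a_probe_hit, b_frame_hit)
```

In this model A's own request misses while the second B frame hits, which matches the two cases described above. The model says nothing about what A can observe through navigation timing.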

I'm talking about A navigating the frame to the image. I guess that could still use A's cache, but not sure if that is what happens in practice (and there might well be issues with embed/object there).

Ah, I see. Yes, good point. IMHO onload/onerror should not be allowed for x-origin navigations by default.

The alternative proposed in the bug doesn't mitigate the problem, it just changes the circumstances in which it can happen.

Frankly, this is a good argument for including the entire ancestor chain in the key.

Hmm, even if we had the entire ancestor chain in the key, a frame could still spy on its child frame by loading resources that it wants to inspect in new frames. So frankly, I think the problem isn't the key here but the fact that we allow x-origin onload/onerror.

It could attack children, but at least not siblings or parents. (There's only load for frames and that leaks in a number of ways so I don't think we can do away with that.)

I want to take a closer look at deprecating load for x-origin frames and adding a random delay before calling it. Maybe years down the road we could phase it out entirely.

> (There's only load for frames and that leaks in a number of ways so I don't think we can do away with that.)

What other leaks are you aware of?

Enumerating frame children in various ways, observing identity changes to the cross-origin Location object (this might be fixable by keeping a single proxy of sorts). Also, I think they currently delay the load event of the parent, so you'd have to change that which might break a lot of scripts in the wild.

Thanks!

My take is that we want to isolate all origins in the page, which is what we get by appending the requesting frame to the key.

Chrome intends to use the same partitioning logic for its entire network stack (e.g., dns, socket pools, session identifiers, etc.). I don't want to compromise the third key for the cache's sake to the detriment of the rest of the stack. Especially since the compromise is due to x-origin leaks that really should be fixed anyway.

Whenever I talk about cache I don't mean the HTTP cache exclusively, to be clear. Also, I don't think you really addressed #2 (comment) thus far.

Because of that we're considering using the entire chain as cache key.

I don't think using top frame and highest child frame helps much. It means second-level frames can attack third-level and below, and vice versa, no?

Using the entire chain doesn't seem to help protect child frames from their parents checking if they've loaded a resource. If a.example iframes b.example, it can just iframe b.example/subresource to see if b.example has loaded that subresource, right? It does provide protections the other way around, though.

Perhaps it would help if you explained the attacks you are interested in addressing. It's hard to evaluate what helps otherwise.

I just did? It's the exact same attack you describe in comment #3. Use an iframe and navigation timing to figure out if a resource is in the cache - I assume you were thinking of subresources loaded by another iframe.

And as I said - "If a.example iframes b.example, it can just iframe b.example/subresource to see if b.example has loaded that subresource, right?" Load timing would presumably look different if it were in the cache vs on disk.

More generally, any frame can poke at subresources its direct descendants load, since it can create a frame using the same key at will, as long as the top-level resource path is not part of the key, and keys are not unique per-frame. (edit: Oops, by "top-level" I meant the frame's src location)

The choices of key being discussed only affect other things - whether/when iframes can poke at the cache of their siblings, parents, uncles, etc. They can always attack their direct child frames, though not necessarily other descendants.

I meant clarifying the benefits of top + current (which I believe you are proposing) over top + highest.

Ah, sorry for not being clear. I wasn't actually advocating any particular solution, just pointing out what attacks are still possible. A parent can always iframe-probe its children, and with the alternative proposed here (top-level + highest-level iframe), children of iframes can still iframe-probe their siblings and their parents. The only things it gets us are that the top-level frame cannot be iframe-probed by its descendants, and that the top-level frame's iframes cannot probe each other (and their descendants can't probe the top-level frame's other iframes, though they can probe the one they appear in). I'm not a fan of making the top two levels of frames magic, unless sites can detect whether or not they're a top-level iframe - and even then, sites would only get protection if they actually detected that they're in a lower-level iframe and guarded against that case, and they'd have to be very careful about who they iframe when they're in an iframe themselves. That having been said, I'm not advocating for a specific alternative.

The thing that top-level frame site+innermost iframe site gets us is that iframes can't probe cross-site iframes directly (with XHRs, image tags, etc) - instead, they can only use iframe timing. I'm not sufficiently familiar with CORS protections to know how much that gets us. But it does indeed provide no protection at all against iframe+timing-style attacks.
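To compare the candidate keys side by side, here is a rough sketch that derives a partition key from a frame's ancestor path (top frame first). The function names and the list-of-sites representation are assumptions made for illustration. Two frames with equal keys share a partition and can therefore probe each other's subresources directly, without any iframe timing.

```python
from typing import Sequence, Tuple


def top_plus_second_level(path: Sequence[str]) -> Tuple[str, ...]:
    """Top frame plus the highest (second-level) frame on the path."""
    second = path[1] if len(path) > 1 else path[0]
    return (path[0], second)


def top_plus_current(path: Sequence[str]) -> Tuple[str, ...]:
    """Top frame plus the innermost (requesting) frame."""
    return (path[0], path[-1])


def full_chain(path: Sequence[str]) -> Tuple[str, ...]:
    """Every frame from the top frame down to the requesting frame."""
    return tuple(path)


# Hypothetical frame tree: a.example embeds b.example, which embeds c.example.
frame_b = ["https://a.example", "https://b.example"]
c_inside_b = ["https://a.example", "https://b.example", "https://c.example"]

for scheme in (top_plus_second_level, top_plus_current, full_chain):
    shared = scheme(frame_b) == scheme(c_inside_b)
    print(f"{scheme.__name__}: c-inside-b shares b's partition: {shared}")
```

Only the top + second-level scheme puts the nested c.example frame in b.example's partition. None of the three schemes stops a parent from creating a child frame with the same key and timing it, which is the iframe-probing case discussed above.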

Here's an idea: Use top-frame + innermost iframe as the key, but for iframe root resources themselves, use a special frame-only key (could just add a bool to the key or something, to make it unique). Iframes could still be used to probe redirects of an iframe's main resource, but they couldn't be used to probe subresources. Could alternatively use the parent frame's key for iframes, but that seems like a bad idea to me, even if I can't find a particular reason to avoid that approach.

Link rel headers received from the iframe root resource would need to be sure to use the key used by subresources.

The big downside is that this requires yet more round trips to open an iframe, if we isolate the socket pools as well. There are mitigations against this, though. Could either preconnect using the subframe key, too, or only respect the additional bool at the cache layer, and not the socket layer, though the latter still allows other frames to probe for live connections to origins that might have been used to request subresources.

Either suggestion here does still allow all frames to probe for cross-site top-level iframe URLs loaded by any other frame - the only real advantage it offers is protecting subresources.

Anyhow, just tossing out an option to consider and knock down as not viable.

@annevk: Is there a more public place to discuss the choice of key? We've decided to go with separating out frame responses from the disk cache, specifically to address the cache-timing issue, though everything else (including sockets) is reused between frame requests and subresource requests.

We're also starting to think we might want to use the entire ordered path to the root frame as the key (merging SameSite children with their parents), which would allow us to describe SameSite:Lax cookies in terms of the Network Partition Key, and give us a consistent (if a bit complicated) story with respect to how partitioned cookies work (both SameSite:Lax and Strict), since it sounds like that's the direction we're going in. We're planning to start the investigation by measuring frame depths, to get a rough idea of how expensive the extra keys will be - probably also a good idea to try to figure out how many more of them there will be.
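One possible reading of "the entire ordered path to the root frame, merging SameSite children with their parents" is to collapse consecutive same-site frames on the path into one entry. The sketch below assumes that reading; `chain_key` is a made-up name and the comment above does not spell out the exact rule.

```python
from itertools import groupby
from typing import Sequence, Tuple


def chain_key(sites_from_root: Sequence[str]) -> Tuple[str, ...]:
    """Ordered path from the root frame, with each run of consecutive
    same-site frames collapsed into a single entry (an assumed rule)."""
    return tuple(site for site, _ in groupby(sites_from_root))


# Hypothetical frame path: a -> a -> b -> b -> a.
path = [
    "https://a.example",
    "https://a.example",
    "https://b.example",
    "https://b.example",
    "https://a.example",
]
key = chain_key(path)
print(key)
```

Under this reading the key length is bounded by the number of site changes along the path rather than by the raw frame depth, which bears on the cost measurement mentioned above.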

This would also let FirstPartySets have a single story for how it works with cookies and everything else.

I think https://github.com/privacycg/storage-partitioning/issues would be the best place in terms of getting input from all the relevant stakeholders. Fetch is probably still the place where we want to standardize it.

I suspect a write-up would help since there are probably a fair number of people coming into it fresh.

Note that I'm not really convinced the current story for cookies is actually sound: httpwg/http-extensions#1288.

I think a write-up of problems and options is a really good idea. I'll put one together when I have time, and file a storage partitioning issue with the privacy CG with a link.

@annevk: Sorry to go back on what I said, but looks like I'm not going to have time to invest in this, as my team's focus is sites tracking users across first party contexts, while any keys beyond top frame site are strictly security/privacy issues related to sites attacking each other within a single first party context.

Thanks for the update! Does that mean Chrome will only start using a single key for now? Should we update Fetch?

I think we'll probably just stick with the current scheme for now, since it does provide some protections from snooping. I'd really like to see a consensus reached before we start applying First Party Sets logic to Network Partition Keys, assuming FPS makes it that far, so I may push for allocating resources on this if we reach that point.

Note that Chrome is still in the midst of launching its cache partitioning. But the key that we're using is <top-frame-origin, document-origin, is-iframe-navigation, url>.

The new part is the is-iframe-navigation boolean, which isolates iframe navigation resources from subresources. It's still possible to snoop on the documents loaded by other frames via the load event, but at least this prevents one from snooping on subresources.

Edit: Note that we're only planning on using this extra boolean in the key in the cache, not the rest of the network stack.
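A minimal sketch of how the extra boolean separates the two kinds of request. The `CacheKey` type and the set standing in for the disk cache are illustrative assumptions, not Chrome's implementation. It also assumes, as suggested earlier in the thread ("which will use A+B"), that a frame navigation is keyed on the destination document.

```python
from typing import NamedTuple, Set


class CacheKey(NamedTuple):
    """Hypothetical tuple mirroring the key fields described above."""
    top_frame_site: str
    document_site: str
    is_iframe_navigation: bool
    url: str


A_SITE = "https://a.example"
B_SITE = "https://b.example"
IMG = "https://b.example/img.png"

# Stand-in for the disk cache: the set of keys that have an entry.
cached: Set[CacheKey] = set()

# B, framed by A, loads the image as a subresource.
cached.add(CacheKey(A_SITE, B_SITE, False, IMG))

# A navigates a new iframe straight to the image URL and times the load.
navigation_probe = CacheKey(A_SITE, B_SITE, True, IMG)
# Another B frame under A loads the image as a subresource again.
subresource_reload = CacheKey(A_SITE, B_SITE, False, IMG)

probe_hits = navigation_probe in cached
reload_hits = subresource_reload in cached
print(probe_hits, reload_hits)
```

The navigation probe misses because its key differs only in the boolean, while an ordinary subresource reload still hits.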

> Note that Chrome is still in the midst of launching its cache partitioning. But the key that we're using is <top-frame-origin, document-origin, is-iframe-navigation, url>.

<top-frame-site, document-site, is-iframe-navigation, url> and site is scheme://etld+1

Ah, thank you Shivani! Yes, schemeful site, not origin.
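For readers new to the term, here is a rough sketch of reducing a URL to a schemeful site (scheme://eTLD+1). The suffix set is a tiny hard-coded stand-in and `schemeful_site` is a made-up helper; a real implementation would consult the full Public Suffix List.

```python
from urllib.parse import urlsplit

# Illustrative subset only (an assumption for this sketch); real code
# needs the complete Public Suffix List.
ILLUSTRATIVE_SUFFIXES = {"com", "org", "co.uk"}


def schemeful_site(url: str) -> str:
    """Return scheme://eTLD+1 for url, using the toy suffix set above."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    # Scan from the longest candidate suffix to the shortest, then keep
    # one extra label to the left of the match (the "+1").
    for i in range(len(labels)):
        if ".".join(labels[i:]) in ILLUSTRATIVE_SUFFIXES:
            start = max(i - 1, 0)
            return f"{parts.scheme}://{'.'.join(labels[start:])}"
    # Unknown suffix: fall back to the full host.
    return f"{parts.scheme}://{host}"


print(schemeful_site("https://a.example.com/"))
print(schemeful_site("https://b.example.com/img.png"))
print(schemeful_site("http://a.example.com/"))
```

Under site keying, the a.example.com and b.example.com hosts used in the earlier attack description reduce to the same site, so that example only describes distinct partitions under the origin-based assumption stated there. The http and https variants remain different sites because the scheme is part of the key.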

So I'm a bit confused, it sounds like you still plan to use additional keys for at least the HTTP cache. I'm guessing that's mainly for security. It seems worthwhile discussing the details of that further, no?

Yes, the plan is to use the top-frame site and document site for the http cache, blink's memory cache, and all of the networking stack. As you say the document site gives us security wins as opposed to privacy. If we run into performance issues we may still revert back to just top-frame site.

Definitely worth discussing the details further. Our intent is still to protect frames from each other if possible. One loophole to that scheme was the load event of subframe navigations, as you pointed out. We narrowed that attack by further partitioning subframe navigation resources from subresource requests in the disk cache partitioning scheme only. This means it's still possible to detect if a warm socket or dns cache exists for the request, but at least the most revealing attack (the http cache) is limited in scope.

> https://a.example.com/ embeds https://b.example.com/ where the user logs in. https://b.example.com/ loads some specific images based on the user's logged in state. https://a.example.com/ creates a second frame where it times these images.

If two websites A and B want to collude to transfer information between each other, are you proposing that this can somehow be stopped by triple partitioning? It seems that there are multiple ways for them to achieve their objective:

  1. Just agree to use postMessage between each other
  2. Use HTTP URLs for passing information on how to set up a back-channel, and use a back-channel between their servers to send information to and from them

@EGreg If two origins on a page wish to exchange data, that's totally fine. We're not trying to prevent that. We're trying to prevent unauthorized/unwanted data leakage between origins on a page, which is a security concern.