patcg / patwg-charter

A repo to discuss the Private Advertising Technology Working Group's charter.

Ben Savage: Need to define Privacy

ekr opened this issue

Note: I am moving gdoc comments into this repo.

Ben Savage writes:

Should we somehow define what we mean by "privacy" here? If we do not, I can foresee potential future problems where a proposal author might assert that a proposal provides "privacy", but others disagree that it does so - due to a lack of a shared definition of what "privacy" means.

He suggests:

"No entity should be able to reconstruct a big chunk of their browsing history". I'll try to add a sentence.

Here is how the CMA and Google dealt with this in their commitments, as background and as a possible model for replication in the charter.

What Google accepted was that it needed to define “Personal Data” with reference to “Applicable Data Protection Legislation” in relation to GDPR. Thus, “Privacy” is not itself defined; instead, “Personal Data” is defined in relation to GDPR via the definition of “Applicable Data Protection Legislation”. In essence, the definition of “Personal Data”, used in conjunction with the definition of “Applicable Data Protection Legislation”, represents the definition of Privacy. Please see below.

  1. Personal Data

“‘Personal Data’ means personal data as defined in the Applicable Data Protection Legislation;” - as defined in Google’s commitments.

In the final commitments, the CMA accepted Google’s definition of “Personal Data” and stated in paragraph 4.47 that: “In the Final Commitments, Google has clarified the definition of ‘Google First Party Personal Data’ by replacing ‘data’ with the defined term ‘Personal Data’. The CMA considers that this clarifies that the data referred to in this definition is Personal Data (as defined under the Applicable Data Protection Legislation) collected from Google’s user-facing services or services on the Android operating system as deployed in smartphones, connected televisions or other smart devices. In accordance with the current legislative framework, this term does not apply to data from which a living individual cannot be identified. The term does not apply to, for example, truly anonymised data.”

  2. Applicable Data Protection Legislation

“‘Applicable Data Protection Legislation’ means all applicable data protection and privacy legislation in force in the UK, including the Data Protection Act 2018, the UK General Data Protection Regulation (and regulations made thereunder) and the Privacy and Electronic Communications (EC Directive) Regulations 2003;” - as defined in Google’s commitments.

In the final commitments the CMA accepted that the definition offered by Google was sufficient – and in paragraph 4.70 it stated that: “The CMA considers that a clarification to the definition of ‘Applicable Data Protection Legislation’ is not necessary. The current wording of the definition allows it to apply to both future updates and other relevant legislation.”

For the avoidance of doubt, the Privacy Principles produced by the TAG, as currently drafted, would not be an acceptable reference point, as the document does not align with laws but seeks to define quasi-laws. In that regard its inclusion would result in a similar issue to #10.

I would like to hear from an official agent of the UK CMA (or its recently announced "Monitoring Trustee", ING Bank N.V.) what the UK CMA views as its role in PATCG and privacy-related standards proposals and W3C Community Group reports that do not originate from Google. It would be incredibly helpful if PATCG participants could understand from the UK CMA (or ING Bank N.V. as appropriate) directly whether the UK CMA intends to take a proactive and/or gatekeeping role in PATCG proceedings.

Until then I don’t think there’s any reason to keep invoking CMA on so many threads, @jwrosewell. You’ve made your position clear.

Yes, it's very difficult to try to incorporate a CMA suggestion without the CMA making that suggestion. The chairs are, of course, open to speaking with and hearing directly from the CMA, or to hearing from W3C management about a directive they are passing along. Without one of those things happening, I don't think we can consider the CMA's definitions in this regard as anything other than a factor that contributes to our decision on definitions, which I always appreciate, but it is, in this respect, a factor, not the whole.

In relation to this issue, I'm providing analysis of the agreement between Google and the CMA, and of the approach they took to define privacy, to highlight how some of the most current thinking is evolving and to illustrate the importance and complexity of resolving this issue. I'm not suggesting that the CMA get involved in addressing the specific PAT WG charter draft; however, the chairs or Google might wish to request their input directly.

BTW @alextcone, I believe ING will be monitoring, including Google's involvement in this group, but not actively getting involved.

It would be inappropriate for a Working Group to operate in a way that contravened TAG principles. If there are problems with TAG-applicable documents, the right place to take that up is the TAG. Charters don't get to decide whether TAG principles apply or not.

The terminology and details vary between jurisdictions, but norms form a hierarchy in which typically each layer cannot break what is required by more fundamental layers but always adds its own constraints. Laws can't be unconstitutional but are more restrictive than constitutions, regulations cannot be illegal but are more restrictive than laws, etc. Standards are just another layer. They can't break more fundamental layers: no standard could establish that it's OK to kill people. And they add their own restrictions: the text `>p<` will not generate a `p` element. And this pattern repeats inside standards organisations: groups operate inside the boundaries set by the architectural oversight bodies, and add restrictions of their own. Standards are built from MUST clauses. A standard that added no restrictions beyond the law would be effectively empty.

I agree with @darobin here: we are already working within W3C principles and cannot contravene them as part of the W3C. Dealing with those principles and suggesting changes to them can be done at the appropriate level, which is not here.

As for defining privacy within this document, I do think we could add a link to the privacy principles document once that is created; that might be a useful way to go?

A standard that added no restrictions beyond the law would be effectively empty.

A standard that helped enforce the law would align with the law and be anything but empty. A standard that via MUST clauses created new quasi-laws risks unintended consequences such as restricting competition or choice.

I do think we could add a link to the privacy principles document.

The privacy principles document as currently drafted defines quasi-laws and is anti-competitive in a similar way to the S&P questionnaire. I do not consider either document the basis for work to bring genuine improvements to advertising and privacy.

I'm not sure that there is much point in furthering this discussion. A standard is defined by the set of normative statements it contains — everything else is fluff. A standard without MUST statements isn't a standard, it's just someone's opinion. If you want standards not to be standards, or not to work like standards, I don't know who to refer you to, but that's certainly not something that this group could change, even if it wanted to.

I have yet to see any argument or evidence to support your repeated accusations against the questionnaire and the principles, but if you have arguments or evidence that you've chosen not to share, the right place to do so is with the TAG.

I have yet to see any argument or evidence to support your repeated accusations against the questionnaire and the principles

@darobin Re: Security and Privacy Questionnaire. Information has been provided to the TAG and AC which you have had the opportunity to see. In relation to the FO, extensive information has been provided to W3C Management, and the W3C Director, by lawyers qualified in the field. We await the Director's decision, which I sincerely hope will involve the creation of a Legal Advisory group to enable a group of qualified lawyers (not you or me) to inform the rest of the W3C about the competition impact of their work and make recommended modifications. You and I can then defer to the Legal Advisory group and focus our energies on other aspects of the web. In relation to the Privacy Principles, I stopped monitoring them when it was clear the creation of quasi-laws is very much the goal of the self-appointed group, and as such I await a version the group considers ready for review before commenting in further detail.

I think that statements can be supported either with simple reasoning, or with reference to supporting research/evidence, or, ideally, both.

I am aware of all of the information you provided to the AC and TAG. None of it features either reasoning or research that supports your claims. Now you hint at "extensive information" that appears to have been provided under the cover of secrecy to just some people. If that "extensive information" actually supports your point of view, why keep it secret?

It seems pretty intuitive that we get a more level playing field if third parties get access to the same data as first parties instead of more. As documented under public scrutiny, I have found a lot of research and policy work supporting such a conclusion (and hope to soon have time to summarise more of the field).

The TAG is not "self-appointed" but rather a mix of members elected by the Membership and nominated by the Director. With that legitimacy it has appointed a task force to produce a draft that will go through the same kind of broad and stringent public review that supports the questionnaire.

You keep repeating "quasi-law, quasi-law, quasi-law" — the word you're looking for is "standard." What you're describing is a standard. Calling it a standard when it aligns with your personal preferences and a "quasi-law" when it doesn't isn't an argument, it's just a rhetorical trick.

For those who are curious to understand how standards work in this respect, a short primer. A key expectation of a standard is that you should be able to give it to two separate groups of people who don't communicate and they will produce compatible outcomes. This applies to bits-on-the-wire standards of course, but also to broader architectural standards (since they are intended to help produce compatible output from working groups). This means that they have to be as clear and as clear-cut as you can make them. These aren't advice columns that each and every one of us can take to heart in our own personal way. Wiggle room in standards is always a failure (even if sometimes unavoidable because the world is complicated).

This is captured through the classic RFC2119 keywords MAY, MUST, and MUST NOT (SHOULD and SHOULD NOT indicate wiggle room, and are often frowned upon — though slightly less so in architectural standards). The standards community has aligned on these because they work, and also because they can be formalised using (one of the forms of) deontic logic. I won't bore you all with the details, but one thing worth noting is that, assuming you can use negation, then MAY/MUST/MUST NOT are mutually definable (e.g. MUST(a) = ¬MAY(¬a)), so that you can't have, for instance, a system with just MAY and no form of MUST. Formally, applying these "deontics" to statements about a technical system is to specify its behaviour, and when such a specification is accepted in a given institutional context, that's a standard.
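A minimal sketch of that interdefinability, written in the usual deontic notation (standard equivalences, with O for obligation/MUST and P for permission/MAY; the notation is shorthand, not text from RFC 2119):

```latex
% Standard deontic-logic equivalences behind the RFC 2119 keywords
% (O = obligatory / MUST, P = permitted / MAY); a sketch, not normative text.
\begin{align*}
  \mathrm{MUST}(a)        &\;\equiv\; O(a)        \;\equiv\; \lnot P(\lnot a) && \text{(not permitted to do otherwise)} \\
  \mathrm{MUST\ NOT}(a)   &\;\equiv\; O(\lnot a)  \;\equiv\; \lnot P(a)       && \text{(not permitted at all)} \\
  \mathrm{MAY}(a)         &\;\equiv\; P(a)        \;\equiv\; \lnot O(\lnot a) && \text{(no obligation to refrain)}
\end{align*}
```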

Those of you who are more into humanities might have noticed that the deontics used in standards are the same as those used in institutional analysis. That's not a coincidence: standards are a form of institutional governance and they very often operate like a commons. That is certainly the case with the Web.

So yes, all standards create rules. Calling them "quasi-laws" is just inventing a boogeyman by describing what they do. Picking which rules matters. That's why we do it in public, with values, with arguments, with evidence, with research, with broad feedback and not with harassment, not with secret "extensive information," not with anonymous supporters, not with made-up boogeyman words.

A few points in response to @darobin.

None of it [the information provided to TAG and AC] features either reasoning or research that supports your claims

Suffix the sentence with “that Robin is prepared to acknowledge”.

Others have engaged with the substance, and many agree with it.

As one example of research related to the need for W3C to upgrade its antitrust provision, https://github.com/w3c/AB-memberonly/issues/88 (which is member-only visible), here is a list of the citations provided.

  • BUSINESSEUROPE: Business compliance with competition law, November 2011
  • International Chamber of Commerce (ICC): Toolkit for SMEs, Why complying with competition law is good for your business, 2015
  • International Chamber of Commerce (ICC), Commission on Competition: Antitrust Compliance Toolkit, Practical antitrust compliance tools for SMEs and larger companies, 22 April 2013
  • International Chamber of Commerce (ICC): Compliance as an antitrust law enforcement tool
  • International Chamber of Commerce (ICC): Promoting antitrust compliance: the various approaches of national antitrust authorities, April 2011
  • The Chief Legal Officers Round Table (CLO) compliance working group: CLO Compliance Blue Print, endorsed by the ICC, July 2010

National competition authorities:

  • Autorité de la Concurrence: La prévention des infractions: les programmes de conformité aux règles de concurrence, February 2012 (framework document and brochure, each available in French and English)
  • Office of Fair Trading (OFT): Guidance on competition compliance, June 2011

why keep it secret?

Because the details relate to the governance of W3C and legal matters which I consider are better dealt with between the W3C Director, W3C Management, the Hosts, and the raiser of the FO, rather than in the public domain. There is also the continued request by W3C Counsel not to mention legal matters and the possible CEPC interpretations associated with such matters. It is therefore a matter for W3C Management to decide if they wish to publish the information provided to them, along with their response. I respect that this is their choice, and it is common when dealing with disputes of this nature. @darobin attempts to discredit me by suggesting this is uncommon.

Re: Privacy Taskforce

The “usual suspects” might be a better choice of words than “self-appointed”. Let’s look at the member affiliation groups of the Privacy Taskforce.

  • Browser vendors: 4
  • Large publishers: 2
  • Privacy advocates: 3
  • Others: 1
  • Law makers: 0
  • Lawyers: 0
  • Small publishers: 0
  • Advertisers: 0
  • AdTech: 0
  • Total members: 10

Just like Movement for an Open Web I would like to judge the outcome of the work rather than the identity of the participants that created it. Unfortunately, the direction to date is not encouraging.

Re: Standards and Quasi-Laws

When I use the term “quasi-law” I’m not referring to technical standards for interoperability. That is a mischaracterisation @darobin introduces in an attempt to discredit me. Clearly the alignment of bits in a communication protocol has to be agreed upon in order for it to work. I’m referring to documents like the Privacy Principles, which define quasi-laws by treating the law as a floor, not a ceiling. No person is above or below the law. It is not the role of @darobin, or others on the Privacy Taskforce, to make laws for the rest of the web. When they do, those rules become “quasi-laws” unrelated to technical interoperability. I trust this explanation clarifies my meaning. No bogeymen here.

None of it [the information provided to TAG and AC] features either reasoning or research that supports your claims

Suffix the sentence with “that Robin is prepared to acknowledge”.
Others have engaged with the substance, and many agree with it.

I note that once again these people who have "engaged with the substance" and "many agree with it" are an anonymous group.

As one example of research related to the need for W3C to upgrade its antitrust provision, https://github.com/w3c/AB-memberonly/issues/88 (which is member-only visible), here is a list of the citations provided.

Let's roll back the discussion here:

  1. You repeat (again) your claim that the S&P questionnaire is anti-competitive: "is anti competitive in a similar way to the S&P questionnaire."
  2. I simply point out that you have yet to back those accusations: "I have yet to see any argument or evidence to support your repeated accusations against the questionnaire."
  3. You claim that that information exists and even that I have seen it: "Security and Privacy Questionnaire. Information has been provided to TAG and AC which you have had the opportunity to see."
  4. I point out, quite simply, that none of what you have provided supports your claims: "None of it features either reasoning or research that supports your claims."

And in response to this you provide… a completely unrelated list of documents about general-purpose antitrust compliance.

You're trying to frame this as something that somehow I personally am being difficult about, that I am not "prepared to acknowledge" your material. But… you literally just produced a list of completely unrelated documents and changed the topic.

Re: Privacy Taskforce

  • Browser vendors: 4
  • Large publishers: 2
  • Privacy advocates: 3
  • Others: 1
  • Law makers: 0
  • Lawyers: 0
  • Small publishers: 0
  • Advertisers: 0
  • AdTech: 0
  • Total members: 10

I'm not even going to bother engaging here, but that categorisation/count is wildly inaccurate.

Re: Standards and Quasi-Laws

When I use the term “quasi-law” I’m not referring to technical standards for interoperability.

Interoperability isn't just bit for bit within one standard. Those of us who buy and sell ads on the Internet need to coordinate any number of technologies in order to succeed. Ensuring that the stack operates according to coherent rules is key to ensuring successful interoperation.

In fact, that's the situation we have with the legacy system. It has rules like:

  • Third parties MUST be able to recognise people across contexts.
  • Third parties MUST be allowed to inject arbitrary code into first parties.
  • Browsers MUST NOT enforce purpose limitations.

These rules are what coordinate the many complex pieces in play in legacy advertising. The second you start to change these rules, as we have seen very clearly over the past years, things stop working seamlessly together. These rules also have consequences (supply chains cannot be trusted, fraud, malvertising, etc.).

You're trying to draw a line such that rules you don't like are called "quasi-laws" but rules you like (eg. #16 that states that personal data MUST be made freely available) are somehow magically exempt from this classification.

We pick rules because, again, that's what standards are. We pick them explicitly so that they can be reviewed, not by pretending that they aren't rules.

@ekr - thanks for opening this issue.

I have been thinking about potential wording for this which we could put into the charter, and here's a proposal:

"Private Advertising Technologies standardised by the working group should minimally not enable either unwanted cross site recognition or unwanted same-site recognition."

unwanted site-site recognition."

I assume this was intended to be "same-site" and not "site-site"

Not enabling unwanted cross-site recognition seems aligned with Mozilla's and Safari's anti-tracking policies as well as Chrome's proposed privacy model for the web.

Preventing same-site recognition goes a step beyond that (though perhaps unwanted could be seen as roughly equivalent to covert in Safari's policy). Maybe this is a good addition, but that's probably a good debate to have.

Preventing same-site recognition goes a step beyond that (though perhaps unwanted could be seen as roughly equivalent to covert in Safari's policy). Maybe this is a good addition, but that's probably a good debate to have.

I just want to clarify one bit, the text that Ben linked to effectively reads as "we should not create new supercookies", and I certainly agree with that. But I want to be explicit that I don't think we have the goal of preventing same-site recognition between times a user clears their cookies and caches (as opposed to across clears).

Thanks @bslassey for catching that typo! Yes, I intended to say "unwanted same-site recognition". Comment edited.

Yeah, I think we are saying the same thing here:

By "unwanted same-site recognition" I was specifically referring to re-identifying users across clears of cookies and caches.

In IPA for example, this is an explicit goal of ours. If someone chooses to clear their cookies and caches, we do not want IPA to provide a vector of re-identification which would enable the site to realise it's the same person.

I think this is aligned with what you're saying - right?
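To make that property concrete, here is a minimal hypothetical sketch in TypeScript (the interface and method names are invented for illustration and are not IPA's actual API): any site-readable value that survives clearing cookies and caches would act as exactly the re-identification vector being discussed.

```typescript
// Hypothetical illustration only: `HypotheticalAdApi` and `getDurableToken` are
// invented names, not part of IPA or any real proposal.
interface HypotheticalAdApi {
  // A site-readable value. If it survives the user clearing cookies and caches,
  // it functions as a supercookie.
  getDurableToken(): Promise<string>;
}

// If the value read before a state clear matches the value read afterwards, the API
// lets the site realise it is the same person, which is the re-identification vector
// the proposed charter text would rule out.
async function relinksAcrossClear(
  before: string,
  api: HypotheticalAdApi
): Promise<boolean> {
  const after = await api.getDurableToken(); // read again after the user cleared state
  return before === after;
}
```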

The adjective “unwanted” is going to end up carrying more water than you might expect or…want. Determining and subsequently demonstrating whether something is wanted or not is less than straightforward and, as it relates to digital advertising, explosive and in flux. I suggest removing “unwanted”, or at least not using a euphemism for consent, which this group is not scoped to design for in any holistic manner that is consistent across all data processing. User controls are certainly important to consider as we advance designs for individual purpose-limited APIs. For example, I think @benjaminsavage’s example that IPA shouldn’t work if user agent storage or device storage is cleared might be something we choose to standardize. But we can do that without adding ambiguity to the charter.

Does unsanctioned work better?

@alextcone - you're right. I think it's probably better to be specific.

Perhaps this:

"Private Advertising Technologies standardised by the working group should minimally not enable either cross site recognition or same-site recognition which links identity across the clearing of cookies and caches."

@benjaminsavage How about "facilitate" or "contribute to" instead of "enable"? (A technology that gets an attacker a significant step toward either of these would still be a problem.)

In IPA for example, this is an explicit goal of ours. If someone chooses to clear their cookies and caches, we do not want IPA to provide a vector of re-identification which would enable the site to realise it's the same person.

I think this is aligned with what you're saying - right?

Yes, it sounds like we're aligned. Thanks for the clarification.

In response to @martinthomson's question:

Does unsanctioned work better?

I certainly do not think we should insert a new term in this group's charter when "unsanctioned" is what TAG uses. Though I think over time we're likely to see evolution in even the term "unsanctioned" which to me still has an aura of passing a lot of requirements off on the end user and could be difficult to affirm. Anyway, I think if we have to use any word it should be in sync with TAG.

In response to @benjaminsavage's proposed language change:

"Private Advertising Technologies standardised by the working group should minimally not enable either cross site recognition or same-site recognition which links identity across the clearing of cookies and caches."

The same-site clause above seems very tied to current technologies like cookies and caches even if you are using the term "like" as a way to say "this is just an example." Perhaps something like this?

"Private Advertising Technologies standardised by the working group must keep what can be learned about an individual user over time to same-site (or app in the event these standards are adopted by operating systems) and respect user-level controls."

I think that "cross-site" needs a hyphen.

is this the only privacy-relevant thing that needs to be captured in the charter? or are we going to try to recapitulate the entire privacy principles thing?

Looking forward to the PR on this, thanks!

I don't think it's wise to pick two small privacy-relevant threats as the only parts of privacy that will be relevant to the Working Group's work. Certainly in conducting privacy reviews of this work, I'm not going to consider only two very narrow pieces of a larger threat model.

Privacy does not mean just limiting cross-site or same-site tracking. Privacy is not just protection from unsanctioned tracking as previously described by the TAG: that doc also wasn't trying to set out a comprehensive definition of privacy violations, it was responding to a very particular subset of tracking done without cookies because of the lack of transparency/control by the end user.

And I don't think it's wise for the charter to attempt a comprehensive consensus definition of every relevant part of privacy to advertising technology. That's not generally the purpose of charters, and it would foreclose on any of the patcg deliverables on how privacy applies to advertising technology.

I agree with @npdoty and strongly favor having the charter only note that the intent is for features to be privacy-preserving and that they provide appropriate privacy guarantees, as I suggest here.

We are most certainly going to want to identify and explore privacy threats, but I think that is appropriately done in the context of use-cases or proposals or documents dedicated to the subject, not the charter.

With respect to the more general question about defining privacy which prompted this issue, I suspect that will be a very contentious endeavor, because whether or not something is private tends to be very subjective and context specific. On the other hand, the notion of privacy seems rooted in the sharing of information which is an objective function, so perhaps we would do better to focus on information flows and providing means of classifying and controlling them, rather than attempting to discern which of them is privacy invasive.

I don't think it's wise to pick two small privacy-relevant threats as the only parts of privacy that will be relevant to the Working Group's work. Certainly in conducting privacy reviews of this work, I'm not going to consider only two very narrow pieces of a larger threat model.

Privacy does not mean just limiting cross-site or same-site tracking. Privacy is not just protection from unsanctioned tracking as previously described by the TAG: that doc also wasn't trying to set out a comprehensive definition of privacy violations, it was responding to a very particular subset of tracking done without cookies because of the lack of transparency/control by the end user.

And I don't think it's wise for the charter to attempt a comprehensive consensus definition of every relevant part of privacy to advertising technology. That's not generally the purpose of charters, and it would foreclose on any of the patcg deliverables on how privacy applies to advertising technology.

This is good feedback, and I agree.

I left a comment on @bslassey's pull request suggesting this language instead:

The purpose of these features is to support web advertising in private ways. Here "private" refers to appropriate processing of personal information. Examples of ways in which new web platform features might enable inappropriate processing include (but are not limited to):

  • enabling cross-site recognition of users
  • enabling same-site recognition of users across the clearing of state or data