qascade / dcr

A PoC framework to orchestrate interoperable Differentially Private Data Clean Room Services using Intel SGX hardware as root of trust.

feat: add an example of a join query with a GROUP BY, implemented as a confidential Go app.

qascade opened this issue

Description

The existing use case only runs a simple count query without any partitions. I want to add an example with partitions that properly demonstrates the maxContributionsPerUsers option in Google's differential privacy definition.
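To make the role of a per-user contribution cap concrete, here is a minimal sketch in plain Go. It assumes that maxContributionsPerUsers limits how many distinct partitions (here, pet kinds) a single customer may contribute to; that reading is an assumption, not something confirmed by the dcr code. The `Row` type and `boundContributions` function are hypothetical names used only for illustration. The sketch keeps the first partitions it sees per user, whereas a real differential privacy library would typically sample them at random and then add calibrated noise to each count.

```go
package main

import "fmt"

// Row is a hypothetical record shape: one customer and one kind of pet.
type Row struct {
	Email string
	Pets  string
}

// boundContributions keeps at most maxPartitions distinct partitions per
// email and drops duplicate (email, partition) pairs. Assumption: this is
// the effect the maxContributionsPerUsers option is meant to have.
func boundContributions(rows []Row, maxPartitions int) []Row {
	kept := make(map[string]map[string]bool) // email -> partitions already kept
	var out []Row
	for _, r := range rows {
		parts := kept[r.Email]
		if parts == nil {
			parts = make(map[string]bool)
			kept[r.Email] = parts
		}
		if parts[r.Pets] {
			continue // same user and partition already counted once
		}
		if len(parts) >= maxPartitions {
			continue // user has reached the contribution cap
		}
		parts[r.Pets] = true
		out = append(out, r)
	}
	return out
}

func main() {
	rows := []Row{
		{Email: "a@x.com", Pets: "dog"},
		{Email: "a@x.com", Pets: "cat"},
		{Email: "a@x.com", Pets: "fish"},
		{Email: "b@x.com", Pets: "cat"},
	}
	// With a cap of 2, a@x.com's third partition ("fish") is dropped.
	for _, r := range boundContributions(rows, 2) {
		fmt.Println(r.Email, r.Pets)
	}
}
```

Without such a cap, one customer who appears in many partitions could shift many counts at once, which is why a partitioned query exercises this option while the simple count does not.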

How to do this?

Use the same media/advertiser/research dataset for simplicity. You are free to imagine a different scenario, but then you will have to generate your own datasets.

You can run a query such as: what are the common customers, grouped by the kind of pet they have? The output should then be:
Private count of customers who have dogs, private count of customers who have cats.

In SQL terms, the query should look like:

SELECT
    ac.pets,
    COUNT(DISTINCT ac.email) AS count_common_customers
FROM
    media_customers mc
INNER JOIN
    airline_customers ac ON mc.email = ac.email
GROUP BY
    ac.pets;

The end result should be a Confidential GoApp, with the appropriate YAML modifications, that behaves exactly as the above query would.
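For orientation, the non-private logic of the SQL query can be sketched in plain Go as below. This is only an illustration of the join, group-by, and distinct-count steps; it does not use the dcr Go template or Google's differential privacy library, and it adds no noise. The `Customer` type and `countCommonByPet` function are hypothetical names, and the sample emails are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// Customer is a hypothetical row shape; the real datasets may differ.
type Customer struct {
	Email string
	Pets  string
}

// countCommonByPet mirrors the SQL query: inner join on email, group by
// the airline-side pets column, and count distinct emails per group.
func countCommonByPet(media, airline []Customer) map[string]int {
	inMedia := make(map[string]bool, len(media))
	for _, m := range media {
		inMedia[m.Email] = true
	}
	seen := make(map[string]map[string]bool) // pets -> set of joined emails
	for _, a := range airline {
		if !inMedia[a.Email] {
			continue // no match in media_customers, so the join drops it
		}
		if seen[a.Pets] == nil {
			seen[a.Pets] = make(map[string]bool)
		}
		seen[a.Pets][a.Email] = true // the set gives COUNT(DISTINCT ...)
	}
	counts := make(map[string]int, len(seen))
	for pet, emails := range seen {
		counts[pet] = len(emails)
	}
	return counts
}

func main() {
	media := []Customer{
		{Email: "a@x.com"}, {Email: "b@x.com"}, {Email: "c@x.com"},
	}
	airline := []Customer{
		{Email: "a@x.com", Pets: "dog"},
		{Email: "a@x.com", Pets: "dog"}, // duplicate row, counted once
		{Email: "b@x.com", Pets: "cat"},
		{Email: "c@x.com", Pets: "dog"},
		{Email: "d@x.com", Pets: "cat"}, // not a common customer
	}
	counts := countCommonByPet(media, airline)
	pets := make([]string, 0, len(counts))
	for p := range counts {
		pets = append(pets, p)
	}
	sort.Strings(pets) // map iteration order is random, so sort for stable output
	for _, p := range pets {
		fmt.Printf("%s: %d\n", p, counts[p])
	}
}
```

In the confidential app, each exact per-pet count would be replaced by a private count produced by the differential privacy library, after per-user contributions have been bounded.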

Hey,
I want to work on this issue. Could you assign it to me under GSSoC'23?

@Sarthak027 Awesome. Assigning this to you. You will have to understand Google's differential privacy definition and how it works. Please note that the SQL is just for understanding; you will have to write a Go template that compiles and does the same thing as the SQL query. Let me know if you have any further questions. Name the branch feat.join_goapp.

Google Repo: https://github.com/google/differential-privacy
Video: https://youtu.be/1F6pRMVGWdc
Paper: https://arxiv.org/abs/1909.01917