reneroliveira / fhe-and-statistics

Undergraduate Thesis - A survey on fully homomorphic encryption (FHE) with statistical applications

Home Page: https://hdl.handle.net/10438/33823

A survey on fully homomorphic encryption with statistical applications

Undergraduate Thesis - Rener Oliveira

Abstract

The amount of data generated by individuals and enterprises has grown exponentially over the last decades, which empowers the use of machine learning methods: for statistical purposes, the more data a model has access to, the more accurately it can predict or represent reality. A problem emerges when the model must deal with sensitive data such as medical records, financial history, or genomic data, for which additional care must be taken to protect the privacy of data owners. Encrypting sensitive data might appear to be a good solution at first sight, but it can considerably limit the ability to do statistical analysis. This work is a survey of Fully Homomorphic Encryption (FHE), a special kind of cryptographic scheme that still permits some machine learning methods to run over encrypted data while providing strong mathematical guarantees of privacy protection.
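
To make the idea of computing on encrypted data concrete, here is a minimal sketch that is not taken from the thesis. It uses textbook (unpadded) RSA with tiny hypothetical parameters, which is homomorphic for multiplication only. It is neither fully homomorphic nor secure; it only illustrates the property that an FHE scheme extends to both addition and multiplication.

```python
# Toy illustration only: unpadded RSA with tiny primes (assumed values, insecure).
# It shows a multiplicative homomorphism, not a fully homomorphic scheme.
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (requires Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < n with the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key (d, n)."""
    return pow(c, d, n)


a, b = 7, 12
# Multiply the ciphertexts without ever decrypting the inputs.
c_prod = (encrypt(a) * encrypt(b)) % n
# Decrypting the product of ciphertexts yields the product of plaintexts (mod n).
assert decrypt(c_prod) == (a * b) % n
```

The party doing the multiplication never sees `a` or `b`, only their ciphertexts.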

Contents

  1. INTRODUCTION
  2. ALGEBRAIC REVIEW
    • 2.1 - Basic structures
    • 2.2 - Homomorphisms and Quotient Rings
    • 2.3 - Cyclotomic polynomials
    • 2.4 - Lattices
      • 2.4.1 - Lattice Problems
      • 2.4.2 - Ring Learning with Errors
  3. FULLY HOMOMORPHIC ENCRYPTION
    • 3.1 - Privacy Homomorphisms
      • 3.1.1 - Requirements and Limitations
    • 3.2 - Bootstrappable encryption
      • 3.2.1 - Overview and Bootstrapping
      • 3.2.2 - An integer scheme
      • 3.2.3 - Practical considerations and further research
    • 3.3 - FHE over the complex numbers
      • 3.3.1 - Encoding and Decoding
      • 3.3.2 - Encryption, Decryption, and Relinearization
      • 3.3.3 - Approximate Bootstrapping
  4. PRIVATE LOGISTIC REGRESSION
    • 4.1 - Statistical Review
    • 4.2 - Homomorphic Training
      • 4.2.1 - Ciphertext packing and data representation
      • 4.2.2 - Batch Inner Product
    • 4.3 - Data Applications
  5. CONCLUSIONS AND FURTHER WORK
  6. APPENDIX A - AN IDEAL LATTICE SCHEME
    • A.1 Initial definitions
    • A.2 Abstract construction
    • A.3 Concrete construction using ideal lattices

Results

Summarized results of encrypted logistic regression training.

  • Dataset: subset of the TissueMNIST dataset (92,672 rows and 196 columns)
  • Machine: 64-bit quad-core Intel Core i5-6200U 2.3 GHz CPU, 16 GB of RAM
  • CKKS parameters: $N=2^{15}$, $q=2^{45}$
  • FHE performance:

| KeyGen time | Encrypt time | Training time | Public key size |
|---|---|---|---|
| 8.52 min | 19.17 min | 6.19 hours | 2.62 GB |

  • Model performance:

| Metric | Encrypted | Unencrypted |
|---|---|---|
| Accuracy | 64.6211% | 64.6363% |
| AUC | 81.7039% | 81.6996% |
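
As a rough sense of scale for these parameters, the sketch below computes how many CKKS ciphertexts the dataset would need at minimum. It relies on the standard CKKS fact that a ring of degree $N$ offers $N/2$ plaintext slots, and it assumes one real value per slot with perfectly dense packing. The packing layout actually used in the thesis is not stated here, so the result is only a lower bound, and the function names are hypothetical.

```python
def ckks_slot_count(ring_degree: int) -> int:
    """Number of plaintext slots in one CKKS ciphertext: N / 2."""
    return ring_degree // 2


def min_ciphertexts(rows: int, cols: int, ring_degree: int) -> int:
    """Lower bound on ciphertexts needed, assuming dense packing (an assumption)."""
    slots = ckks_slot_count(ring_degree)
    total_values = rows * cols
    # Ceiling division without floating point.
    return -(-total_values // slots)


ring_degree = 2 ** 15          # N from the results above
rows, cols = 92672, 196        # dataset shape from the results above

print(ckks_slot_count(ring_degree))             # 16384
print(min_ciphertexts(rows, cols, ring_degree)) # 1109
```

A layout that pads rows or columns to a power of two would need more ciphertexts than this bound.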
