ShokofehVS / SecBic-BCCA

A series of homomorphically encrypted biclustering algorithms (SecBic), here the Bi-Correlation Clustering Algorithm (BCCA), using the CKKS scheme with the Pyfhel library over gene expression data sets

SecBic-BCCA

SecBic-BCCA (Secured Biclusterings, Bi-Correlation Clustering Algorithm) performs privacy-preserving gene expression data analysis: it runs the bi-correlation clustering algorithm over gene expression data using Homomorphic Encryption operations such as sum and matrix multiplication. It is written in Python and released under the MIT license.

We use Pyfhel, a Python wrapper for the Microsoft SEAL library, on top of the existing implementation of the algorithm in biclustlib.

Installation

First, make sure that all required packages are installed (see requirements.txt):

  • numpy>=1.23.1
  • setuptools>=65.5.0
  • pandas>=1.5.0
  • scikit-learn>=1.1.1
  • Pyfhel>=3.3.1
  • matplotlib>=3.5.2
  • scipy>=1.9.0
  • munkres>=1.1.4

You can clone this repository:

   > git clone https://github.com/ShokofehVS/SecBic-BCCA.git

To install any missing dependencies, run:

   > pip install -r requirements.txt

Once all dependencies are installed, install the package itself:

   > pip3 install .

To install Pyfhel on Linux, gcc 6 and Python 3.5+ should be installed. See the Pyfhel documentation for more information on installation.

   > apt install gcc

Biclustering Algorithm

Biclustering, the simultaneous clustering of both genes and conditions, was introduced as a new paradigm by Cheng and Church's Algorithm (CCA). A bicluster is a subset of genes together with a subset of conditions that has a high similarity score, where the score measures the coherence of the genes and conditions in the bicluster. The algorithm returns the list of biclusters found in the given data set.
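
To make the idea of a coherence score concrete, here is a minimal sketch on plaintext data. It assumes the score is the average pairwise Pearson correlation between the selected genes across the selected conditions; this is an illustrative choice suggested by the name "bi-correlation" and is not necessarily the exact score used in this repository. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def mean_pairwise_correlation(data, rows, cols):
    """Average Pearson correlation over all gene pairs in a bicluster.

    Illustrative score only (an assumption, not necessarily the score
    used in this repository). Requires at least two genes in `rows`.
    Values near 1 mean the selected genes rise and fall together
    across the selected conditions.
    """
    sub = data[np.ix_(rows, cols)]            # genes x conditions submatrix
    corr = np.corrcoef(sub)                   # gene-by-gene correlation matrix
    upper = np.triu_indices(len(rows), k=1)   # each unordered gene pair once
    return float(corr[upper].mean())

# Toy expression matrix: 4 genes (rows) x 5 conditions (columns).
data = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],   # scaled copy of gene 0
    [5.0, 4.0, 3.0, 2.0, 1.0],    # reversed pattern of gene 0
    [0.5, 1.5, 2.5, 3.5, 4.5],    # shifted copy of gene 0
])

all_conditions = [0, 1, 2, 3, 4]
coherent = mean_pairwise_correlation(data, [0, 1, 3], all_conditions)
mixed = mean_pairwise_correlation(data, [0, 1, 2], all_conditions)
print(coherent, mixed)
```

Genes 0, 1 and 3 follow the same trend, so their score is close to 1, while swapping in the reversed gene 2 pulls the average down. A biclustering algorithm would keep the first group and reject the second.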

Gene Expression Data Set

Our input data is the yeast Saccharomyces cerevisiae cell cycle data set from Tavazoie et al. (1999), which was used in the original study by Cheng and Church.

External Evaluation Measure

To measure how similar the biclusters computed on encrypted data are to those of the non-encrypted version, we use the Clustering Error (CE), an external evaluation measure proposed by Patrikainen and Meila (2006).
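
The following is a minimal sketch of how CE can be computed, not the implementation used in this repository. It assumes each bicluster is given as a pair of row and column index lists, and it uses the usual form CE = (|U| - d_max) / |U|, where |U| is the size of the union of both biclusterings (cells counted with multiplicity) and d_max is the number of shared cells under the best one-to-one matching of biclusters. The function names are hypothetical, and the matching here uses `scipy.optimize.linear_sum_assignment` rather than munkres.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def _coverage(biclusters, shape):
    """Count, for every matrix cell, how many biclusters contain it."""
    counts = np.zeros(shape, dtype=int)
    for rows, cols in biclusters:
        counts[np.ix_(rows, cols)] += 1
    return counts

def clustering_error(found, reference, shape):
    """CE as an error in [0, 1]; 0 means both biclusterings coincide."""
    # |U|: union size, counting a cell as often as it is covered
    # in whichever biclustering covers it more.
    union = np.maximum(_coverage(found, shape),
                       _coverage(reference, shape)).sum()
    if not found or not reference:
        return 0.0 if union == 0 else 1.0
    # Confusion matrix: number of cells shared by each pair of biclusters.
    shared = np.array([[len(set(r1) & set(r2)) * len(set(c1) & set(c2))
                        for r2, c2 in reference]
                       for r1, c1 in found])
    # d_max: total overlap under the best one-to-one matching.
    i, j = linear_sum_assignment(shared, maximize=True)
    d_max = shared[i, j].sum()
    return float(union - d_max) / float(union)

# Toy example on a 4 x 4 matrix: (rows, cols) per bicluster.
reference = [([0, 1], [0, 1]), ([2, 3], [2, 3])]
partial = [([0, 1], [0, 1]), ([2, 3], [2])]      # second bicluster lost a column
disjoint = [([0, 1], [2, 3])]                    # no cell in common

ce_same = clustering_error(reference, reference, (4, 4))
ce_partial = clustering_error(partial, reference, (4, 4))
ce_disjoint = clustering_error(disjoint, reference, (4, 4))
print(ce_same, ce_partial, ce_disjoint)
```

In this setting, `found` would hold the biclusters obtained on encrypted data and `reference` those of the non-encrypted run, so a CE close to 0 indicates that encryption did not change the result.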

License: MIT License