frapierri / WPCC

Weighted Pearson Correlation Coefficient Calculator for Python

WPCC

INSTALLATION

Download and extract the WPCC-0.1a.tar.gz archive wherever convenient.

From the WPCC-0.1a folder, execute the following command:

$ python3 setup.py install

This installs the module into your Python installation's directory for third-party modules (site-packages).

DESCRIPTION

WPCC contains a single function, wpearson, which calculates a weighted Pearson correlation coefficient (PCC) for two vectors. It was written primarily to find gene co-expression patterns from a list of genes with accompanying vectors of normalized expression data. The weights are based on the redundancy of the samples from which the expression values were taken.

For more information on weighted PCC calculation and gene co-expression, please see the following journal article:

Rank of Correlation Coefficient...Obayashi and Kinoshita, 2009 (NCBI)

OPTIONS

To calculate a weighted PCC, pass the wpearson function three equal-length vectors. The first two are the vectors to be correlated; the third contains the weights. Both the argument order and the element order within each vector matter.

>>> import wpcc

>>> dal = [8,22,88,84,21]
>>> san = [8,32,80,82,35]
>>> wts = [0.9,0.7,0.5,0.3,0.2]

>>> wpcc.wpearson(dal,san,wts)
0.9819

An optional fourth argument sets the number of decimal places to which the result is rounded. The default is 4.

>>> wpcc.wpearson(dal,san,wts,8)
0.98193095
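
For readers who want to see what the calculation involves, here is a minimal pure-Python sketch of the standard weighted Pearson formula. It is not taken from the wpcc source: the name weighted_pearson and its ndigits parameter are hypothetical, and it is an assumption that wpearson uses this same formula, although the sketch does reproduce the 0.9819 shown above for the example vectors.

```python
import math

def weighted_pearson(x, y, w, ndigits=4):
    """Standard weighted Pearson correlation (illustrative, not the wpcc source)."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    total = sum(w)
    # Weighted mean of each vector
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted covariance and variances; the shared 1/total factor cancels in the ratio
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return round(sxy / math.sqrt(sxx * syy), ndigits)

dal = [8, 22, 88, 84, 21]
san = [8, 32, 80, 82, 35]
wts = [0.9, 0.7, 0.5, 0.3, 0.2]
print(weighted_pearson(dal, san, wts))
```

The sketch raises ZeroDivisionError if either vector is constant under the given weights, since the correlation is undefined in that case.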

INPUT

Each of the three vectors must contain only numeric values, and elements at the same position must correspond to one another. In the example above, the weight 0.5 applies to the values 88 and 80 in the dal and san vectors, respectively.
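
The input requirements can be checked before calling wpearson. The helper below is a hypothetical sketch, not part of wpcc; the name check_vectors and its error messages are assumptions made for illustration.

```python
from numbers import Real

def check_vectors(x, y, w):
    """Hypothetical pre-flight check for wpearson inputs (not part of wpcc)."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("all three vectors must have the same length")
    for name, vec in (("x", x), ("y", y), ("w", w)):
        for i, value in enumerate(vec):
            # bool is a subclass of int, so it is rejected explicitly
            if isinstance(value, bool) or not isinstance(value, Real):
                raise TypeError(f"{name}[{i}] is not a number: {value!r}")
    return True

dal = [8, 22, 88, 84, 21]
san = [8, 32, 80, 82, 35]
wts = [0.9, 0.7, 0.5, 0.3, 0.2]
check_vectors(dal, san, wts)  # passes silently for the example vectors
```

A check like this cannot tell whether the elements are correctly paired by position; keeping the three vectors aligned remains the caller's responsibility.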

Assigning an equal-length vector of identical weights (e.g. [1,1,1,1,1]) allows you to calculate an unweighted PCC.
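
The effect of identical weights can be illustrated without wpcc itself. The sketch below assumes the standard weighted Pearson formula (whether wpearson implements exactly this is an assumption) and compares it with the ordinary PCC; both function names are hypothetical.

```python
import math

def weighted_pearson(x, y, w):
    # Standard weighted Pearson formula (illustrative, not the wpcc source)
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return sxy / math.sqrt(sxx * syy)

def pearson(x, y):
    # Ordinary (unweighted) Pearson correlation
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

dal = [8, 22, 88, 84, 21]
san = [8, 32, 80, 82, 35]

plain = pearson(dal, san)
ones = weighted_pearson(dal, san, [1, 1, 1, 1, 1])
halves = weighted_pearson(dal, san, [0.5] * 5)

# Compare each equal-weight result with the unweighted PCC
print(math.isclose(plain, ones), math.isclose(plain, halves))
```

Under this formula only the relative size of the weights matters, so any constant weight vector, not just all ones, gives the unweighted result.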

License: MIT License

