rsnemmen / partial-correlation

A test for partial correlation with censored astronomical data

PARTIAL CORRELATION COEFFICIENT AND SIGNIFICANCE FOR CENSORED DATA

The code is based on the methodology presented in 'A test for partial correlation with censored astronomical data', Akritas & Siebert, MNRAS, 278, 919 (1996).

What is this for?

Suppose you have measurements of two variables, X and Y, that correlate well with each other. However, both also correlate with a third variable, Z, which you have measured as well. How can you be sure that the correlation you see between X and Y is not actually driven by Z?

One important astronomical example is when you are studying the correlation between luminosities at different bands—say X-rays and radio—for a sample of sources. The "hidden variable" Z in this case is the luminosity distance dL, which you used to convert from fluxes to luminosities.

This statistical test computes the p-value, Pnull, for the null hypothesis of no correlation between X and Y once the effect of Z is taken into account. If Pnull is high, the null hypothesis cannot be rejected, and your X-Y correlation is likely caused by both variables depending on Z.
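To make the idea concrete, here is a minimal sketch of a partial rank correlation for data with no upper limits, using the standard partial Kendall's tau formula. It is not the algorithm in cens_tau.f: it ignores censoring entirely and does not compute Pnull. The function names and the mock data are hypothetical and only meant for illustration.

```python
import math
import random
from itertools import combinations


def kendall_tau(a, b):
    """Kendall's tau-a for two equal-length sequences (no censoring; tied pairs count as zero)."""
    score = 0
    for i, j in combinations(range(len(a)), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            score += 1  # concordant pair
        elif product < 0:
            score -= 1  # discordant pair
    n_pairs = len(a) * (len(a) - 1) // 2
    return score / n_pairs


def partial_kendall_tau(x, y, z):
    """Partial Kendall's tau between x and y, controlling for z (uncensored data only)."""
    t_xy = kendall_tau(x, y)
    t_xz = kendall_tau(x, z)
    t_yz = kendall_tau(y, z)
    return (t_xy - t_xz * t_yz) / math.sqrt((1 - t_xz**2) * (1 - t_yz**2))


# Hypothetical mock data: X and Y each depend on Z, but not directly on each other.
rng = random.Random(42)
z = [rng.uniform(0, 10) for _ in range(200)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

print("tau(X,Y)   =", round(kendall_tau(x, y), 3))
print("tau(X,Y|Z) =", round(partial_kendall_tau(x, y, z), 3))
```

For data like these, the partial coefficient should come out noticeably smaller than the raw one, which is the signature of an X-Y correlation that is largely driven by Z.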

Installation

Make sure you have a Fortran (sorry) compiler such as gfortran or pgfortran. This code was originally written in 1995, so be understanding.

Compile it with the command

gfortran -O cens_tau.f -o cens_tau

or by running

./make.sh

Usage

1. Put your data in an ASCII file with the following structure (do not include the header line with the column names in the file):

col1  col2   col3  col4   col5  col6
X     censX  Y     censY  Z     censZ
  • X: independent variable
  • Y: dependent variable
  • Z: test variable
  • censX, censY, censZ: integer which is 1 if X/Y/Z is a detection or 0 if it is an upper limit

The following Python snippet can be useful. Suppose you have each variable stored in a numpy array. To create an ASCII file with the appropriate structure to be processed by cens_tau, use the following commands:

import numpy

# "censored tag" array if all your data points are detections
# (build censY and censZ the same way; use 0 for upper limits)
censX = numpy.ones_like(X, dtype=int)

numpy.savetxt(fileout, numpy.transpose((X, censX, Y, censY, Z, censZ)), fmt='%10.4f %i %10.4f %i %10.4f %i')
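Before running the test, it can help to check that the file really has the six-column layout described above. The following is a minimal sketch using only the standard library; `check_cens_tau_input` is a hypothetical helper, not part of this repository, and the choice to skip blank lines is an assumption of the sketch rather than documented behavior of cens_tau.

```python
def check_cens_tau_input(path):
    """Return the number of data rows; raise ValueError on the first malformed line."""
    n_rows = 0
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # assumption: blank lines are ignored by this checker
            if len(fields) != 6:
                raise ValueError(
                    f"line {line_number}: expected 6 columns, got {len(fields)}"
                )
            # Columns 1, 3, 5 hold X, Y, Z; columns 2, 4, 6 hold the censoring flags.
            for value in fields[0::2]:
                float(value)  # raises ValueError if the entry is not numeric
            for flag in fields[1::2]:
                if flag not in ("0", "1"):
                    raise ValueError(
                        f"line {line_number}: flag must be 0 or 1, got {flag!r}"
                    )
            n_rows += 1
    return n_rows
```

For example, `check_cens_tau_input("mydata.dat")` (with `mydata.dat` standing in for your own file) returns the number of data rows if every line passes.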

2. Run the test

./cens_tau

If you want to test this code with artificial data, first run gendata.py, which will generate a mock dataset in the file test01.dat where both X and Y are correlated with Z.
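If you would rather build a mock dataset by hand, the sketch below shows one way to do it. This is not the repository's gendata.py: the function names, slopes, noise level, and output filename are all hypothetical. It assumes every point is a detection, so all flags are 1, and it writes the six-column layout described under Usage.

```python
import random


def make_mock_rows(n=100, seed=0):
    """Build rows (X, censX, Y, censY, Z, censZ) where X and Y both depend on Z."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.uniform(1.0, 10.0)
        x = 2.0 * z + rng.gauss(0.0, 1.0)  # X follows Z plus Gaussian scatter
        y = 3.0 * z + rng.gauss(0.0, 1.0)  # Y follows Z plus Gaussian scatter
        rows.append((x, 1, y, 1, z, 1))    # flag 1 = detection for every value
    return rows


def write_rows(path, rows):
    """Write the rows as six whitespace-separated columns, one row per line."""
    with open(path, "w") as handle:
        for x, cens_x, y, cens_y, z, cens_z in rows:
            handle.write(
                f"{x:10.4f} {cens_x:d} {y:10.4f} {cens_y:d} {z:10.4f} {cens_z:d}\n"
            )


if __name__ == "__main__":
    # Hypothetical output name, chosen so it does not overwrite test01.dat.
    write_rows("mock_sketch.dat", make_mock_rows())
```

Because X and Y have no direct link here, this kind of dataset is a case where the X-Y correlation comes entirely from the shared dependence on Z.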

Citation

If you use this code in your work and it gets published, you are morally obliged to cite the original paper: 'A test for partial correlation with censored astronomical data', Akritas & Siebert, MNRAS, 278, 919 (1996).

I also ask you to cite Nemmen, R. et al. Science, 2012, 338, 1445 (bibtex citation info) as one example of an application of this test. I spent some time improving this code, so I would appreciate your citation of my paper as a token of gratitude. Thanks! 🙂

About

License: MIT License

