Marius1311 / Adnet

Repository for the analysis of the paper "Influence of node abundance on signaling network state and dynamics analyzed by mass cytometry"

Home Page: http://www.bodenmillerlab.org

adnet/bpR2 analysis

This repository contains the analysis scripts used for the paper "Influence of node abundance on signaling network state and dynamics analyzed by mass cytometry" by Xiao-Kang Lun, Vito RT Zanotelli, James D Wade, Denis Schapiro, Marco Tognetti, Nadine Dobberstein & Bernd Bodenmiller.

Installation:

The package can be installed with the Python package manager pip. Using a virtual environment or an environment manager such as Anaconda is highly recommended. The code is compatible with Python 2.7. Installation was tested mainly on Ubuntu 14.04, but was also found to work on OSX and Windows 7.

  1. Clone the GitHub repository: git clone https://github.com/BodenmillerGroup/Adnet.git

  2. Install it with pip: pip install -e ./Adnet

The dependencies ('numpy', 'scipy', 'pandas', 'matplotlib', 'seaborn', 'configparser2', 'nose', 'argparse') should be installed automatically.
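If the automatic installation is in doubt, a quick import check can confirm which dependencies are available. The following is a minimal sketch, not part of the package: the helper name missing_modules is hypothetical, and it assumes that each import name equals the pip package name. 'configparser2' is left out of the list because its import name is not stated here.

```python
import importlib

# Import names assumed to match the pip package names listed above.
# 'configparser2' is omitted: its import name is not stated in this README.
EXPECTED_MODULES = ["numpy", "scipy", "pandas", "matplotlib",
                    "seaborn", "nose", "argparse"]


def missing_modules(module_names):
    """Return the names in module_names that raise ImportError on import."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_modules(EXPECTED_MODULES) inside the environment used for the analysis returns an empty list when every listed module can be imported.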

Analysis workflow:

The analysis consists of a main workflow, which is configured using a '.ini' configuration file and expects the data to be organized in a specific folder structure. IPython notebooks are used for the downstream analysis and further visualization of the results.
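Before a run, it can help to see what a given '.ini' file actually contains. The sketch below uses only the standard library and makes no assumption about which sections or options the analysis expects; the function name summarize_config is hypothetical and not part of the package.

```python
try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def summarize_config(path):
    """Return {section: {option: raw string value}} for an '.ini' file."""
    # RawConfigParser does no '%' interpolation, so values are returned as written.
    parser = configparser.RawConfigParser()
    if not parser.read(path):
        raise IOError("Could not read config file: %s" % path)
    # Note: option names are lower-cased by the parser's default settings.
    return {section: dict(parser.items(section))
            for section in parser.sections()}
```

Printing the result for a copy of one of the example configuration files gives a quick overview of the settings that will be used.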

File structure requirements:

The overall folder structure should be as follows:

  • Experiment: a folder corresponding to one CyTOF experiment, which can contain multiple barcoding plates.

    • AcquisitionA: a folder corresponding to a CyTOF acquisition barcoding plate. Contains:

      • AcquisitionA.csv: A csv file with the metadata, see example for exact structure.

      • gated: a folder containing the data

        • xy_rowcolumn_xy.fcs: fcs files whose names consist of '_'-separated fields, including the rowcolumn information.
    • AcquisitionB: identical structure

  • name_dict.csv: a comma-separated file with 2 columns, old and new:

    • old: column name as used in the .fcs file

    • new: renamed column name to be used in the analysis plots. Non-ASCII characters can cause problems.

  • config.ini: a configuration file with all the parameters used for the analysis. Please look at the specifications in the example file 'example/config_documentation.ini'
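The layout described above can be checked before starting a run. The following is a rough sketch under stated assumptions, not code from the package: the function names are hypothetical, the metadata file is assumed to be named after its acquisition folder (as in AcquisitionA/AcquisitionA.csv), and name_dict.csv is assumed to start with a header row "old,new".

```python
import csv
import os


def check_acquisition_folder(acq_dir):
    """Return a list of problems found in one acquisition folder."""
    problems = []
    name = os.path.basename(os.path.normpath(acq_dir))
    # Assumption: the metadata csv carries the same name as the folder.
    meta_csv = os.path.join(acq_dir, name + ".csv")
    if not os.path.isfile(meta_csv):
        problems.append("missing metadata file: " + meta_csv)
    gated = os.path.join(acq_dir, "gated")
    if not os.path.isdir(gated):
        problems.append("missing folder: " + gated)
    elif not any(f.lower().endswith(".fcs") for f in os.listdir(gated)):
        problems.append("no .fcs files in: " + gated)
    return problems


def load_name_dict(path):
    """Read name_dict.csv into a {old: new} dict (header row assumed)."""
    with open(path) as handle:
        return {row["old"]: row["new"] for row in csv.DictReader(handle)}
```

An empty list from check_acquisition_folder means the folder matches the structure sketched here; it does not validate the contents of the metadata file, for which the example in the repository remains the reference.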

Run analysis

After installation of the package (above) the analysis can be run as follows:

python -m adnet.adnet_analysis /pathto/config.ini

Output

Depending on the settings of the config.ini file, the following output will be generated:

  • outfolder: folder defined in the config.ini
    • config.ini: a copy of the config.ini used for the analysis

    • bin_dat: a pandas pickle file containing the summary statistics generated by the analysis. Can be loaded as pandas.read_pickle("bin_dat")

    • complete_dat: a pandas pickle file containing the single cell data used by the analysis. Can be loaded as pandas.read_pickle("complete_dat")

    • Cutoff.pdf: the histogram showing the cutoff chosen in relation to the negative controls

    • Plots as png, pdf: which plots are generated depends strongly on the configuration specified in the file.
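Since the output depends on the configuration, a small inventory of the output folder can show what a run produced. This is a minimal sketch with a hypothetical function name; it assumes that the files listed above sit directly in the output folder and only looks at the top level, so plots written to subfolders (if any) would not be found.

```python
import os

# File names taken from the output list above; plot names vary with the config.
NAMED_OUTPUTS = ("config.ini", "bin_dat", "complete_dat", "Cutoff.pdf")


def inventory_outputs(outfolder):
    """Return (present, missing, plots) for an analysis output folder."""
    entries = sorted(os.listdir(outfolder))
    present = [n for n in NAMED_OUTPUTS if n in entries]
    missing = [n for n in NAMED_OUTPUTS if n not in entries]
    # Any other png/pdf file at the top level is treated as a plot.
    plots = [n for n in entries
             if n.lower().endswith((".png", ".pdf"))
             and n not in NAMED_OUTPUTS]
    return present, missing, plots
```

Files reported as present under the names bin_dat and complete_dat can then be loaded with pandas.read_pickle as described above.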

Examples

In the root directory is an 'example' folder which contains configuration files for the 3 analyses from the paper.

Please change the paths in the configuration file to match the current repository location, or make sure you are in the example folder.

Afterwards the analysis can be run as follows (assuming you are in the example folder, otherwise adapt the path):

python -m adnet.adnet_analysis ./config_xxx.ini
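To run several example configurations in a row, the command above can be wrapped in a short script. This is a sketch under assumptions: the function names are hypothetical, the configuration files are assumed to match the pattern config_*.ini, and config_documentation.ini is skipped because it is described above as a specification file rather than an analysis.

```python
import glob
import os
import subprocess
import sys


def build_commands(example_dir):
    """Return one 'python -m adnet.adnet_analysis <config>' command per config."""
    example_dir = os.path.abspath(example_dir)
    configs = sorted(glob.glob(os.path.join(example_dir, "config_*.ini")))
    # Skip the documentation file, which describes the options.
    configs = [c for c in configs
               if os.path.basename(c) != "config_documentation.ini"]
    return [[sys.executable, "-m", "adnet.adnet_analysis", c] for c in configs]


def run_all(example_dir):
    """Run every example analysis from inside the example folder."""
    for command in build_commands(example_dir):
        # check_call raises CalledProcessError if a run exits with an error.
        subprocess.check_call(command, cwd=os.path.abspath(example_dir))
```

Running from inside the example folder (the cwd argument) keeps any relative paths in the configuration files valid, in line with the note above.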

Main analysis

Runs the bpR2 analysis for the 20 overexpressions. For data storage reasons, only 2 replicates of 1 overexpression group are included in this repository. However, the other folders are already prepared, and the FCS files from the data repository simply need to be copied in. To activate the other folders, just uncomment the folder section.

Main analysis allvsall

Calculates bpR2 and correlation for all pairwise marker combinations. The generated data can be used for correlation heatmaps (see Notebooks). Not all data included.

Mutations analysis

Runs the bpR2 analysis for the mutation data. All data included.

Mutations analysis allvsall

Calculates bpR2 and correlation for all pairwise marker combinations. The generated data can be used for correlation heatmaps (see Notebooks). All data included.

Tag analysis

Runs the bpR2 analysis for the comparison of differentially tagged overexpressions, i.e. Flag-N, Flag-C, GFP-N, GFP-C. All data included.

Figure reproduction

The figures from the paper can be reproduced with the following code:

Figures

  • Fig 1: no actual data shown

  • Fig 2:

    • a) via cytobank
    • b) from example 'Mutations analysis', plots 'Trends_EGF_overexpression_marker'
    • c) via cytobank
    • d) via cytobank
    • e) & f) from notebooks/correlation_heatmaps.ipynb
    • g) & h) no code provided
  • Fig 3: no code provided

  • Fig 4:

    • a)-h) from example 'Main analysis', plots 'EGF_overexpression_...', 'Trends_EGF_overexpression_marker_BinsoverTP'
    • i) from notebooks/kinetic_analysis.ipynb
  • Table 2: from notebooks/SIGNOR_analysis.ipynb

Supplementary Figures

  • Sup. Figure 3:

    • e) from notebooks/tag_comparison.ipynb
  • Sup. Figure 9:

    • b)-e) from example 'Mutations analysis' plots
  • Sup. Figure 10:

    • c) from example 'Main analysis', plot 'cutoff'
  • Sup. Figure 11:

    • a) from notebooks/readout_comparison.ipynb
  • Sup. Figure 13:

    • a) & b) from notebooks/supplementary_fig13_heatmaps.ipynb
  • Sup. Figure 14: from notebooks/kinetic_analysis.ipynb

Supplementary Files

  • Supplementary File 1, 2, 4: from example 'Main analysis'
  • Supplementary File 3: from notebooks/correlation_heatmaps.ipynb

References

This repository uses code from the following projects:

  • Matplotlib: John D. Hunter. Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, 9, 90-95 (2007), DOI:10.1109/MCSE.2007.55

  • Seaborn: https://github.com/mwaskom/seaborn

  • Scipy: Jones E, Oliphant T, Peterson P, et al. SciPy: Open Source Scientific Tools for Python, 2001-, http://www.scipy.org/ [Online; accessed 2017-01-04].

  • Numpy: Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation, Computing in Science & Engineering, 13, 22-30 (2011), DOI:10.1109/MCSE.2011.37

  • Pandas: Wes McKinney. Data Structures for Statistical Computing in Python, Proceedings of the 9th Python in Science Conference, 51-56 (2010)

  • Ipython: Fernando Pérez and Brian E. Granger. IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 9, 21-29 (2007), DOI:10.1109/MCSE.2007.53

  • Fcm: https://pythonhosted.org/fcm/


License: MIT License

