Scripts to build heatmap figures from copy number variation (CNV) data in ASCAT files from the Genomic Data Commons (GDC).
This exercise had the following objectives:
- To explore the ASCAT file format.
- To practice using Snakemake for workflow implementation.
- To gain insight into the variability of CNV data across different cancer types, for future analysis.
The pipeline uses Snakemake to run the following steps reproducibly for ovary, prostate, pancreas, bladder, skin, brain, testis, liver, esophagus, breast, lung, kidney, colorectal, uterus, and thyroid tissues:
- Query the GDC to get manifest files for ASCAT and RNA-Seq data for normal and tumor conditions. File:
src/queryGDC.py
- Download the data listed in the manifest files using the GDC Data Transfer Tool, included in this repository at
bin/gdc-client
- Build the ASCAT matrix (genes in rows, samples in columns)
- Generate the CNV heatmap figure (blue for copy numbers 0-1, white for 2, red for values above 2)
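
The last two steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used in this repository: the function names `build_matrix` and `colour_for`, the sample IDs, and the grey colour for genes missing from a sample are all assumptions made for the example. It takes per-sample gene-level copy numbers, arranges them with genes in rows and samples in columns, and assigns each cell the colour class described above.

```python
def build_matrix(per_sample):
    """Arrange per-sample calls as a gene x sample matrix.

    per_sample maps a sample ID to a dict of gene -> integer copy number.
    Returns (genes, samples, rows), where rows[i][j] is the copy number of
    genes[i] in samples[j], or None if that gene is absent from the sample.
    """
    samples = sorted(per_sample)
    genes = sorted({gene for calls in per_sample.values() for gene in calls})
    rows = [[per_sample[s].get(g) for s in samples] for g in genes]
    return genes, samples, rows


def colour_for(copy_number):
    """Map a copy number to the heatmap colour class."""
    if copy_number is None:
        return "grey"   # assumption: missing values get their own colour
    if copy_number < 2:
        return "blue"   # 0-1 copies (loss)
    if copy_number == 2:
        return "white"  # 2 copies (diploid)
    return "red"        # more than 2 copies (gain)


# Hypothetical input: two samples with a handful of genes each.
per_sample = {
    "sample_A": {"TP53": 1, "MYC": 4},
    "sample_B": {"TP53": 2, "MYC": 2, "PTEN": 0},
}

genes, samples, rows = build_matrix(per_sample)
colours = [[colour_for(value) for value in row] for row in rows]

for gene, row in zip(genes, colours):
    print(gene, row)
```

In a real run the colour classes would be passed to a plotting library as a discrete colour map; the nested lists here only make the mapping explicit.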