shishenyxx / MCD_mosaic

Malformations of cortical development (MCD) are neurological conditions displaying focal disruption of cortical architecture and cellular organization that occurs during embryogenesis. This repository contains the data-processing pipelines and the code for data analysis and plotting used in the large-scale MCD analysis.

Home Page: https://www.nature.com/articles/s41588-022-01276-9


MCD_mosaic

Malformations of cortical development (MCD) are neurological conditions displaying focal disruption of cortical architecture and cellular organization that occurs during embryogenesis. This repository contains the data-processing pipelines and the code for data analysis and plotting used in the large-scale MCD analysis. Data for this project are available on the NIMH Data Archive (NDA) under study accession 1484, and on the NCBI Sequence Read Archive (SRA) under accession number PRJNA821916. Raw single-cell RNA-seq data are provided on the Single Cell Portal.

[Image: Nature Genetics cover]


1. Pipelines for processing WES data

WES data were processed with the BSMN common pipeline; the WES part of that pipeline is also described in the BSMN common experiment paper.


2. Pipelines for processing MPAS data

Alignment and pre-processing of the MPAS data were derived from the published MPAS pipeline.

The WES and MPAS pipelines were further implemented as a generalized Snakemake pipeline.


3. Mosaic variants calling

Processed WES and MPAS data were run through several pipelines, and the candidate mosaic variants from each were collected. Sample-specific variants were called in paired mode with the BSMN common pipeline. Sample-shared and single-mode variants were called with one of the following: GATK HaplotypeCaller with ploidy 50, according to the BSMN common pipeline (WES only); MosaicHunter; or MuTect2 in single mode followed by MosaicForecast or DeepMosaic.
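As a rough illustration of collecting candidates from several callers, the sketch below takes the union of calls keyed by (chromosome, position, ref, alt) and records which callers reported each variant. The function name `merge_candidates`, the input layout, and the toy calls are assumptions made for this example; the repository's actual collection logic is not shown here and may differ.

```python
from collections import defaultdict


def merge_candidates(calls_by_caller):
    """Union candidate variants across callers.

    calls_by_caller maps a caller name to an iterable of
    (chrom, pos, ref, alt) tuples. Returns a dict mapping each
    variant to the sorted list of callers that reported it.
    """
    support = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for chrom, pos, ref, alt in variants:
            support[(chrom, pos, ref, alt)].add(caller)
    # Sort variants and caller names so the output is deterministic.
    return {variant: sorted(callers) for variant, callers in sorted(support.items())}


# Toy, made-up calls (not real data from the study).
calls = {
    "MosaicHunter": [("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")],
    "MosaicForecast": [("chr1", 1000, "A", "G")],
    "DeepMosaic": [("chr3", 42, "G", "A")],
}

merged = merge_candidates(calls)
for variant, callers in merged.items():
    print(variant, callers)
```

Keeping the list of supporting callers next to each variant makes it easy to filter later, for example by requiring support from a particular caller.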

Variants that passed filtering were further annotated with a pipeline we previously described. Annotations including COSMIC89, gnomAD genome, avsnp150, CADD 1.3, Eigen, and FATHMM were added with the ANNOVAR command:

    ./table_annovar.pl input.avinput /humandb/ -buildver hg19 -out output_annotated -remove -protocol refGene,gnomad_genome,avsnp150,cosmic89,cadd13,eigen,fathmm -operation g,f,f,f,f,f,f -nastring .
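In the ANNOVAR command above, each entry in `-protocol` must line up with one entry in `-operation` (`g` for gene-based, `f` for filter-based). The sketch below builds the same argument list from explicit (protocol, operation) pairs so the two lists cannot drift apart. The helper name `build_annovar_command` is hypothetical and not part of ANNOVAR or this repository; the paths are the placeholder values from the command above.

```python
import shlex

# (protocol, operation) pairs copied from the command above.
# g = gene-based annotation, f = filter-based annotation.
ANNOTATIONS = [
    ("refGene", "g"),
    ("gnomad_genome", "f"),
    ("avsnp150", "f"),
    ("cosmic89", "f"),
    ("cadd13", "f"),
    ("eigen", "f"),
    ("fathmm", "f"),
]


def build_annovar_command(avinput, humandb, out_prefix, build="hg19"):
    """Return the table_annovar.pl invocation as an argument list."""
    protocols = ",".join(protocol for protocol, _ in ANNOTATIONS)
    operations = ",".join(operation for _, operation in ANNOTATIONS)
    return [
        "./table_annovar.pl", avinput, humandb,
        "-buildver", build,
        "-out", out_prefix,
        "-remove",
        "-protocol", protocols,
        "-operation", operations,
        "-nastring", ".",
    ]


cmd = build_annovar_command("input.avinput", "/humandb/", "output_annotated")
print(shlex.join(cmd))
```

The argument list can be passed to `subprocess.run(cmd, check=True)` on a machine where ANNOVAR and the listed databases are installed.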


4. Validation with TASeq

Code and strategies for TASeq are available on GitHub.


4. Plotting

Code and input files for the plots in this project.

Oncoplot

oncoplot.ipynb contains the code for the oncoplot presented in Fig. 1e.
oncoplot.maf contains the source data.
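An oncoplot is essentially a gene-by-sample matrix of variant classes. The sketch below shows one way to build that matrix from a MAF-style table. It assumes the standard MAF column names `Hugo_Symbol`, `Tumor_Sample_Barcode`, and `Variant_Classification`; whether oncoplot.maf uses exactly these columns is an assumption, and the gene and sample names are made-up placeholders, not data from the study.

```python
import csv
import io
from collections import defaultdict

# Toy, made-up MAF content (tab-separated); not taken from oncoplot.maf.
TOY_MAF = """\
Hugo_Symbol\tTumor_Sample_Barcode\tVariant_Classification
GENE_A\tS1\tMissense_Mutation
GENE_A\tS2\tNonsense_Mutation
GENE_B\tS1\tMissense_Mutation
"""


def maf_to_matrix(handle):
    """Build {gene: {sample: variant_class}} from a MAF-style file handle.

    Lines starting with '#' are skipped. If a gene has more than one
    variant in the same sample, the last one read wins.
    """
    matrix = defaultdict(dict)
    rows = csv.DictReader(
        (line for line in handle if not line.startswith("#")),
        delimiter="\t",
    )
    for row in rows:
        gene = row["Hugo_Symbol"]
        sample = row["Tumor_Sample_Barcode"]
        matrix[gene][sample] = row["Variant_Classification"]
    return {gene: dict(samples) for gene, samples in matrix.items()}


matrix = maf_to_matrix(io.StringIO(TOY_MAF))
for gene, samples in matrix.items():
    print(gene, samples)
```

To read a real file, replace the `io.StringIO` handle with `open("oncoplot.maf", newline="")`.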

Genotype-phenotype association

Genotype_phenotype_association.ipynb contains the R code for Fig. 4.

Supplementary Table 4 is the source data for the association analysis.

Single-nucleus RNA sequencing (snRNAseq) in the fetal cortex from Nowakowski et al., Science 2017

Fetal_cortex_snRNAseq.ipynb contains the R code used to generate Fig. 5.

snRNAseq in MCD brain tissues

MCD_snRNAseq.ipynb contains the R code used to generate Fig. 6, Extended Data Fig. 7, Extended Data Fig. 9, and Supplementary Table 5.


5. Statistical analysis

Code, scripts, and some intermediate files for the statistical analysis in this project.


6. Contact:

📧 Changuk Chung: chchung@health.ucsd.edu

📧 Xiaoxu Yang: xiy010@health.ucsd.edu, yangxiaoxu-shishen@hotmail.com

📧 Joseph Gleeson: jogleeson@health.ucsd.edu, or the Gleeson lab gleesonlab@health.ucsd.edu


7. Cite the data and codes:

Chung C & Yang X, et al., Gleeson JG. Comprehensive multi-omic profiling of somatic mutations in malformations of cortical development. 2022. (Nat. Genet., DOI: 10.1038/s41588-022-01276-9)


License: GNU General Public License v3.0


Languages

Jupyter Notebook 99.8%, Python 0.1%, Perl 0.1%, Shell 0.0%, R 0.0%