dpolychr / tad_cnes_harmston2017

Topologically associating domains are ancient features that coincide with Metazoan clusters of extreme noncoding conservation

Nathan Harmston*1,2,3, Elizabeth Ing-Simmons1,2,4, Ge Tan1,2, Malcolm Perry1,2, Matthias Merkenschlager2,4, Boris Lenhard*1,2,5

1 Computational Regulatory Genomics, MRC London Institute of Medical Sciences, London W12 0NN, UK.
2 Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London W12 0NN, UK.
3 Program in Cardiovascular and Metabolic Disease, Duke-NUS Graduate Medical School, 8 College Road, Singapore 169857, Singapore.
4 Lymphocyte Development, MRC London Institute of Medical Sciences, London W12 0NN, UK.
5 Sars International Centre for Marine Molecular Biology, University of Bergen, N-5008 Bergen, Norway.
* Correspondence should be addressed to NH (nathan.harmston@duke-nus.edu.sg) and BL (b.lenhard@imperial.ac.uk)

Abstract

Developmental genes in metazoan genomes are surrounded by dense clusters of conserved noncoding elements (CNEs). CNEs exhibit unexplained extreme levels of sequence conservation, with many acting as developmental long-range enhancers. Clusters of CNEs define the span of regulatory inputs for many important developmental regulators and have been described previously as genomic regulatory blocks (GRBs). Their function and distribution around important regulatory genes raise the question of how they relate to the 3D conformation of these loci. Here, we show that clusters of CNEs strongly coincide with topological organisation, predicting the boundaries of hundreds of topologically associating domains (TADs) in human and Drosophila. The set of TADs associated with high levels of noncoding conservation exhibits distinct properties compared to TADs devoid of extreme noncoding conservation. The close correspondence between extreme noncoding conservation and TADs suggests that these TADs are ancient, revealing a regulatory architecture conserved over hundreds of millions of years.

Scripts for reproduction of results and figures from manuscript

All figures and results in the manuscript can be reproduced from the R Markdown scripts (`.Rmd` files) in this repository. The table below lists each script and the figures it produces.

| Script | Main figures | Supplementary figures |
| --- | --- | --- |
| plot_grbs_figure1.Rmd | Figure 1 | Figure S1 |
| plot_grbs_figureS2.Rmd | | Figure S2 |
| calculate_grb_tad_overlaps_human.Rmd | Figure 2 | Figure S3 |
| calculate_grb_tad_overlaps_fly.Rmd | Figure 2 | Figure S3 |
| calculate_pvalue_distances.Rmd | | Figure S3 |
| grbs_h3k27ac.Rmd | | Figure S4 |
| ContactDomains.Rmd | | Figure S5 |
| plot_grbs_figure3.Rmd | Figure 3 | |
| plot_grbs_figureS6.Rmd | | Figure S6 |
| repeat_analysis.Rmd | Figure 4 | Figure S8 |
| chromatin_colour.Rmd | Figure 4 | Figure S8 |
| ctcf_analysis.Rmd | Figure 4 | Figure S9 |
| dev_analysis.Rmd | Figure 4 | Figure S9 |
| genome_comparison.Rmd | Figure 5 | |
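
The scripts listed above are R Markdown documents, so one way to run them is with `rmarkdown::render()`. The following is a minimal sketch, not part of the repository. It assumes the `rmarkdown` package is installed, that the working directory is the repository root, and that any input data and R packages the scripts load are already available. The helper name `render_available` is hypothetical, and the order of the scripts below simply follows the listing; the repository does not state whether some scripts depend on the outputs of others.

```r
# Script names taken from the listing in this README.
scripts <- c(
  "plot_grbs_figure1.Rmd",
  "plot_grbs_figureS2.Rmd",
  "calculate_grb_tad_overlaps_human.Rmd",
  "calculate_grb_tad_overlaps_fly.Rmd",
  "calculate_pvalue_distances.Rmd",
  "grbs_h3k27ac.Rmd",
  "ContactDomains.Rmd",
  "plot_grbs_figure3.Rmd",
  "plot_grbs_figureS6.Rmd",
  "repeat_analysis.Rmd",
  "chromatin_colour.Rmd",
  "ctcf_analysis.Rmd",
  "dev_analysis.Rmd",
  "genome_comparison.Rmd"
)

# Hypothetical helper (an assumption, not from the repository):
# render every listed script that exists under `dir` and return
# the paths that were rendered.
render_available <- function(scripts, dir = ".") {
  paths <- file.path(dir, scripts)
  present <- paths[file.exists(paths)]
  for (p in present) {
    message("Rendering ", p)
    rmarkdown::render(p)  # output is written next to the .Rmd by default
  }
  invisible(present)
}

# Render whatever is present in the current directory.
rendered <- render_available(scripts)
```

To reproduce a single figure, a direct call such as `rmarkdown::render("plot_grbs_figure3.Rmd")` does the same for one script.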

About

License: GNU Affero General Public License v3.0