shengxinzhuan / Populus_genomic_prediction_climate_vulnerability

Populus_genomic_prediction_climate_vulnerability

Scripts for Sang et al. (2022), "Genomic insights into local adaptation and future climate-induced vulnerability of a keystone forest tree in East Asia", Nature Communications (accepted).

1.Genome analyses

(1) TE annotation: transposable elements (TEs) were annotated with EDTA using the command ‘perl EDTA.pl --genome genome.fasta --sensitive 1 -anno 1’.

(2) 1hisat_samtools_run.sh - Script to align RNA-seq reads to the reference genome.

(3) 2Trinity_GG_denovo.sh - Script to assemble RNA-seq data with Trinity using both de novo and genome-guided methods.

(4) 3PASA_align_run.sh - Script to run PASA.

(5) 4ab_homo_pipe.sh - Script to perform ab initio and homology-based gene prediction. Augustus and TBLASTN were run through the GETA pipeline (https://github.com/chenlianfu/geta).

(6) 5evm_run.sh - Script to integrate ab initio, transcriptome-based and homology-based evidence into the final consensus gene set using EvidenceModeler.

(7) 6PASA_update.sh - Script to update alternatively spliced isoforms.

2.Variant-calling

Call_SNP

(1) 1trimmomatic.sh - Script to use Trimmomatic to filter raw reads.

(2) 2bwa&picard.sh - Script to use BWA to map reads to the reference genome, SAMtools to sort the alignments and Picard to mark PCR duplicates.

(3) 3GATK-HaplotypeCaller.sh, 4GATK-Combinegvcf.sh and 5GATK-GenotypeGVCFs.sh - Scripts to use GATK to call SNPs and indels.

(4) 6summary-coverage_ratio.sh, 6summary-depth.sh, 6summary-mapping_ratio.sh - Scripts to summarize coverage, depth and mapping-rate statistics of the whole-genome re-sequencing data.

(5) filter - Scripts to perform multiple filtering steps that retain only high-quality SNPs for downstream analysis.

(6) filter/12SNPfilter5-bedfile_create - Script to use SNPable to mask the genome.
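
As a rough illustration of how the three GATK scripts in step (3) might chain together, the sketch below runs HaplotypeCaller per sample in gVCF mode, merges the gVCFs and then genotypes them jointly. It assumes GATK4-style command syntax; the file names, sample IDs and the DRY_RUN switch are hypothetical and are not taken from the repository scripts.

```shell
#!/bin/sh
# Sketch of a per-sample gVCF workflow (GATK4-style syntax is an assumption).
set -eu

REF="reference.fasta"       # hypothetical reference path
SAMPLES="sample1 sample2"   # hypothetical sample IDs; expects <ID>.dedup.bam

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

call_snps() {
  variant_args=""
  # Step 1: one gVCF per sample.
  for s in $SAMPLES; do
    run gatk HaplotypeCaller -R "$REF" -I "${s}.dedup.bam" -O "${s}.g.vcf.gz" -ERC GVCF
    variant_args="$variant_args --variant ${s}.g.vcf.gz"
  done
  # Step 2: merge the per-sample gVCFs (word splitting of $variant_args is intended).
  run gatk CombineGVCFs -R "$REF" $variant_args -O cohort.g.vcf.gz
  # Step 3: joint genotyping of SNPs and indels.
  run gatk GenotypeGVCFs -R "$REF" -V cohort.g.vcf.gz -O cohort.raw.vcf.gz
}

call_snps
```

With the default DRY_RUN=1 the script only prints the four commands, which makes it easy to check sample lists before submitting real jobs.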

Call_Indel

(1) 1seperate-Indel.sh - Script to use VCFtools to separate indels from the raw VCF files.

(2) filter - Scripts to perform multiple filtering steps that retain only high-quality indels for downstream analysis.

Call_SV

(1) 1call_SV1.sh, 2call_SV2.sh, 3call_SV3.sh - Scripts to use Delly to call SVs.

(2) 4merge_filter.sh - Script to perform multiple filtering steps that retain only high-quality SVs for downstream analysis.

3.Population_genetics

(1) 1structure.sh - Script to perform structure analyses.

(2) 2NJ.sh - Script to perform phylogenetic analyses.

(3) 3pi-24pops.sh - Script to calculate genetic diversity of each population.

(4) 4pi&dxy_NS.sh - Script to calculate within-group genetic diversity (pi) and between-group divergence (dxy).

(5) 5Fst&TajimaD.sh - Script to calculate Fst and Tajima’s D of each group.

(6) 6IBD&IBE.sh - Script to perform IBD, IBE, pIBD and pIBE analyses.

(7) 7PSMC - Script to perform PSMC analyses.

(8) 8LD.sh - Script to estimate and compare patterns of LD among different groups.

4.Local adaptation

(1) 1LFMM_onebio.sh - Script to perform LFMM analyses.

(2) 2manhattan.sh - Script to convert p-values to q-values and make Manhattan plots.

(3) 3cor_plot.R - Script to calculate the correlation between environmental variables.

(4) 4RDA.R - Script to perform RDA analyses.

(5) 5pk_beagle_eff.sh - Script to annotate the VCF file.

(6) 6IHS.sh - Script to calculate iHS values.

(7) 7supp_picture_plot.R - Script to make Supplementary Fig. 12.
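
To make the p-value to q-value conversion in step (2) concrete, here is a minimal sketch that uses the Benjamini-Hochberg adjustment. This is an assumption for illustration: the repository script may use a different q-value estimator. The function name bh_adjust is hypothetical, and `sort -g` (general numeric sort, needed for scientific notation) is assumed to be available, as it is in GNU and BSD sort.

```shell
#!/bin/sh
# Sketch: Benjamini-Hochberg adjusted p-values (one common q-value definition).
# stdin:  one p-value per line
# stdout: "p<TAB>q", sorted by ascending p
bh_adjust() {
  LC_ALL=C sort -g | LC_ALL=C awk '
    { p[NR] = $1 }
    END {
      n = NR
      min = 1
      # Walk from the largest p-value down, enforcing monotone q-values.
      for (i = n; i >= 1; i--) {
        q = p[i] * n / i
        if (q < min) min = q
        adj[i] = min
      }
      for (i = 1; i <= n; i++) printf "%s\t%.6g\n", p[i], adj[i]
    }'
}
```

A call such as `cut -f2 pvalues.txt | bh_adjust` (with a hypothetical pvalues.txt) would then yield the values to threshold before plotting.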

5.Call_ATAC_peak

(1) 1.atac_trim_reads.sh - Script to use Trimmomatic to filter raw reads.

(2) 2.atac_index.sh - Script to use Bowtie2 to build the reference index.

(3) 3.atac_bowtie_rmdup.sh - Script to use Bowtie2 to map reads to the reference genome, SAMtools to sort the alignments and Picard to remove PCR duplicates.

(4) 4.atac_insertsize.sh - Script to summarize the insert size distribution of the ATAC-seq data.

(5) 5.macs_call_peak.sh - Script to use MACS2 to call ATAC-seq peaks and obtain the peak locations.
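
For step (5), peak locations can be read directly from the narrowPeak file that MACS2 writes. The sketch below is illustrative only: the function name and file names are hypothetical, and the commented macs2 call shows one plausible invocation, not the parameters used in the repository script. It relies on the narrowPeak layout, in which columns 1-3 give the interval and column 10 gives the summit offset from the start.

```shell
#!/bin/sh
# Hypothetical peak-calling command (parameters are assumptions):
#   macs2 callpeak -t sample.rmdup.bam -f BAMPE -n sample --outdir peaks

# Print chrom, start, end and absolute summit position for each peak.
# $1: path to a MACS2 narrowPeak file (tab-separated, 10 columns)
peak_locations() {
  awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3, $2 + $10 }' "$1"
}
```

For example, `peak_locations peaks/sample_peaks.narrowPeak > sample_peak_locations.tsv` would write a four-column table of peak coordinates.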

6.RONA

(1) 1RONA_calculate.R - Script to calculate RONA of each climate model.

(2) 2aveRONA_plot.R - Script to make plots using ave_RONA.

(3) 3MODELS_cor.R - Script to calculate the correlation among RONA values calculated under the four climate models.

(4) 4RONA_ave_weightedSE.R - Script to calculate the SE across the four climate models.

7.Genetic offset

(1) 1GF.R - Script to run the gradient forest (GF) analyses (ranking of environmental variables, PC plot and offset calculation).

(2) 2 genetic_offset-plot.R - Script to make genetic offset plots (using 19 climate variables as an example).

(3) 3migrant_offset.R - Script to assess the metrics of forward, reverse and local genetic offset.

(4) 4future_forward_dist.R - Script to estimate the forward genetic offset under the assumption that populations differ in migration capacity (100 km, 250 km, 500 km, 1000 km or unlimited).

(5) 5offset_cor.R - Script to calculate the correlation among the genetic offsets calculated under the four climate models.
