GenomeInfo.R
is sourced by the R scripts in each directory, provides consistent name variables, and contains the full paths to genome fastas and TE annotations (in most cases; some paths are hard-coded in the directories below)
age_model/
predicts age using a random forest model
base_composition/
finds methylatable sequence in TE and flanking regions
diversity/
finds number of segregating sites within each TE copy, and flanking sequence
figures/
has code for generating manuscript figures
genes/
finds closest genes, and their expression across developmental atlases
methylation/
finds DNA and histone methylation of TEs and flanking regions
mnase/
finds overlap of TEs and flank to MNase hypersensitive regions
recombination/
identifies the recombinational environment the TE is found in
subgenomes/
assigns each TE to a subgenome
te_age/
determines the age of each TE copy
te_characteristics/
identifies features specific to the TE copy encoded within the gff, like TE length
te_expression/
calculates the per-copy TE expression level across developmental atlases
te_genes/
identifies TE genes within each TE model