james-guevara/AnnotationDatabase

Annotation Database

USAGE: Build annotations for the human genome or epigenome

Timeline:

2020/02/26: NHLBI Exome Sequencing Project (ESP), Exome Variant Server: https://evs.gs.washington.edu/EVS/
2020/01/12: updated the TCGA-pan-meta differential and survival analysis; results are not accurate since there is no adjustment for confounders
don't forget to add the ENSG ID to the output files, since one SNP can have multiple records with different genes
only the SNP and P-value columns were kept from the 49 eQTL v8.signif_variant_gene_pairs.txt files, so that the files are <25M
most of the time is spent reading dbSNP153.bin.chain.txt into memory; looping over the eQTL files is very fast
dbSNP153.bin.chain.txt is 21G on disk, takes up 105G of memory on CHG1, and takes 25 minutes to load with the add-rsid Perl script
2020/01/01: added rsIDs to the GTEx v8.signif_variant_gene_pairs.txt files to match with m6A-Var, script and result
2020/01/01: GWAS Catalog updated from hg19 to hg38.
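
The rsID step described in the notes above (load a dbSNP lookup into memory once, then stream the eQTL files against it) can be sketched roughly as follows. This is a minimal illustration in Python, not the repository's Perl script. The dbSNP input layout (tab-separated chrom, pos, rsid) is an assumption, since the format of dbSNP153.bin.chain.txt is not described here. The column names `variant_id`, `gene_id`, and `pval_nominal` are likewise assumed for the eQTL file, and the rsIDs and ENSG IDs in the demo data are made up. Matching is by chromosome and position only, ignoring alleles.

```python
import csv
import io


def load_rsid_lookup(handle):
    """Build a {(chrom, pos): rsid} dict from tab-separated chrom, pos, rsid lines.

    The three-column layout is an assumption, not the real dbSNP153 file format.
    """
    lookup = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        chrom, pos, rsid = line.split("\t")[:3]
        lookup[(chrom, int(pos))] = rsid
    return lookup


def annotate_eqtl(handle, lookup):
    """Yield (rsid, variant_id, gene_id, pval) for each eQTL row with a known rsID.

    The gene ID is kept in every output row because one SNP can be paired
    with several genes.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # variant_id is assumed to look like chr1_1000_A_G_b38
        chrom, pos = row["variant_id"].split("_")[:2]
        rsid = lookup.get((chrom, int(pos)))
        if rsid is None:
            continue  # no rsID at this position; skip the row
        yield rsid, row["variant_id"], row["gene_id"], row["pval_nominal"]


# Tiny in-memory demo with made-up identifiers.
demo_dbsnp = "chr1\t1000\trs111\nchr1\t2000\trs222\n"
demo_eqtl = (
    "variant_id\tgene_id\tpval_nominal\n"
    "chr1_1000_A_G_b38\tENSG00000000001.1\t1e-8\n"
    "chr1_1000_A_G_b38\tENSG00000000002.1\t2e-6\n"
    "chr1_3000_C_T_b38\tENSG00000000003.1\t5e-5\n"
)

lookup = load_rsid_lookup(io.StringIO(demo_dbsnp))
rows = list(annotate_eqtl(io.StringIO(demo_eqtl), lookup))
for row in rows:
    print("\t".join(row))
```

In the demo, the same SNP appears twice with different genes, which is why the gene ID is carried through to the output. For the real data, the lookup is built once and reused across all eQTL files, since loading it dominates the run time.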

Basic information should be included:

1, Genome assembly version: hg19, hg38, mm9, mm10

2, bedGraph format is preferred

3, PBMC or GEO ID should be recorded.

4, Basic annotation information and method

5, Basic study design, sample size and populations (Chinese, European or US)
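
As a rough illustration of the checklist above, the sketch below checks a metadata record and parses one bedGraph data line. The metadata field names (`assembly`, `source_id`, `method`, `study_design`, `sample_size`, `population`) are hypothetical and not defined anywhere in this repository; only the assembly names come from item 1. The bedGraph layout (chrom, start, end, value, with 0-based half-open coordinates) follows the standard UCSC definition.

```python
ALLOWED_ASSEMBLIES = {"hg19", "hg38", "mm9", "mm10"}

# Hypothetical field names, chosen only to mirror the checklist items.
REQUIRED_FIELDS = (
    "assembly",
    "source_id",
    "method",
    "study_design",
    "sample_size",
    "population",
)


def check_metadata(meta):
    """Return a list of problems in a metadata dict (an empty list means OK)."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not meta.get(name)
    ]
    assembly = meta.get("assembly")
    if assembly and assembly not in ALLOWED_ASSEMBLIES:
        problems.append(f"unknown assembly: {assembly}")
    return problems


def parse_bedgraph_line(line):
    """Parse one bedGraph data line into (chrom, start, end, value)."""
    chrom, start, end, value = line.split()
    start, end = int(start), int(end)
    if start < 0 or end <= start:
        raise ValueError(f"invalid interval: {chrom}:{start}-{end}")
    return chrom, start, end, float(value)


# Placeholder values for illustration only.
example_meta = {
    "assembly": "hg38",
    "source_id": "GEO_ID_PLACEHOLDER",
    "method": "ChIP-seq peak calling",
    "study_design": "case-control",
    "sample_size": 100,
    "population": "European",
}

print(check_metadata(example_meta))
print(check_metadata({"assembly": "hg18"}))
print(parse_bedgraph_line("chr1\t0\t100\t0.5"))
```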

annotation database for human genetics, genomics and epigenomics
