This Repo contains my own 'study notes' as I learn genomic-scale cloud bioinformatics. It includes descriptions of common tools and platforms, as well as summaries of my work with clients. I update this Repo frequently. It is organized via the folder structure shown below.
- Concepts and Terms (genomic file types, use cases, terminology and whitepapers)
- Lab Testing (Illumina and more)
- Genomic Tools (GATK, VariantSpark, HAIL and many more - this section updates OFTEN)
- Genomics Platforms (Terra.bio, Galaxy Project, IDSeq and others)
- Public Cloud Genomics (Alibaba Cloud, AWS, Azure or GCP)

The general approach is to implement a cloud-native Data Lake pattern for scalable genomic analysis. A conceptual rendering of this pattern is shown below.
In addition to this Repo, I have a number of other Repos with cloud bioinformatics information. Also, I've included two of my favorite link aggregator resources here for additional learning.
- my learn-cloud Repo - https://github.com/lynnlangit/learning-cloud
- my gcp-for-bioinformatics open source course - https://github.com/lynnlangit/gcp-for-bioinformatics
- my learn-wdl open source course - https://github.com/openwdl/learn-wdl
- a link collection: link to Repo (awesome bioinformatics) with a large number of curated links for learning about bioinformatics tools and topics
- 📖 Bioinformatics Workbook: link to online course with key bioinformatics concepts
Teri is the impetus for my movement into the world of genomic research. She was diagnosed with breast cancer in 2016. She survived, but suffered a long course of intense and painful treatment due in part to the lack of availability of personalized treatment options at the time of her diagnosis.