Frank Austin Nothaft's repositories
fnothaft.github.io
Personal website.
glow
An open-source toolkit for large-scale genomic analysis
TypeSeqHPV
NCI CGR laboratory HPV typing analysis workflows
v2g_data
Gene to variant functional dataset workflows
bdg-utils
General (non-omics) code used across BDG products. Apache 2 licensed.
gatk
Official code repository for GATK versions 4 and up
Hadoop-BAM
Hadoop-BAM is a Java library for manipulating files in common bioinformatics formats using the Hadoop MapReduce framework with the Picard SAM JDK. It also provides command-line tools similar to SAMtools.
bigdatagenomics.github.io
Website for the Big Data Genomics group
deca
Distributed exome CNV analyzer. Apache 2 licensed.
spark-bam
Load genomic BAM files using Apache Spark
workflows
Toil workflows for bigdatagenomics tools. Apache 2 licensed.
jsr203-hadoop
A Java NIO file system provider for HDFS
bdg-formats
Open source formats for scalable genomic processing systems using Avro. Apache 2 licensed.
toil-lib
A common library of functions and tools used in Toil-based pipelines
cannoli-1
Big Data Genomics ADAM Pipe API wrappers for bioinformatics tools. Apache 2 licensed.
mmtf-spark
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
mmtf-workshop-2017
Structural Bioinformatics Training Workshop & Hackathon 2017
cgcloud
Image and VM management for Jenkins, Spark and Mesos clusters in EC2
conductor
Efficient, distributed downloads of large files from S3 to HDFS using Spark.
toil
Python-based pipeline management software for clusters that makes running recursive and dynamically scheduled computations straightforward. So far it works with GridEngine, LSF, Parasol, and multi-core machines.
corretto
Read error correction utilities.
mango
Visualization tools for genomic data. Apache 2 licensed.