lvn3668's repositories

fastqcparser

Python code to compute adapter content in reads, k-mer content, per-base GC content (at a specific position in a read alignment against the reference genome), per-base NC content (at a specific position in a read alignment against the reference genome), per-base sequence quality (across aligned reads), per-base sequence content, per-base quality scores, and per-tile sequence quality.

Language: Python

GeneAbundance

Language: Python

rebase-using-kmp

Restriction enzyme cleavage site identifier using the KMP algorithm and REBASE.

Language: C++
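
As an illustration of the approach named in the rebase-using-kmp description, here is a minimal Python sketch of Knuth-Morris-Pratt search applied to a recognition site. It is not the repository's C++ code. The function names are hypothetical, and the site is passed in as a plain string rather than read from a REBASE file; GAATTC (the EcoRI recognition sequence) is used only as an example input.

```python
def build_failure_table(pattern):
    """For each prefix of pattern, length of its longest proper prefix that is also a suffix."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_find_all(text, pattern):
    """Return the 0-based start positions of every (possibly overlapping) match."""
    if not pattern:
        return []
    table = build_failure_table(pattern)
    hits = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = table[k - 1]  # keep going so overlapping sites are found
    return hits


if __name__ == "__main__":
    # Hypothetical input sequence; GAATTC is the example recognition site.
    print(kmp_find_all("AAGAATTCGGGAATTCA", "GAATTC"))
```

The failure table lets the scan move through the sequence without ever stepping backwards, so the search runs in time proportional to the sequence length plus the site length.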

backpropagationneuralnetwork

Language: C

BaselineSurveyResponseSimulator

Simulation of baseline survey response variables to store information in EPIC. Number of variables: 495. Code: Java 8. Reads in a CSV file of variables and the value range of permitted responses. Uses randomization to (a) induce errors and (b) generate valid responses, and simulates 1M responses to use as prototype data to populate i2b2 deployments in support of the MVP project for VABHS.

Language: Java

caBIO-load-scripts

ETL scripts to populate caBIO database

caMODloadscripts

ETL scripts to populate caMOD database with MTB data from Jackson Labs

Language: PLSQL

DNAExtractionModule

Code that polls the Beckman Coulter and stores rack information, along with any protocols initiated on samples.

Language: C#

DNAtoProteintranslation

Converts DNA to protein along 1, 6 (forward or reverse strand), or user-defined reading frames.

Language: C++
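
To make the frame options in the DNAtoProteintranslation description concrete, here is a short Python sketch of translation in a chosen reading frame. It is not the repository's C++ code. It assumes the standard genetic code, numbers forward frames 1 to 3 and reverse-strand frames -1 to -3 (a convention chosen for this example), and writes X for any codon containing a non-ACGT base. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def translate_frame(seq, frame):
    """Translate seq in frame 1..3 (forward strand) or -1..-3 (reverse strand)."""
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of 1, 2, 3, -1, -2, -3")
    seq = seq.upper()
    if frame < 0:
        seq = reverse_complement(seq)
    start = abs(frame) - 1
    # Step through complete codons only; a trailing partial codon is dropped.
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(seq):
    """Translate all three forward and all three reverse frames."""
    return {frame: translate_frame(seq, frame) for frame in (1, 2, 3, -1, -2, -3)}


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Stop codons are emitted as `*` rather than ending translation, so the caller can decide whether to split on them.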

findNmerfrequencies

C++ code to calculate n-mer frequencies (n = 1 to 6) and write them out to a file.

Language: C++

findPalindromesandInvertedRepeats

Finds palindromes and inverted repeats in DNA sequences based on user-defined inputs.

Language: C++
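
The description of findPalindromesandInvertedRepeats does not say which inputs the user defines, so the following Python sketch is only one plausible reading, not the repository's C++ code. It assumes a DNA palindrome is a window equal to its own reverse complement, and an inverted repeat is a stem followed, after a gap, by the stem's reverse complement. The parameters `min_len`, `max_len`, `stem_len`, and `max_gap` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def find_palindromes(seq, min_len=4, max_len=8):
    """Return (start, window) for every window that equals its reverse complement."""
    seq = seq.upper()
    hits = []
    for length in range(min_len, max_len + 1):
        for i in range(len(seq) - length + 1):
            window = seq[i:i + length]
            if window == reverse_complement(window):
                hits.append((i, window))
    return hits


def find_inverted_repeats(seq, stem_len, max_gap):
    """Return (left_start, right_start, stem) where the right arm is the
    reverse complement of the left arm, separated by 0..max_gap bases."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - stem_len + 1):
        left = seq[i:i + stem_len]
        target = reverse_complement(left)
        for gap in range(max_gap + 1):
            j = i + stem_len + gap
            if j + stem_len > len(seq):
                break  # right arm would run past the end of the sequence
            if seq[j:j + stem_len] == target:
                hits.append((i, j, left))
    return hits


if __name__ == "__main__":
    print(find_palindromes("TTGAATTCAA", 4, 8))
    print(find_inverted_repeats("GGATCAAAAGATCC", stem_len=5, max_gap=4))
```

Both functions scan every window, which is quadratic in the worst case; that is acceptable for a sketch but would need a smarter approach for genome-scale input.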

gatkparser

Python package to parse GATK output and extract summary statistics at mbq 0, 10, 20, 30 and variant evaluation metrics.

Language: Python

genemark

InterferenceEstimation

Java-based implementation of an MLE method using a chi-square test to calculate interference during meiotic crossover (the number of double-strand DNA breaks that don't result in a crossover).

Language: Java

LaminarFlowHoodModule

Module for tracking tubes and aliquots and assigning storage in the freezers. Part of the MVP specimen processing system VABHS. Prototype.

Language: C#

Microarray

Microarray data analysis using R / Bioconductor

naivevariantcaller_ECGR_variantdetection

Python code to detect ECGR mutations. Takes a reference genome and a set of reads as input and finds mutations (1-3 bp in length) where the number of supporting reads is greater than 5.

Language: Python

oreillyelegantscipy

Language: Python

Phycastats

16S rRNA microbial profiling. R scripts to find the most significant OTUs in 16S rRNA data after data normalization, followed by ordination and clustering, and then plotting with iTOL.

Language: R

picardparser

Language: Python

pileupnotationvariantcaller

Variant caller from pileup notation / samtools alignment

Language: Python

primerDesign

Language: C++

probeDesign

Takes as input an FNA file, a PTT file, the desired probe length, the cross-reactivity allowed, and the overhang.

Language: C

RShinyEntrezViewer

Application to view Entrez data (distribution of Hs / Mm genes per chromosome) using RShiny and MongoDB

Language: R

samtoolsparsers

Parses samtools output and extracts flagstat results, such as the number of pass/fail reads that are properly aligned, etc.

Language: Python
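
As a rough illustration of what the samtoolsparsers description covers, here is a Python sketch that reads the default text output of `samtools flagstat`, where each line has the form `<passed> + <failed> <category>`. This is not the repository's code. The function name and the returned dictionary layout are assumptions made for this example.

```python
import re

# "<QC-passed count> + <QC-failed count> <category text>"
_LINE = re.compile(r"^(\d+) \+ (\d+) (.*)$")

# Trailing "(98.00% : N/A)" or "(QC-passed reads + QC-failed reads)" notes are
# dropped from the category name; other parentheticals such as "(mapQ>=5)"
# are kept because they distinguish otherwise identical categories.
_TRAILING_NOTE = re.compile(r"\s*\((?:[^()]*:[^()]*|QC-passed[^()]*)\)$")


def parse_flagstat(text):
    """Map each flagstat category to a (qc_passed, qc_failed) tuple of ints."""
    stats = {}
    for raw in text.splitlines():
        match = _LINE.match(raw.strip())
        if match is None:
            continue  # skip blank or unrecognised lines
        passed, failed, rest = match.groups()
        category = _TRAILING_NOTE.sub("", rest)
        stats[category] = (int(passed), int(failed))
    return stats


if __name__ == "__main__":
    # Made-up numbers, shown only to demonstrate the line format.
    sample = """\
1000 + 20 in total (QC-passed reads + QC-failed reads)
990 + 10 mapped (99.00% : 50.00%)
900 + 5 properly paired (90.00% : 25.00%)
"""
    for category, (passed, failed) in parse_flagstat(sample).items():
        print(f"{category}: passed={passed} failed={failed}")
```

Keeping both counts per category means pass/fail totals can be reported without re-reading the file.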

StatisticalAnalysisOfNetworkData

Language: R

tensorflowtutorial

Language: Python

UMBICARB

1. Partial scripts to process sequence clusters from 16,000 microbial genomes to find orthologous protein clusters, using the most representative sequence per cluster.
2. Find fold distribution across protein hits from SCOP and ASTRAL.
3. Find the most significant structural hits and perform structure alignment.
4. Eliminate LGT in sequence clusters and realign the phylogenetic tree for each of the pruned set of sequence clusters (pruned on the basis of number of sequences, the most representative sequence not being an LGT, and age in the reference phylogenetic tree).
5. Correlate gaps in the sequence alignment with gaps in the sequence representation of the structure alignment to test the hypothesis that indels cause fold evolution.

Language: Perl

variantAnnotation

Variant annotation of a VCF file using ExAC and VEP.

Language: Python

KmerCounter

Kmer counter is written in Go v1.16.5. To install Go on Windows, follow the instructions at https://golang.org/doc/install. Go implementation of an N-mer counter for DNA sequences that tests the validity of its input. It reads in the file name of a FASTA file and the k-mer length for which counts are desired, and writes out to file the counts of all overlapping k-mers of size 1 through the specified length. It checks whether the FASTA file is empty and whether the k-mer length is specified.

Language: HTML
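
The steps in the KmerCounter description (read a FASTA file, validate the input, count overlapping k-mers of every size from 1 to the requested length, write the counts out) can be sketched as follows. The repository itself is written in Go; this is a Python approximation, not a port of its code. The function names, the tab-separated output format, and the choice to count each FASTA record separately are assumptions.

```python
from collections import Counter


def read_fasta(path):
    """Return the sequences in a FASTA file; raise ValueError if there are none."""
    records = []
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                current = []
                records.append(current)
            elif current is not None:
                current.append(line.upper())
    sequences = ["".join(parts) for parts in records if parts]
    if not sequences:
        raise ValueError(f"{path} is empty or contains no sequence data")
    return sequences


def count_kmers(sequences, max_k):
    """Count overlapping k-mers of every size from 1 through max_k."""
    if max_k is None or max_k < 1:
        raise ValueError("k-mer length must be specified as a positive integer")
    counts = Counter()
    for seq in sequences:  # counted per record so k-mers never span two records
        for k in range(1, max_k + 1):
            for i in range(len(seq) - k + 1):
                counts[seq[i:i + k]] += 1
    return counts


def write_counts(counts, out_path):
    """Write one 'kmer<TAB>count' line per k-mer, shortest k-mers first."""
    with open(out_path, "w") as out:
        for kmer in sorted(counts, key=lambda s: (len(s), s)):
            out.write(f"{kmer}\t{counts[kmer]}\n")


if __name__ == "__main__":
    print(count_kmers(["ACGAC"], 2))
```

A typical call chain would be `write_counts(count_kmers(read_fasta(path), k), out_path)`, with both validation errors surfacing as `ValueError` before any output is written.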