sablokgaurav

Gaurav Sablok's repositories

expression_deep_neural_network

A deep neural network classifier for expression data, demonstrated by fitting an unbalanced dataset of expression profiles across samples.

Language: R, License: MIT

plant_bacterial_computational_KO_phylogenemoics

This repository contains the Python, R, and Java code that I wrote, based on mathematical expressions, for genome-based ontology annotation and phylogenomic informativeness.

Language: Python, License: MIT

plant_microarray_analysis

Analyzes already-normalized microarray expression profiles, performs batch analysis, and produces volcano plots and differential expression results.

Language: R, License: MIT

plant_resistance_gene_fetcher

I coded this custom function to fetch DNA and protein sequences from the plant resistance gene database and return the corresponding dna_sequence and protein_sequence.

Language: Python, License: MIT

tairaccession

A Python package for working with TAIR and Phytozome and their conversions, with an annotation and coordinates checker.

Language: Python, License: MIT

awk_shell_file_directory_size_plotter_awk

An awk-based sort-and-index approach to plotting file or directory sizes across Docker containers. Integrate it into your ~/.bashrc or ~/.zshrc, or run it as a cron job, to manage disk space across containers.

Language: Shell, License: MIT

bacterial_insertion_crispr_site_checker

A faster string-search implementation that checks whether insertion elements and CRISPR sites are present in the genome string, then clips those insertion sites to produce a clean genome.

Language: R, License: MIT
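
The search-and-clip idea described above can be sketched as follows. This is only a minimal illustration in Python (the repository itself is listed as R), and the function name `clip_insertions` and the toy sequences are hypothetical, not taken from the repository.

```python
def clip_insertions(genome, elements):
    """Find every occurrence of each element and return (clean_genome, clipped_spans)."""
    spans = []
    for element in elements:
        if not element:
            continue  # an empty pattern would match everywhere
        start = genome.find(element)
        while start != -1:
            spans.append((start, start + len(element)))
            start = genome.find(element, start + 1)

    # Merge overlapping or touching spans so each base is clipped once.
    spans.sort()
    merged = []
    for begin, end in spans:
        if merged and begin <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((begin, end))

    # Stitch together everything that lies outside the clipped spans.
    pieces = []
    previous_end = 0
    for begin, end in merged:
        pieces.append(genome[previous_end:begin])
        previous_end = end
    pieces.append(genome[previous_end:])
    return "".join(pieces), merged


# Toy example with a made-up genome string and a made-up insertion element.
clean, clipped = clip_insertions("AAACCCGGGTTTCCC", ["CCC"])
print(clean, clipped)
```

Reporting the clipped spans alongside the clean sequence makes it possible to check where each element was found before trusting the cleaned genome.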

bacterial_tolerance_rate_support_vector_machine

A support vector machine model for predicting tolerance rates in bacterial infections. It uses eps-regression, although C-classification can be applied if you want to predict the time variable.

Language: R, License: MIT

diff_alternative_data_structure_R

I read a post today that mentioned diff, which I have used a lot in R. I put this repository up to show that you can also do the same from a data structure point of view.

Language: R, License: MIT

gene_annotation_count_arguably

I used arguably with a function that calculates genome annotation counts for microbiomes and other genomes. It takes a genome annotation or a text file and prepares the counts, which can also be used for gene ontology analysis.

Language: Python, License: MIT

genome_annotation_clean

A parallel, cluster-computing genome annotation cleaner that takes a genome annotation file, cleans the annotations, and prepares them for machine learning.

Language: Shell, License: MIT

genotyping_platform_prepare

This repository contains a custom function for preparing files for genotyping or sequencing. You can specify the path and the FASTA files and mark them according to the desired genotyping or sequencing condition.

Language: Python, License: MIT

gitMaker

A Ruby class that handles git tasks: initializing, committing, pushing, generating git tokens, and committing to specific branches.

Language: Ruby, License: MIT

linear_regression_bounded_memory_linear_regression

Fits a linear regression on the height and bolting time of lettuce phenotypes to see whether a linear relationship can be established.

Language: Python, License: MIT

metagenomics_abundance_normalize

A metagenomics abundance normalizer that takes an OTU abundance file and gives a normalized ratio for plotting the species.

Language: Python, License: MIT
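
One common way to turn OTU counts into a normalized ratio is relative abundance, where each count is divided by the sample total. The sketch below assumes that is the kind of ratio meant here; the function name `normalize_abundance` and the OTU labels are hypothetical, and the counts are invented for illustration.

```python
def normalize_abundance(counts):
    """Convert raw OTU counts into relative abundances that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize: all counts are zero")
    return {otu: count / total for otu, count in counts.items()}


# Made-up counts for two hypothetical OTUs.
ratios = normalize_abundance({"otu_a": 30, "otu_b": 10})
print(ratios)
```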

MiSeq-NextSeq-NovaSeq_genome_shell_assembler

A pure shell assembler that takes only a directory path and handles read cleaning, mapping, remapping, and assembly from start to finish. It works with MiSeq, NextSeq, and NovaSeq data.

Language: Shell, License: MIT

pangenomeMetagenomicsNormalizer

A pangenome metagenomics normalizer. Given a gene ontology presence/absence file and a species file, it first summarizes the counts across the species, then takes the count of each gene ontology and presents a ratio. The higher the ratio, the more that ontology is present across the species.

Language: Python, License: MIT

pbs_backup_simulator

This repository contains the code for the PBS backup simulator. You can run your code within the simulator to avoid breakage and loss of system configuration.

Language: Python, License: MIT

pbs_configure_python_function

This repository contains two custom functions that prepare PBS files for cluster computing. Call the function, and it will ask for the parameters and then output the complete PBS file, ready to submit to the cluster.

Language: Python, License: MIT
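
As a rough sketch of what generating a PBS file from a handful of parameters can look like: the function below takes its parameters as arguments rather than prompting for them, and its name, `build_pbs_script`, is hypothetical. The `#PBS` directives use the common Torque-style `nodes=...:ppn=...` resource syntax, which is an assumption here; some PBS installations expect a different resource request format.

```python
def build_pbs_script(job_name, queue, nodes, ppn, walltime, command):
    """Return the text of a PBS job script built from the given parameters."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name
        f"#PBS -q {queue}",                  # target queue
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # wall-clock limit, HH:MM:SS
        "cd $PBS_O_WORKDIR",                 # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: the queue name and command are placeholders.
script = build_pbs_script("align", "batch", 1, 4, "02:00:00", "bash run_alignment.sh")
print(script)
```

The returned text can be written to a file and submitted with `qsub`.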

pbs_configure_R

This version contains the R code for PBS users, so that they can invoke an R session and submit it on PBS clusters.

Language: R, License: MIT

plant_resistance_gene_logistic_regressor

An application of logistic regression to plant disease resistance genes. Given a FASTA file, the corresponding expression file, and the motif types you think are associated with plant disease resistance, it prepares the classification datasets and then fits a logistic regressor for model building.

Language: Python, License: MIT

plant_resistance_gene_miner

I coded this plant resistance gene miner, which combines regular expressions with web scraping: given a resistance gene ID, it returns the GenBank ID.

Language: Python, License: MIT

ruby_genome_annotation_iterator_large_scale

A genome annotation length calculator written in Ruby. It invokes a shell subprocess within Ruby to parse the iterators at a faster rate. If you have dozens of sequenced genomes, simply give the column number and the iterator will hash the lengths. Added support for the features as

Language: Ruby, License: MIT

ruby_on_rails_app_for_genomic_trait_analysis_genome

This repository contains a complete Ruby on Rails application for developing and analyzing genomic traits of sequenced genomes, and shows how it can be deployed for machine learning. It integrates sqlite3 and PostgreSQL as a backend and uses Bootstrap for the custom appearance.

rust_based_docker_containerization_arrays

I applied the nushell (Rust) programming approach to Docker containerization and created arrays from it. A fresh way to view Docker containerization.

Language: Shell, License: MIT

shell_plotter

A shell plotting function that extends a Ruby framework and plots metagenomics abundances right in your shell, for checking the abundance distribution.

Language: Shell, License: MIT

tair_gff_ids

A set of functions that provide easy access to a cleaned GFF from TAIR, using a dataframe and data science approach to get the systematic TAIR IDs and their coordinates from the TAIR10 GFF. It can be applied to any TAIR version for systematic retrieval of TAIR IDs.

Language: Python, License: MIT
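
To illustrate the kind of extraction described above, here is a minimal sketch that pulls gene IDs and coordinates out of GFF3-style lines. It uses plain Python rather than a dataframe, so it is not the package's approach; the function name `gene_coordinates` is hypothetical, and the sample line is only an illustration of the nine-column, tab-separated GFF layout with an `ID=` attribute.

```python
import re


def gene_coordinates(gff_lines):
    """Return (gene_id, chromosome, start, end, strand) for every 'gene' feature."""
    genes = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue  # keep only well-formed gene rows
        match = re.search(r"ID=([^;]+)", fields[8])
        if match:
            genes.append(
                (match.group(1), fields[0], int(fields[3]), int(fields[4]), fields[6])
            )
    return genes


# Illustrative TAIR-style line; columns are tab-separated.
sample = ["Chr1\tTAIR10\tgene\t3631\t5899\t.\t+\t.\tID=AT1G01010;Name=AT1G01010"]
print(gene_coordinates(sample))
```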

transdecoder_trinity_assembly_visualization

A regular-expression-based encoder for TransDecoder predictions on Trinity assemblies. It parses the predictions and prepares the transcript annotations for visualization with genome visualization kits such as pygenomeviz, Mauve, and others. It prepares the coordinates as tuples.

Language: Python, License: MIT
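
A small sketch of the regular-expression idea: given a prediction header that ends in `transcript:start-end(strand)`, extract the coordinates as a tuple. The header layout used below is an assumption about how such headers look, not something confirmed by the repository, and the function name `orf_coordinates` and the sample header are hypothetical.

```python
import re

# Matches a trailing "transcript_id:start-end(strand)" token.
COORD_PATTERN = re.compile(r"(\S+):(\d+)-(\d+)\(([+-])\)\s*$")


def orf_coordinates(header):
    """Return (orf_id, transcript_id, start, end, strand), or None if no match."""
    match = COORD_PATTERN.search(header)
    if match is None:
        return None
    orf_id = header.lstrip(">").split()[0]  # first token after '>'
    transcript_id, start, end, strand = match.groups()
    return (orf_id, transcript_id, int(start), int(end), strand)


# Made-up header in the assumed layout.
header = ">TRINITY_DN1_c0_g1_i1.p1 ORF type:complete len:100 (+),score=20.1 TRINITY_DN1_c0_g1_i1:3-302(+)"
print(orf_coordinates(header))
```

Tuples like these can then be handed to whichever visualization library is in use, after converting them to the structure it expects.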

warp_bioinformatics_workflows

Warp bioinformatics workflows for integration into Warp, for launching complete workflows on a computing cluster.

Language: Shell

warp_datascience_workflows

A collection of Warp workflows that I have written for direct integration into Warp. You can integrate them into your own workflows, either all at once using the shell or by adding each workflow independently.

Language: Shell