sablokgaurav

Gaurav Sablok's repositories

python_analytics_classes

A collection of Python classes for various purposes, from plotting to data structures, data science, and engineering.

Language: Jupyter Notebook | License: MIT | 2 10

fungal_metagenomics_ITS_coverage_calculator

I coded this function to estimate the fraction of ITS predictions from fungal metagenomics data. It takes into account the sequence length and the ITS1 and ITS2 start and stop coordinates. Given a keyword argument, it estimates the coverage accordingly.

Language: Python | License: MIT | 1 10
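
The coverage estimate described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name `its_coverage`, the `region` keyword, and the 1-based inclusive, non-overlapping coordinates are all assumptions made for the example.

```python
def its_coverage(seq_length, its1=None, its2=None, region="both"):
    """Return the fraction of a sequence covered by the chosen ITS region(s).

    its1 and its2 are hypothetical (start, stop) tuples, assumed to be
    1-based, inclusive, and non-overlapping.
    """
    if seq_length <= 0:
        raise ValueError("seq_length must be positive")

    def span(coords):
        # Length of one region; a missing region contributes nothing.
        if coords is None:
            return 0
        start, stop = coords
        return max(0, stop - start + 1)

    if region == "ITS1":
        covered = span(its1)
    elif region == "ITS2":
        covered = span(its2)
    elif region == "both":
        covered = span(its1) + span(its2)
    else:
        raise ValueError("region must be 'ITS1', 'ITS2', or 'both'")
    return covered / seq_length


# Illustrative coordinates only, not taken from real data.
print(its_coverage(600, its1=(31, 210), its2=(371, 520), region="both"))
```

The keyword selects which region counts toward the covered length, so the same call can report ITS1-only, ITS2-only, or combined coverage.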

large_scale_genomic_alignment_extraction

A scalable genomic fraction aligner and extractor for large-scale alignment of genomes and transcriptomes. It processes them across cores to extract the aligned regions. The aligned regions can also be passed to the length plotter and used to train machine learning models for specific applications.

Language: Python | License: MIT | 1 10

linear_regression_training_model_based_on_sequence_characteristics

I coded this linear regression training model based on sequence features across the sequences. It has two modes: train the model only, or train the model and predict.

Language: Python | License: MIT | 1 10

numpy_shell_builder

A numpy shell builder covering extraction and how to use numpy across arrays. I am putting the entire manual here for those who like to search immediately rather than looking here and there.

Language: Shell | License: MIT | 1 10

pacbio_oxford_nanopore_repeat_coverage

A long-read repeat coverage calculator. Given a long-read file before assembly, either directly from the sequencing runs or after cleaning, it calculates the total amount of repeat stretches present in the sequencing reads, which you can plot before assembly.

Language: Python | License: MIT | 1 10
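
One way to approximate this kind of calculation is to count bases that fall inside short tandem repeats. The sketch below is only an assumption about what "repeat stretches" could mean here: it treats a repeat as a unit of 1 to 6 bases occurring at least three times in a row, and the function names are hypothetical rather than taken from the repository.

```python
import re
from itertools import islice


def fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        yield record[1].strip().upper()


def repeat_bases(seq, max_unit=6, min_copies=3):
    """Count bases inside tandem repeats (unit of 1..max_unit bases)."""
    # Lazy unit match so the shortest repeating unit is preferred.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    return sum(m.end() - m.start() for m in pattern.finditer(seq))


def repeat_fraction(sequences, max_unit=6, min_copies=3):
    """Return repeat bases divided by total bases over all reads."""
    total = repeats = 0
    for seq in sequences:
        total += len(seq)
        repeats += repeat_bases(seq, max_unit, min_copies)
    return repeats / total if total else 0.0


# Made-up reads for illustration only.
print(repeat_fraction(["GGACACACACTT", "AAAAA"]))
```

With a real file, `fastq_sequences(open(path))` could feed `repeat_fraction` directly, and the per-read values from `repeat_bases` could be collected for plotting before assembly.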

python_algorithms_structures_data_structures

This repository contains the code I have posted on LinkedIn as solutions to LeetCode, Interview Query, and Codewars questions. I used a different approach from the ones commonly mentioned elsewhere.

Language: Python | License: MIT | 1 20

slurm_pbs_cluster_scripts

SLURM and PBS scripts for Illumina and long-read genome assembly, transcriptome, metagenomics, and comparative analysis.

Language: Shell | License: MIT | 1 20

plant_long_read_resistance_gene_isolator

I coded this function for comprehensive isolation of plant resistance genes from long-read sequencing. Given PacBio or Oxford Nanopore reads, it will assemble them, predict the plant disease resistance genes, and allow you to analyze the mutations in those genes.

Language: Shell | License: MIT | 010

bacterial_disease_model_attributable_fractions

This repository contains the risk model function for a disease model of Vibrio infections and shows how it can be modeled to estimate the attributable rates.

Language: Python | License: MIT | 010

bacterial_plant_fungal_domain_analyzer

This repository contains a faster, data-science-based implementation of domain predictions from InterProScan output. It gives you complete domain information, coordinates, and other associated information. I used a mapping dataframe approach to make it faster rather than looping over the data repeatedly.

Language: Python | License: MIT | 010

bacterial_plant_fungal_domain_directed_graphs

This repository contains a function that prepares the domain graph analysis. If you specify a domain or an InterPro entry, it gives you all the parent and child graphs for directed and undirected graph modelling.

Language: Python | License: MIT | 010

candida_ontology_network_analyzer

A faster implementation of a gene ontology analyzer for Candida genomes. Given the Candida GO ontology files and a search GO term, it extracts all the alt_id, relationship_ids, and associated functions for that gene ontology term, for network analysis and for linking with expression analysis.

Language: Python | License: MIT | 010
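
A lookup like the one described could be sketched as below, assuming the ontology file uses the standard OBO flat-file layout (`[Term]` stanzas with `id:`, `name:`, `alt_id:`, and `relationship:` tags). The function name `find_go_term` and the sample records are hypothetical and are not taken from the repository or from a real ontology.

```python
def find_go_term(obo_lines, query_id):
    """Return the [Term] stanza whose id or alt_id equals query_id, else None."""

    def matches(term):
        return term is not None and query_id in [term["id"], *term["alt_id"]]

    term = None
    for raw in obo_lines:
        line = raw.strip()
        if line.startswith("["):  # a new stanza begins
            if matches(term):
                return term
            term = (
                {"id": None, "name": None, "alt_id": [], "relationship": []}
                if line == "[Term]"
                else None
            )
            continue
        if term is None or ": " not in line:
            continue
        key, value = line.split(": ", 1)
        if key in ("id", "name"):
            term[key] = value
        elif key == "alt_id":
            term["alt_id"].append(value)
        elif key == "relationship":
            # "part_of GO:... ! comment" -> ("part_of", "GO:...")
            rel_type, target = value.split(" ! ")[0].split()[:2]
            term["relationship"].append((rel_type, target))
    return term if matches(term) else None


# Made-up stanzas for illustration; these IDs are placeholders.
SAMPLE_OBO = """\
[Term]
id: GO:9999991
name: hypothetical process A
alt_id: GO:9999981
relationship: part_of GO:9999992 ! hypothetical process B

[Term]
id: GO:9999992
name: hypothetical process B
"""

print(find_go_term(SAMPLE_OBO.splitlines(), "GO:9999981"))
```

The `(relationship type, target ID)` pairs returned here could serve as edges when building a network for downstream analysis.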

cookiecutter

A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.

Language: Python | License: BSD-3-Clause | 000

cookiecutter-django

Cookiecutter Django is a framework for jumpstarting production-ready Django projects quickly.

Language: Python | License: BSD-3-Clause | 000

cookiecutter-django-rest

Build best-practice APIs fast with Python 3

Language: Python | License: MIT | 000

cookiecutter-flask

A flask template with Bootstrap, asset bundling+minification with webpack, starter templates, and registration/authentication. For use with cookiecutter.

Language: Python | License: MIT | 000

dive

A tool for exploring each layer in a docker image

Language: Go | License: MIT | 000

djangopackages

Django Packages is a directory of reusable apps, sites, tools, and more for your Django projects.

Language: Python | License: NOASSERTION | 000

genome_transcriptome_annotation_make

A function to make genome and transcriptome annotations that reflect the gene regions and how they should be displayed. Given a GFF file, the requested annotation, and other columns, it uses pygenomeviz to make all the annotation maps.

Language: Python | License: MIT | 010

genomics_datascience_quick_bash

This repository is meant to assist you in writing Bash-based workflows. It covers how to do common Bash tasks and how to develop and deploy workflows on the cluster.

Language: Shell | License: MIT | 010

longread_bcftools_filter

Making bcftools filtering easy. bcftools_filter allows faster filtering of variant calls according to the allelic depth and the tags, using simple overlap approaches as compared to implementing regular patterns.

Language: Python | License: MIT | 010

odd_ratio_estimator_from_specific_geographical_location

This function takes a data frame of the outbreak and estimates the odds ratios and the likelihood of occurrence of the disease in that specific geographical location.

Language: Python | License: MIT | 010
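
For a single location, an odds ratio of this kind is commonly computed from a 2x2 table of exposure against disease status. The sketch below shows that standard calculation with a Woolf (log-based) confidence interval. It works on plain counts rather than a data frame, and the function name and argument layout are assumptions made for illustration, not the repository's interface.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    a: exposed cases        b: exposed non-cases
    c: unexposed cases      d: unexposed non-cases
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Illustrative counts only, not real outbreak data.
print(odds_ratio(20, 80, 10, 90))
```

With a data frame of outbreak records, the four cell counts would first be tallied per geographical location and then passed to a function like this one.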

pbs_altair_pro_bash_manual

This repository contains the code for PBS Altair Pro at CHPC. You can save this code with a .sh extension and run it as a script, so you don't have to remember the PBS Pro manual.

Language: Shell | License: MIT | 010

phytozome_pacid_fetcher

This function takes an IDs file with the genes of interest and the Phytozome GFF files and fetches the pacid for each gene of interest.

Language: Python | License: MIT | 010
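
The lookup can be sketched along these lines, assuming the GFF rows carry a `pacid=` entry in the ninth (attributes) column next to `ID`, `Name`, or `Parent` keys. The function name `fetch_pacids`, the matching rule, and the sample rows are hypothetical. Real Phytozome identifiers may carry version suffixes that need trimming before they match the IDs file.

```python
def fetch_pacids(gff_lines, gene_ids):
    """Map each wanted ID to the pacid values found on matching GFF rows."""
    wanted = set(gene_ids)
    found = {}
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF feature row
        # Parse "key=value;key=value" attributes into a dict.
        attrs = dict(
            item.split("=", 1) for item in cols[8].split(";") if "=" in item
        )
        pacid = attrs.get("pacid")
        if pacid is None:
            continue
        # Assumed rule: a row matches if its ID, Name, or Parent is wanted.
        for key in ("ID", "Name", "Parent"):
            value = attrs.get(key)
            if value in wanted:
                found.setdefault(value, []).append(pacid)
    return found


# Made-up rows for illustration only.
SAMPLE_GFF = [
    "##gff-version 3",
    "Chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=GeneA;Name=GeneA",
    "Chr1\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=GeneA.1;Name=GeneA.1;pacid=111;Parent=GeneA",
    "Chr1\tsrc\tmRNA\t2000\t2500\t.\t-\t.\tID=GeneB.1;Name=GeneB.1;pacid=222;Parent=GeneB",
]

print(fetch_pacids(SAMPLE_GFF, ["GeneA"]))
```

Reading the IDs file into a set first keeps each membership check constant-time, which matters when the GFF file is large.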

pubmed_indexer_abstract_fetcher

This function prepares the abstract and ID information for all the PubMed articles that you want to read and cite. I coded this using a web scraping approach; it is blazing fast and parses better than NCBI E-utilities.

Language: Python | License: MIT | 010

scalable_parallel_faster_genomic_transcriptomics_annotations

This repository contains a scalable, faster implementation of genome and transcriptome annotation for large-scale sequencing datasets.

Language: Python | License: MIT | 010

seagrass_supplementary_seagrassdb

This repository contains the sequence repository for the seagrass paper's transcriptome assembly and database. The server is down, so please use these files for further analysis such as BLAST and comparative analysis.

Language: Python | License: GPL-3.0 | 010

tair_pubmed_connector

There is no function to automatically fetch the PubMed article links reported in TAIR for use with language models, so I coded this function. It takes the TAIR information and a gene or locus tag, fetches the corresponding PubMed entries, and then fetches the corresponding abstracts from PubMed.

Language: Python | License: MIT | 010

ZSH_POSH_web_scrapping

ZSH_POSH_web_scrapping: This repository contains Bash-based web scraping for installing the Nerd Fonts for programming.

License: MIT | 010