zachcp / bioinformaticstoolkit

Build and deploy cross platform bioinformatic utilities with Rust.

Home Page: https://github.com/zachcp/bioinformaticstoolkit

The Bioinformatics Toolkit

Rust-backed utilities for bioinformatic data processing.

Get started

The fastest way to get started is to download the applications from the Releases section (https://github.com/zachcp/bioinformaticstoolkit/releases). This project aims to demonstrate how the Rust toolchain enables efficient cross-platform support for high-performance applications. With Tauri, you can write the entire frontend in any tool that compiles to HTML + JavaScript. In this case I used Quarto to take advantage of its simple composition (it's mostly Markdown + YAML) as well as its built-in use of the Observable runtime.

Screenshots

Below are screenshots of the native application showing the home page; the guide page; an example RNA secondary structure visualization using rnapkin; statistics of a FASTA file, including a histogram of sequence lengths, using noodles for IO; and DNA translation using the protein_translation crate.
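To give a feel for what the DNA translation feature computes, here is a minimal, standard-library-only sketch. It is not the protein_translation crate's API and not the code used in the app; the function names (`base_index`, `translate_codon`, `translate`) are hypothetical, and it assumes the standard genetic code, with `*` for stop codons and `X` for codons containing unrecognised bases.

```rust
/// Map a nucleotide to its position in T, C, A, G order (U is treated as T).
/// Hypothetical helper for illustration only.
fn base_index(base: u8) -> Option<usize> {
    match base.to_ascii_uppercase() {
        b'T' | b'U' => Some(0),
        b'C' => Some(1),
        b'A' => Some(2),
        b'G' => Some(3),
        _ => None,
    }
}

/// Translate a single codon using the standard genetic code.
/// Returns '*' for stop codons and 'X' when the codon cannot be resolved.
fn translate_codon(codon: &[u8]) -> char {
    // Standard genetic code laid out with bases ordered T, C, A, G
    // at each of the three codon positions (64 entries).
    const AMINO_ACIDS: &[u8; 64] =
        b"FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";
    if codon.len() != 3 {
        return 'X';
    }
    match (
        base_index(codon[0]),
        base_index(codon[1]),
        base_index(codon[2]),
    ) {
        (Some(a), Some(b), Some(c)) => AMINO_ACIDS[16 * a + 4 * b + c] as char,
        _ => 'X',
    }
}

/// Translate a DNA string in reading frame 0.
/// Any trailing one or two bases that do not form a full codon are ignored.
fn translate(dna: &str) -> String {
    dna.as_bytes().chunks_exact(3).map(translate_codon).collect()
}
```

For example, under these assumptions `translate("ATGGCCTAA")` yields `MA*`. A table lookup keeps the sketch short; a real tool would also need to handle alternative genetic codes and other reading frames.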

Develop

# assuming quarto and cargo are installed and on your path.
git clone https://github.com/zachcp/bioinformaticstoolkit.git
cd bioinformaticstoolkit

# install the tauri cli
cargo install tauri-cli

# add cargo bin dir to the path
export PATH=$PATH:~/.cargo/bin/

# to develop 
cargo-tauri dev

# to package. this build is ~8MB. 
cargo-tauri build

# to test (run in a subshell so the working directory is unchanged)
(cd src-tauri && cargo test)
# or, to show output printed by the tests
(cd src-tauri && cargo test -- --nocapture)

Other Ideas/Tools for Rust Incorporation

FASTX:

  • convert fasta to fastq
  • basic stats of fasta/fastq
  • histogram of read lengths (possibly set max number)
  • merge PE reads // split interleaved
  • splitting into multiple files (create directory?)
  • filter-fastx length // quality
  • sample the fasta/x files
  • plot: length x quality metrics ( optional hexagon plots )
  • plot: coverage by location.
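As a rough illustration of the "basic stats" and "histogram of read lengths" ideas above, the following sketch uses only the standard library. It is an assumption-laden outline, not the app's implementation (which uses noodles for IO): the names `FastaStats`, `sequence_lengths`, `summarize`, and `length_histogram` are hypothetical, the input is taken to be FASTA text already in memory, and the `max_len` cap is one possible reading of "possibly set max number".

```rust
use std::collections::BTreeMap;

/// Summary statistics over the records of a FASTA text (hypothetical type).
#[derive(Debug, PartialEq)]
struct FastaStats {
    count: usize,
    total_bases: usize,
    min_len: usize,
    max_len: usize,
}

/// Collect the sequence length of every record in a FASTA-formatted string.
/// Lines before the first '>' header are ignored.
fn sequence_lengths(fasta: &str) -> Vec<usize> {
    let mut lengths = Vec::new();
    let mut current: Option<usize> = None;
    for line in fasta.lines() {
        let line = line.trim();
        if line.starts_with('>') {
            // A new header closes the previous record, if any.
            if let Some(len) = current.take() {
                lengths.push(len);
            }
            current = Some(0);
        } else if let Some(len) = current.as_mut() {
            *len += line.len();
        }
    }
    if let Some(len) = current {
        lengths.push(len);
    }
    lengths
}

/// Summarize record lengths; returns None when there are no records.
fn summarize(lengths: &[usize]) -> Option<FastaStats> {
    let min_len = *lengths.iter().min()?;
    let max_len = *lengths.iter().max()?;
    Some(FastaStats {
        count: lengths.len(),
        total_bases: lengths.iter().sum(),
        min_len,
        max_len,
    })
}

/// Count lengths per bin, keyed by the bin's starting length.
/// Lengths above `max_len` are capped so they land in the last bin.
fn length_histogram(
    lengths: &[usize],
    bin_width: usize,
    max_len: usize,
) -> BTreeMap<usize, usize> {
    let width = bin_width.max(1); // guard against division by zero
    let mut bins = BTreeMap::new();
    for &len in lengths {
        let capped = len.min(max_len);
        let bin_start = (capped / width) * width;
        *bins.entry(bin_start).or_insert(0) += 1;
    }
    bins
}
```

Using a `BTreeMap` keeps the bins sorted by length, which is convenient when handing the counts to a plotting frontend. FASTQ input would need a different record parser, since its quality lines can also begin with `>`-like characters.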

GFA:

  • Utilities from GFATK, including filtering
    • GFAStats

DNA Analysis:

VCF:

  • convert
  • concat
  • split

RNA Secondary Structure:

RNA-seq:

  • gencounts https://github.com/NKI-GCF/gensum
  • rust-lapper https://crates.io/crates/rust-lapper

Taxonomy:

  • load and display a tree file
  • load and display kraken
  • load and display bracken

Peptides and Proteomics:

Rust Software:

Miscellaneous:

About

License: GNU General Public License v2.0


Languages

Rust 88.5%, TypeScript 11.5%