telatin / qax

Qiime2 Artifact eXtractor

Home Page: https://telatin.github.io/qax/

📦 Qiime2 Artifact eXtractor

📖 Introduction

Qiime2 is one of the most popular software packages for analyzing the output of metabarcoding experiments, and it introduced a data format that is unique in bioinformatics: the "Qiime2 artifact".

Qiime2 artifacts are structured compressed archives containing a dataset (e.g., FASTQ reads, representative sequences in FASTA format, a phylogenetic tree in Newick format, etc.) and an exhaustive set of metadata (including the command that generated it, information on the execution environment, citations for the software used, and all the metadata of the artifacts used to produce it).

While artifacts can improve the shareability and reproducibility of Qiime2 workflows, they are harder to integrate with general bioinformatics pipelines, and even accessing the metadata in an artifact requires a full Qiime2 installation (not to mention that every release of Qiime2 will produce incompatible artifacts). Qiime Artifact eXtractor (qax) makes it easy to work with Qiime2 artifacts from the command line, without the full Qiime2 environment installed.

Citation

If you use this tool, please cite:

Telatin A (2021) Qiime Artifact eXtractor (qax): A Fast and Versatile Tool to Interact with Qiime2 Archives. BioTech 10: 5. Available: http://dx.doi.org/10.3390/biotech10010005

💾 Download and installation

Pre-compiled binaries are the fastest and easiest way to get qax. To get the latest version, use the following commands; otherwise, check the stable releases.

# From Linux
wget "https://github.com/telatin/qax/raw/main/bin/qax"
chmod +x qax

# From macOS
wget -O qax "https://github.com/telatin/qax/raw/main/bin/qax_mac"
chmod +x qax
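
The two download commands above differ only in the file they fetch. As a convenience, the choice can be scripted from the kernel name reported by uname -s. This is only a sketch: the function name qax_download_url is made up for this example, while the two URLs are the ones listed above.

```shell
# Print the download URL matching a platform.
# $1: kernel name as printed by `uname -s` (defaults to the current system).
# NOTE: qax_download_url is a hypothetical helper, not part of qax.
qax_download_url() {
  os="${1:-$(uname -s)}"
  case "$os" in
    Linux)  echo "https://github.com/telatin/qax/raw/main/bin/qax" ;;
    Darwin) echo "https://github.com/telatin/qax/raw/main/bin/qax_mac" ;;
    *)      echo "unsupported platform: $os" >&2; return 1 ;;
  esac
}

# Usage (left commented out so that nothing is downloaded by accident):
# wget -O qax "$(qax_download_url)" && chmod +x qax
```

On any other platform the function prints a message to standard error and returns a non-zero status, so the download step is skipped.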

Alternatively, you can install qax from BioConda, if you have conda installed:

conda install -c conda-forge -c bioconda qax

📖 Usage

qax provides the following subprograms (general syntax: qax [program] [program-arguments]):

  • list (default): list artifact(s) properties
  • citations: extract citations in BibTeX format
  • extract: extract artifact data files
  • provenance: describe artifact provenance, or generate its graph
  • view: print the content of an artifact (e.g., dna-sequences.fasta) to the terminal
  • make: create a visualization artifact from a directory containing a website

📄 list

This is the default module, and can be used to list the properties of one or more artifacts.

Some features:

  • Supports multiple files at once
  • 100 times faster than Qiime2
  • Can be used to find an artifact given the ID

Example:

qax -b -u input/*.*
┌───────────────────────────┬────────────────┬─────────────────────────┬─────────────────────────────┐
│ ID                        │ Basename       │ Type                    │ Format                      │
├───────────────────────────┼────────────────┼─────────────────────────┼─────────────────────────────┤
│ bb1b2e93-...-2afa2110b5fb │ rep-seqs.qza   │ FeatureData[Sequence]   │ DNASequencesDirectoryFormat │
│ 313a0cf3-...-befad4ebf2f3 │ table.qza      │ FeatureTable[Frequency] │ BIOMV210DirFmt              │
│ 35c32fe7-...-85ef27545f00 │ taxonomy.qzv   │ Visualization           │ HTML                        │
└───────────────────────────┴────────────────┴─────────────────────────┴─────────────────────────────┘

📄 extract

This program extracts the content of an artifact. By default, if the artifact contains a single file, that file is extracted to the specified path. If it contains multiple files, a directory containing them is created instead.

Example:

# Extract representative sequences (will be called rep-seqs.fasta)
qax x -o ./ rep-seqs.qza

# Extract a visualization (a folder called "taxonomy" will be created)
qax x -o ./ taxonomy.qzv
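
When many artifacts have to be unpacked, the same command can be wrapped in a loop. The sketch below only uses the invocation shown above (qax x -o DIR FILE). The function name extract_all and the QAX variable, which lets the path to the binary be overridden, are assumptions made for this example.

```shell
# Extract every artifact given on the command line into one output directory.
# $1: output directory; remaining arguments: artifact files (.qza / .qzv).
# NOTE: extract_all and the QAX variable are hypothetical, not part of qax.
extract_all() {
  outdir="$1"
  shift
  mkdir -p "$outdir"
  for artifact in "$@"; do
    # Skip arguments that are not existing files (e.g. an unmatched glob)
    [ -f "$artifact" ] || { echo "skipping missing file: $artifact" >&2; continue; }
    "${QAX:-qax}" x -o "$outdir" "$artifact"
  done
}

# Usage (requires qax in the PATH, or QAX set to the binary):
# extract_all ./extracted input/*.qza input/*.qzv
```

Setting QAX=echo prints the commands instead of running them, which is a cheap way to check what would be executed.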

📄 citations

Each Qiime2 module provides the citations for the software and resources that it uses, and stores them in BibTeX format inside the artifacts. The citations module extracts all the citations from a list of artifacts and removes the duplicates, which makes it possible to prepare the bibliography for a complete Qiime2 analysis.

Example:

qax c files/*.qza > bibliography.bib
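
Since the output is plain BibTeX, standard text tools are enough for a quick look at the resulting file. The helpers below are a sketch with made-up names (count_bib_entries, list_bib_keys) and are demonstrated on a placeholder file rather than on real qax output. They assume one entry header per line, and any line starting with @ is counted as an entry.

```shell
# Count the entries in a BibTeX file (lines starting with "@").
count_bib_entries() {
  grep -c '^[[:space:]]*@' "$1"
}

# List the citation keys, one per line (the text between "{" and the first ",").
list_bib_keys() {
  sed -n 's/^[[:space:]]*@[A-Za-z]*{[[:space:]]*\([^,]*\),.*/\1/p' "$1"
}

# Demo on a placeholder bibliography (these entries are invented for the example)
bib=$(mktemp)
cat > "$bib" <<'EOF'
@article{key_one,
  title = {Placeholder entry one}
}
@misc{key_two,
  title = {Placeholder entry two}
}
EOF
count_bib_entries "$bib"   # prints 2
list_bib_keys "$bib"       # prints key_one and key_two, one per line
rm -f "$bib"
```

With a real analysis, the same functions would be applied to the bibliography.bib file produced by the command above.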

📄 provenance

This program prints the provenance of an artifact, or produces a publication-grade graph of the provenance.

Example:

# To view a summary
qax p taxonomy.qzv

# To save the plot
qax p -o graph.dot taxonomy.qza
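
The .dot extension in the example suggests that the saved file is in Graphviz DOT format. Under that assumption, it can be turned into an image with the dot program from Graphviz, which is a separate tool and not part of qax. The function name render_provenance is made up for this sketch.

```shell
# Render a Graphviz DOT file to an image. The output format is taken from the
# extension of the output file (png, pdf, svg, ...).
# NOTE: render_provenance is a hypothetical helper; it needs Graphviz installed.
render_provenance() {
  dot_file="$1"
  out_file="$2"
  if ! command -v dot >/dev/null 2>&1; then
    echo "Graphviz 'dot' not found; install Graphviz to render $dot_file" >&2
    return 127
  fi
  dot -T"${out_file##*.}" "$dot_file" -o "$out_file"
}

# Usage, after saving the graph as shown above:
# render_provenance graph.dot graph.pdf
```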

📄 view

This program prints the content of an artifact data file to the terminal. If the artifact contains a single file, that file is printed. Otherwise, the user can specify one or more files to print; if none is specified, a list of the available files is printed.

# Example: count the number of representative sequences
qax view rep-seqs.qza | grep -c '>'
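
Because the data is written to standard output, it can be piped into any other tool. As a slightly richer variant of the count above, the sketch below also reports the total number of bases. The function name fasta_stats is made up for this example, and the demo runs on a tiny invented FASTA rather than on a real artifact.

```shell
# Summarise FASTA records read from standard input:
# prints the number of sequences and the total number of bases.
fasta_stats() {
  awk '
    /^>/ { n++; next }                                     # header: one more sequence
         { gsub(/[[:space:]]/, ""); bases += length($0) }  # sequence line: add its length
    END  { printf "sequences=%d bases=%d\n", n, bases }
  '
}

# Demo on a made-up FASTA (two records, the first wrapped over two lines)
printf '>seq1\nACGT\nAC\n>seq2\nGGG\n' | fasta_stats   # prints sequences=2 bases=9

# Usage with an artifact (requires qax):
# qax view rep-seqs.qza | fasta_stats
```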

📄 make

This program creates a visualization artifact from a directory containing a website (index.html must be present).

qax make -o report.qza /path/to/report_dir/
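
The only stated requirement for the input directory is an index.html file. The sketch below prepares a minimal directory of that kind. The function name make_report_dir and the page content are invented for this example, and the qax call itself is left as a comment, copied from the command above.

```shell
# Create a minimal website directory containing a single index.html.
# $1: directory to create; $2: title shown on the page.
# NOTE: make_report_dir is a hypothetical helper, not part of qax.
make_report_dir() {
  dir="$1"
  title="$2"
  mkdir -p "$dir"
  cat > "$dir/index.html" <<EOF
<!DOCTYPE html>
<html>
  <head><meta charset="utf-8"><title>$title</title></head>
  <body><h1>$title</h1><p>Generated on $(date +%Y-%m-%d).</p></body>
</html>
EOF
}

# Demo: build the directory in a temporary location
report_dir="$(mktemp -d)/report_dir"
make_report_dir "$report_dir" "Run summary"
ls "$report_dir"   # prints index.html

# Then package it (requires qax):
# qax make -o report.qza "$report_dir"
```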

License: GNU General Public License v3.0


Languages: Nim 93.8%, CSS 4.2%, Shell 1.0%, HTML 0.9%