📦 Qiime2 Artifact eXtractor

📖 Introduction

Qiime2 is one of the most popular software packages for analyzing the output of metabarcoding experiments, and it introduced a data format that is unique in bioinformatics: the “Qiime2 artifact”.

Qiime2 artifacts are structured compressed archives containing a dataset (e.g., FASTQ reads, representative sequences in FASTA format, a phylogenetic tree in Newick format, etc.) and an exhaustive set of metadata (including the command that generated it, information on the execution environment, citations for the software used, and all the metadata of the artifacts used to produce it).
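Because an artifact is a ZIP archive, its layout can be inspected with standard tools. The sketch below assumes the usual Qiime2 layout, in which every entry sits under a top-level directory named after the artifact UUID (with metadata.yaml, data/ and provenance/ inside). The helper name artifact_root is hypothetical, and it only parses a file listing such as the one printed by unzip -Z1.

```shell
# Hypothetical helper: read an archive listing (one path per line) on stdin
# and print the distinct top-level directory names.
# For a well-formed artifact this is expected to be a single UUID.
artifact_root() {
  cut -d/ -f1 | sort -u
}

# Usage (assumes unzip is installed and rep-seqs.qza exists):
#   unzip -Z1 rep-seqs.qza | artifact_root
```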

While artifacts can improve the shareability and reproducibility of Qiime2 workflows, they are harder to integrate with general bioinformatics pipelines, and even accessing the metadata in an artifact requires a full Qiime2 installation (not to mention that every release of Qiime2 will produce incompatible artifacts). Qiime Artifact eXtractor (qax) makes it easy to work with Qiime2 artifacts from the command line, without needing the full Qiime2 environment installed.

Citation

If you use this tool, please cite:

Telatin A (2021) Qiime Artifact eXtractor (qax): A Fast and Versatile Tool to Interact with Qiime2 Archives. BioTech 10: 5. Available: http://dx.doi.org/10.3390/biotech10010005

💾 Download and installation

Pre-compiled binaries are the fastest and easiest way to get qax. The commands below download the latest version; otherwise, check the stable releases.

# From Linux
wget "https://github.com/telatin/qax/raw/main/bin/qax"
chmod +x qax

# From macOS
wget -O qax "https://github.com/telatin/qax/raw/main/bin/qax_mac"
chmod +x qax
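The two downloads above differ only in the URL, so a script that has to run on both platforms could pick the URL from the kernel name. This is a minimal sketch: the helper name qax_download_url is hypothetical, and it assumes that uname -s reports Linux or Darwin.

```shell
# Hypothetical helper: print the download URL for the given kernel name
# (as reported by `uname -s`). The URLs are the two listed above.
qax_download_url() {
  case "$1" in
    Linux)  echo "https://github.com/telatin/qax/raw/main/bin/qax" ;;
    Darwin) echo "https://github.com/telatin/qax/raw/main/bin/qax_mac" ;;
    *)      echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   wget -O qax "$(qax_download_url "$(uname -s)")" && chmod +x qax
```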

Alternatively, you can install qax from BioConda, if you have conda installed:

conda install -c conda-forge -c bioconda qax

📖 Usage

qax has six subprograms (general syntax is qax [program] [program-arguments]):

list (default): list artifact(s) properties
citations: extract citations in BibTeX format
extract: extract artifact data files
provenance: describe artifact provenance, or generate its graph
view: print the content of an artifact (e.g., dna-sequences.fasta) to the terminal
make: create a visualization artifact from a folder containing a website

📄 list

See qax list full documentation

This is the default module, and can be used to list the properties of one or more artifacts.

Some features:

Supports multiple files at once
100 times faster than Qiime2
Can be used to find an artifact given its ID

Example:

qax -b -u input/*.*
┌───────────────────────────┬────────────────┬─────────────────────────┬─────────────────────────────┐
│ ID                        │ Basename       │ Type                    │ Format                      │
├───────────────────────────┼────────────────┼─────────────────────────┼─────────────────────────────┤
│ bb1b2e93-...-2afa2110b5fb │ rep-seqs.qza   │ FeatureData[Sequence]   │ DNASequencesDirectoryFormat │
│ 313a0cf3-...-befad4ebf2f3 │ table.qza      │ FeatureTable[Frequency] │ BIOMV210DirFmt              │
│ 35c32fe7-...-85ef27545f00 │ taxonomy.qzv   │ Visualization           │ HTML                        │
└───────────────────────────┴────────────────┴─────────────────────────┴─────────────────────────────┘
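One simple way to look up an artifact by ID is to filter the table with grep. The sketch below is not a built-in qax option: find_artifact is a hypothetical wrapper, and it assumes the ID (or the fragment searched for) is printed in the ID column as in the example above.

```shell
# Hypothetical helper: print the rows of the `qax list` table that contain
# the given ID or ID fragment. It filters the output with grep and does not
# rely on any qax option beyond the `list` subprogram itself.
find_artifact() {
  id="$1"; shift
  qax list "$@" | grep -F -- "$id"
}

# Usage:
#   find_artifact bb1b2e93 input/*.*
```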

📄 extract

See qax extract full documentation

This program extracts the content of an artifact. By default, if the artifact contains a single file, that file is extracted to the specified path. If it contains multiple files, a directory containing them is created instead.

Example:

# Extract representative sequences (will be called rep-seqs.fasta)
qax x -o ./ rep-seqs.qza

# Extract a visualization (a folder called "taxonomy" will be created)
qax x -o ./ taxonomy.qzv
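To extract several artifacts in one go, the command above can be wrapped in a loop. This is a sketch only: extract_all is a hypothetical helper, and creating the output directory beforehand is a precaution rather than a documented requirement of qax.

```shell
# Hypothetical helper: extract every artifact given on the command line
# into one output directory, stopping at the first failure.
extract_all() {
  outdir="$1"; shift
  mkdir -p "$outdir" || return 1
  for artifact in "$@"; do
    qax x -o "$outdir" "$artifact" || return 1
  done
}

# Usage:
#   extract_all extracted/ rep-seqs.qza taxonomy.qzv
```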

📄 citations

See qax citations full documentation

Each Qiime2 module provides the citations for the software and resources that it uses, storing them in BibTeX format inside the artifacts. The citations module extracts all the citations from a list of artifacts and removes the duplicates, which makes it possible to prepare the bibliography for a complete Qiime2 analysis.

Example:

qax c files/*.qza > bibliography.bib
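A quick sanity check on the resulting file is to count its entries. The sketch below assumes that every BibTeX entry starts on its own line with "@" (so any @comment or @string blocks would be counted too); count_bib_entries is a hypothetical helper.

```shell
# Hypothetical helper: count the entries in a BibTeX file by counting
# the lines that open an entry (lines starting with "@").
count_bib_entries() {
  grep -c '^@' "$1"
}

# Usage:
#   qax c files/*.qza > bibliography.bib
#   count_bib_entries bibliography.bib
```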

📄 provenance

See qax provenance full documentation

This program prints the provenance of an artifact, or produces a publication-grade graph of the provenance.

Example:

# To view a summary
qax p taxonomy.qzv

# To save the plot
qax p -o graph.dot taxonomy.qza
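The .dot extension suggests that the saved graph is in Graphviz DOT format; under that assumption it could be rendered to an image with the Graphviz dot command. The wrapper below is a sketch: render_provenance is a hypothetical name, and Graphviz must be installed separately.

```shell
# Hypothetical helper: save the provenance graph of an artifact and render
# it as a PNG. Assumes `qax p -o` writes Graphviz DOT and that the
# Graphviz `dot` command is available.
render_provenance() {
  artifact="$1"; image="$2"
  dotfile="${image%.*}.dot"          # e.g. graph.png -> graph.dot
  qax p -o "$dotfile" "$artifact" || return 1
  dot -Tpng "$dotfile" -o "$image"
}

# Usage:
#   render_provenance taxonomy.qza graph.png
```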

📄 view

See qax view full documentation

This program prints the content of an artifact data file to the terminal. If the artifact contains a single file, that file is printed. Otherwise, the user can specify one or more files to print; if none is specified, a list of the available files is printed instead.

# Example: count the number of representative sequences
qax view rep-seqs.qza | grep -c '>'
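Since the output is a plain stream, it can be piped into any text tool. As a further sketch, the hypothetical helper fasta_stats below reports both the number of sequences and the total number of bases, assuming the artifact data file is FASTA.

```shell
# Hypothetical helper: read FASTA on stdin and print
# "<number of sequences> <total number of bases>".
fasta_stats() {
  awk '/^>/ { n++; next } { bases += length($0) } END { print n + 0, bases + 0 }'
}

# Usage:
#   qax view rep-seqs.qza | fasta_stats
```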

📄 make

This program creates a visualization artifact from a folder containing a website (index.html must be present).

qax make -o report.qza /path/to/report_dir/
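Because index.html is required, a script may want to check for it before calling qax. A minimal sketch, where make_report is a hypothetical wrapper around the command above:

```shell
# Hypothetical helper: refuse to build the artifact unless the input
# directory contains index.html, then call `qax make`.
make_report() {
  out="$1"; dir="$2"
  if [ ! -f "$dir/index.html" ]; then
    echo "error: $dir/index.html not found" >&2
    return 1
  fi
  qax make -o "$out" "$dir"
}

# Usage:
#   make_report report.qza /path/to/report_dir/
```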
