chagaz / kover

Learn interpretable computational phenotyping models from k-merized genomic data

Home Page: http://aldro61.github.io/kover/

Kover is an out-of-core implementation of the Set Covering Machine algorithm that has been tailored for genomic biomarker discovery. It produces highly interpretable models of phenotypes. The models are rule-based and rely on the presence/absence of k-mers.
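To make the idea of rules based on k-mer presence/absence concrete, here is a minimal sketch in plain Python. It is not Kover's API: the helper names (`kmers`, `predict_conjunction`), the toy sequences, and the example model are all hypothetical, and the model is assumed to be a conjunction (logical AND) of rules.

```python
from typing import List, Set, Tuple

# A rule is a pair (k-mer, must_be_present). Hypothetical representation,
# chosen for illustration only.
Rule = Tuple[str, bool]


def kmers(sequence: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def predict_conjunction(genome_kmers: Set[str], rules: List[Rule]) -> int:
    """Predict 1 if every presence/absence rule holds for the genome, else 0."""
    return int(all((kmer in genome_kmers) == present for kmer, present in rules))


# Toy sequences standing in for genomes (assumption: real inputs are far larger).
genome_a = kmers("ACGTACGGA", 4)
genome_b = kmers("ACGTTTGGA", 4)

# Toy model: the phenotype is predicted when TACG is present AND TTTG is absent.
model = [("TACG", True), ("TTTG", False)]

print(predict_conjunction(genome_a, model))
print(predict_conjunction(genome_b, model))
```

Because the model is just a short list of k-mers and whether each must be present or absent, it can be read and checked by hand, which is what makes this kind of model interpretable.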

Introduction

Drouin, A., Giguère, S., Déraspe, M., Marchand, M., Tyers, M., Loo, V. G., Bourgault, A. M., Laviolette, F. & Corbeil, J. (2016). Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons. BMC Genomics, 17(1), 754. [PDF]

The identification of genomic biomarkers is a key step towards improving diagnostic tests and therapies. We present a new reference-free method for this task that relies on a k-mer representation of genomes and a machine learning algorithm that produces intelligible models. The method is computationally scalable and well-suited for whole genome sequencing studies. The method was validated by generating models that predict the antibiotic resistance of C. difficile, M. tuberculosis, P. aeruginosa and S. pneumoniae. We show that the obtained models are accurate and that they highlight biologically relevant biomarkers, while providing insight into the process of antibiotic resistance acquisition. Kover, an efficient implementation of our method, can readily scale to large genomic datasets. It is open-source and can be obtained from http://github.com/aldro61/kover.
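The learning step described above can be illustrated with a simplified, in-memory sketch of a greedy Set Covering Machine that builds a conjunction of k-mer rules. This is an illustration under stated assumptions, not Kover's out-of-core implementation: the function names, the trade-off parameter `p`, the `max_rules` limit, and the toy data are all hypothetical choices made for this example.

```python
from typing import List, Set, Tuple

Rule = Tuple[str, bool]  # (k-mer, True = must be present, False = must be absent)


def rule_holds(kmer_set: Set[str], rule: Rule) -> bool:
    """Check whether a presence/absence rule is satisfied by a genome's k-mers."""
    kmer, present = rule
    return (kmer in kmer_set) == present


def greedy_conjunction(genomes: List[Set[str]], labels: List[int],
                       p: float = 1.0, max_rules: int = 3) -> List[Rule]:
    """Greedily select rules whose conjunction separates positives from negatives.

    At each step, pick the rule that rejects the most remaining negative
    examples, penalised by p times the number of remaining positive examples
    it wrongly rejects. Examples rejected by the chosen rule are then removed.
    """
    candidates = sorted(set().union(*genomes))
    rules = [(k, present) for k in candidates for present in (True, False)]
    negatives = {i for i, y in enumerate(labels) if y == 0}
    positives = {i for i, y in enumerate(labels) if y == 1}
    model: List[Rule] = []

    while negatives and len(model) < max_rules:
        def utility(rule: Rule) -> float:
            covered = sum(not rule_holds(genomes[i], rule) for i in negatives)
            errors = sum(not rule_holds(genomes[i], rule) for i in positives)
            return covered - p * errors

        best = max(rules, key=utility)
        if utility(best) <= 0:
            break  # no remaining rule is worth adding
        model.append(best)
        # Keep only the examples that still satisfy every selected rule.
        negatives = {i for i in negatives if rule_holds(genomes[i], best)}
        positives = {i for i in positives if rule_holds(genomes[i], best)}

    return model


# Toy data (assumption): each genome is already a set of k-mers;
# label 1 = phenotype present (e.g. resistant), 0 = absent.
toy_genomes = [{"AAA", "CCC"}, {"AAA", "GGG"}, {"CCC"}, {"GGG"}]
toy_labels = [1, 1, 0, 0]

print(greedy_conjunction(toy_genomes, toy_labels))
```

In this toy dataset a single rule (presence of `AAA`) separates the two classes, so the greedy loop stops after one iteration. Real datasets contain millions of k-mers, which is why an out-of-core implementation is needed in practice.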

Video lecture:

Interpretable Models of Antibiotic Resistance with the Set Covering Machine Algorithm, Google, Cambridge, Massachusetts (February 2017) [ slides ]

Installation

For installation instructions, see: http://aldro61.github.io/kover/doc_installation.html

Example

For an example of use, see: http://aldro61.github.io/kover/doc_example.html

Documentation

The documentation can be found at: http://aldro61.github.io/kover/

Contact

If you need help using Kover or want to report a bug, please use Biostars.

About

License: GNU General Public License v3.0


Languages

Python 62.6%, C++ 34.4%, CMake 2.1%, Shell 0.9%