sklearn-porter
Transpile trained scikit-learn models to C, Java, JavaScript and others.
It is recommended for resource-constrained embedded systems and for critical applications where performance matters most.
Machine learning algorithms
Classification
The portable classifiers are listed in the following table:
Programming language | ||||||
Classifier | C | Java | JavaScript | Go | PHP | Ruby |
sklearn.svm.SVC | X | X | X | X | ||
sklearn.svm.NuSVC | X | X | X | X | ||
sklearn.svm.LinearSVC | X , X | X | X | X | X | X |
sklearn.tree.DecisionTreeClassifier | X | X | X | X | ||
sklearn.ensemble.RandomForestClassifier | X | X | X | |||
sklearn.ensemble.ExtraTreesClassifier | X | X | X | |||
sklearn.ensemble.AdaBoostClassifier | X | X | X | |||
sklearn.neighbors.KNeighborsClassifier | X | X | ||||
sklearn.neural_network.MLPClassifier | X | X | ||||
sklearn.naive_bayes.GaussianNB | X | |||||
sklearn.naive_bayes.BernoulliNB | X |
Installation
pip install sklearn-porter
If you want the latest bleeding-edge changes, you can install the module from the master (development) branch:
pip uninstall -y sklearn-porter
pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master
Usage
You can use the porter either as an imported module in your application or through the command-line interface.
Module
This example shows how you can port the decision tree model from the official user guide to Java:
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn_porter import Porter
# Load data and train a classifier:
X, y = load_iris(return_X_y=True)
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)
# Port the classifier:
result = Porter(language='java').port(clf)
print(result)
The transpiled result matches the official human-readable version of the model.
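To make the idea behind a ported decision tree concrete, here is a minimal sketch of the kind of lookup logic such code performs: walk from the root to a leaf by comparing one feature against a threshold at each node, then return the majority class stored at that leaf. The arrays below are hand-written, hypothetical values laid out in the style of scikit-learn's `tree_` attributes (`children_left`, `children_right`, `feature`, `threshold`, `value`). They are not taken from a fitted model, and the function is not the porter's actual output.

```python
# Hypothetical tree arrays, written by hand for illustration only.
# -1 marks a leaf in LEFT/RIGHT; -2 marks "no split" in FEATURE/THRESHOLD.
LEFT = [1, -1, 3, -1, -1]
RIGHT = [2, -1, 4, -1, -1]
FEATURE = [3, -2, 3, -2, -2]
THRESHOLD = [0.8, -2.0, 1.75, -2.0, -2.0]
# Per-node class counts; only the counts at leaf nodes are used below.
VALUE = [[50, 50, 50], [50, 0, 0], [0, 50, 50], [0, 49, 5], [0, 1, 45]]


def predict(features):
    """Walk from the root to a leaf and return the index of the majority class."""
    node = 0
    while LEFT[node] != -1:  # keep descending until a leaf is reached
        if features[FEATURE[node]] <= THRESHOLD[node]:
            node = LEFT[node]
        else:
            node = RIGHT[node]
    counts = VALUE[node]
    return counts.index(max(counts))
```

Because the whole model reduces to a few constant arrays and one loop, the same logic is straightforward to express in C, Java, or JavaScript without any runtime dependency on scikit-learn.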
Command-line interface
This example shows how you can port a model from the command line. First, save the model in the pickle format:
from sklearn import tree
from sklearn.datasets import load_iris
import joblib  # older scikit-learn releases bundled it as sklearn.externals.joblib
# Load data and train a classifier:
X, y = load_iris(return_X_y=True)
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)
# Save the classifier:
joblib.dump(clf, 'model.pkl')
After that the model can be ported by using the following command:
python -m sklearn_porter --input <pickle_file> [--output <destination_dir>] [--language {c,go,java,js,php,ruby}]
python -m sklearn_porter -i <pickle_file> [-o <destination_dir>] [-l {c,go,java,js,php,ruby}]
The following commands all produce the same result:
python -m sklearn_porter --input model.pkl --language java
python -m sklearn_porter -i model.pkl -l java
By changing the language parameter you can set the target programming language:
python -m sklearn_porter -i model.pkl -l c
python -m sklearn_porter -i model.pkl -l go
python -m sklearn_porter -i model.pkl -l java
python -m sklearn_porter -i model.pkl -l js
python -m sklearn_porter -i model.pkl -l php
python -m sklearn_porter -i model.pkl -l ruby
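The per-language commands above can also be assembled from Python, which is convenient when one model should be ported to several targets in a single run. The following is a minimal sketch that only builds the argument lists, using the `--input`, `--output` and `--language` options shown earlier. The helper names `build_port_command` and `build_all_commands` are hypothetical and not part of sklearn-porter.

```python
import sys

# Language identifiers accepted by the --language option shown above.
LANGUAGES = ("c", "go", "java", "js", "php", "ruby")


def build_port_command(pickle_file, language, output_dir=None):
    """Return the argument list for one `python -m sklearn_porter` call."""
    if language not in LANGUAGES:
        raise ValueError("unsupported language: %s" % language)
    command = [sys.executable, "-m", "sklearn_porter",
               "--input", pickle_file, "--language", language]
    if output_dir is not None:
        command += ["--output", output_dir]
    return command


def build_all_commands(pickle_file, output_dir=None):
    """Return one command per target language, in the order of LANGUAGES."""
    return [build_port_command(pickle_file, language, output_dir)
            for language in LANGUAGES]
```

Each returned list can be passed to `subprocess.check_call`. Not every classifier is portable to every language (see the table above), so individual calls may fail for unsupported combinations.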
Finally, either of the following commands displays all options:
python -m sklearn_porter --help
python -m sklearn_porter -h
Development
Environment
Install the required environment modules by executing the bash script sh_environment.sh or type:
conda config --add channels conda-forge
conda env create -n sklearn-porter python=2 -f environment.yml
Furthermore, you need to install Node.js (>=6), Java (>=1.6), PHP (>=7), Ruby (>=1.9.3) and GCC (>=4.2) for testing.
Testing
Run all tests by executing the bash script sh_tests.sh or type:
source activate sklearn-porter
python -m unittest discover -vp '*Test.py'
# N_RANDOM_TESTS=30 python -m unittest discover -vp '*Test.py'
source deactivate
The tests cover module functions as well as matching predictions of ported models.
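As an illustration of what a "matching predictions" check can look like, here is a minimal sketch of a test case that feeds random feature vectors to two prediction functions and asserts that they agree. Both functions are hypothetical stand-ins: in a real test, one would call the trained scikit-learn model and the other would run the ported code in its target language. Reading `N_RANDOM_TESTS` from the environment mirrors the commented command above; the default of 30 is an assumption.

```python
import os
import random
import unittest

# Number of random samples to compare; the default value is an assumption.
N_RANDOM_TESTS = int(os.environ.get("N_RANDOM_TESTS", 30))


def reference_predict(features):
    """Hypothetical stand-in for the prediction of the original model."""
    return 0 if features[3] <= 0.8 else 1


def ported_predict(features):
    """Hypothetical stand-in for the prediction of the ported model."""
    return 0 if features[3] <= 0.8 else 1


class PredictionMatchTest(unittest.TestCase):

    def test_random_features(self):
        rng = random.Random(0)  # fixed seed keeps the run reproducible
        for _ in range(N_RANDOM_TESTS):
            features = [rng.uniform(0.0, 10.0) for _ in range(4)]
            self.assertEqual(reference_predict(features),
                             ported_predict(features))
```

Saved in a file whose name ends in `Test.py`, a case like this would be picked up by the `unittest discover` pattern shown above.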
Questions?
Don't be shy and feel free to contact me on Twitter or Gitter.
License
The library is Open Source Software released under the MIT license.