nok / sklearn-porter

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.

(SVC export) N_VECTORS in the exported C code does not match the number of training samples

libo-wu opened this issue

I found that N_VECTORS in the exported C code is not the same as the number of training samples.

Method:
I used the sample code at https://github.com/nok/sklearn-porter/blob/stable/examples/estimator/classifier/SVC/c/basics.pct.ipynb
to export the C code.
I split the data into training and test sets (90%:10%) with the following code:

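from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the iris data set (150 samples, 4 features)
irisdata = load_iris()
X = irisdata.data
y = irisdata.target
# Hold out 10% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=True)
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)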

Output:
(135, 4) (135,)
(15, 4) (15,)

Then I trained the model:

from sklearn import svm

clf = svm.SVC(C=1.0, gamma=0.001, kernel='rbf', random_state=0)
clf.fit(X_train, y_train)

Finally, I exported the code:

from sklearn_porter import Porter

porter = Porter(clf, language='c')
output = porter.export()
print(output)

But I got:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>

#define N_FEATURES 4
#define N_CLASSES 3
#define N_VECTORS 132
#define N_ROWS 3
#define N_COEFFICIENTS 2
#define N_INTERCEPTS 3
#define KERNEL_TYPE 'r'
#define KERNEL_GAMMA 0.001
#define KERNEL_COEF 0.0
#define KERNEL_DEGREE 3

double vectors[132][4] = {{4.4, 3.2, 1.3, 0.2}, {5.4, 3.4, 1.5, 0.4}, {5.0, 3.2, 1.2, 0.2}, {5.0, 3.5, 1.3, 0.3}, {5.5, 4.2, 1.4, 0.2}, {5.1, 3.8, 1.5, 0.3}, {5.3, 3.7, 1.5, 0.2}, {5.2, 3.4, 1.4, 0.2}, {5.1, 3.5, 1.4, 0.3}, {5.7, 3.8, 1.7, 0.3}, {5.0, 3.6, 1.4, 0.2}, {4.8, 3.0, 1.4, 0.3}, {5.1, 3.4, 1.5, 0.2}, {5.5, 3.5, 1.3, 0.2}, {4.8, 3.4, 1.6, 0.2}, {4.8, 3.0, 1.4, 0.1}, {4.7, 3.2, 1.3, 0.2}, {4.6, 3.4, 1.4, 0.3}, {5.1, 3.8, 1.6, 0.2}, {5.4, 3.7, 1.5, 0.2}, {4.9, 3.1, 1.5, 0.2}, {5.2, 4.1, 1.5, 0.1}, {4.4, 3.0, 1.3, 0.2}, {5.2, 3.5, 1.5, 0.2}, {5.1, 3.3, 1.7, 0.5}, {4.9, 3.1, 1.5, 0.1}, {5.7, 4.4, 1.5, 0.4}, {4.5, 2.3, 1.3, 0.3}, {5.0, 3.4, 1.6, 0.4}, {5.0, 3.5, 1.6, 0.6}, ...
......

N_VECTORS is 132 instead of 135.

I tried other split ratios. Some examples:

Test ratio (test_size)   Training size   Exported N_VECTORS
0.5                      75              75
0.4                      90              89
0.3                      105             100
0.2                      120             113
0.1                      135             132
0.05                     142             141
0                        150             150
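
One way to narrow this down is to compare the training size with the number of support vectors the fitted model keeps. This is only a sketch and rests on an assumption not confirmed in this thread: that the exported N_VECTORS reflects the length of clf.support_vectors_ rather than the number of rows passed to fit. The seed random_state=1 is an arbitrary choice for this sketch, so the counts it prints may differ from the ones listed above.

```python
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

rows = []
for test_size in (0.5, 0.4, 0.3, 0.2, 0.1, 0.05):
    # Same kind of split as in the report; the seed is an assumption of this sketch
    X_train, _, y_train, _ = train_test_split(
        X, y, test_size=test_size, random_state=1
    )
    clf = svm.SVC(C=1.0, gamma=0.001, kernel='rbf', random_state=0)
    clf.fit(X_train, y_train)
    # support_vectors_ holds only the training rows the fitted SVC retained
    rows.append((test_size, X_train.shape[0], clf.support_vectors_.shape[0]))

for test_size, n_train, n_support in rows:
    print(test_size, n_train, n_support)
```

If the third column tracks the exported N_VECTORS for the same split, the difference from the training size would come from the fitted model itself and not from the export step.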