nok / sklearn-porter

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.

Export Matrix as Vector (SVM and maybe other Models)

gobber opened this issue · comments

First, I would like to thank the authors of the library; it is really useful.

Most Java linear algebra libraries (and probably those in other languages too) are based on 1D primitive arrays rather than 2D ones: mapping one to the other is easy, and algorithms over 1D arrays are simpler to write. One option is to create a new 1D array and copy the data over from the 2D array, but that extra copy is undesirable. I therefore suggest providing a way to save the data as a 1D primitive array (more specifically, a 1D column array). I have started doing this in a copy of the repository, but perhaps you could add it in a future release.
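To make the request concrete, here is a minimal sketch of the mapping, assuming "1D column array" means column-major order and that the matrix is rectangular. The class and method names (`ColumnMajor`, `flatten`, `get`) are hypothetical and not part of sklearn-porter; an exporter would ideally write this layout directly rather than copy at load time.

```java
final class ColumnMajor {
    // Copy a rectangular rows x cols matrix into a 1D array, column by column.
    // Element (i, j) ends up at index j * rows + i.
    static double[] flatten(double[][] m) {
        int rows = m.length;
        int cols = rows == 0 ? 0 : m[0].length;
        double[] flat = new double[rows * cols];
        for (int j = 0; j < cols; j++) {
            for (int i = 0; i < rows; i++) {
                flat[j * rows + i] = m[i][j];
            }
        }
        return flat;
    }

    // Read element (i, j) back from the flat column-major array.
    static double get(double[] flat, int rows, int i, int j) {
        return flat[j * rows + i];
    }
}
```

For example, `ColumnMajor.flatten(new double[][]{{1, 2, 3}, {4, 5, 6}})` stores the first column (1, 4) first, then (2, 5), then (3, 6).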

I also have an observation about the SVC template (it probably belongs somewhere else). When the saved model has two classes, I think the starts and ends arrays are redundant, because coefficients is an ordered array: all coefficients of class zero come before any coefficient of class one. This means you could change:

...
if (this.clf.nClasses == 2) {
    for (int i = 0; i < kernels.length; i++) {
        kernels[i] = -kernels[i];
    }
    double decision = 0.;
    for (int k = starts[1]; k < ends[1]; k++) {
        decision += kernels[k] * this.clf.coefficients[0][k];
    }
    for (int k = starts[0]; k < ends[0]; k++) {
        decision += kernels[k] * this.clf.coefficients[0][k];
    }
    decision += this.clf.intercepts[0];
    if (decision > 0) {
        return 0;
    }
    return 1;
}
...

to:

...
if (this.clf.nClasses == 2) {
    for (int i = 0; i < kernels.length; i++) {
        kernels[i] = -kernels[i];
    }
    double decision = 0.;
    for (int k = 0; k < this.clf.coefficients[0].length; k++) {
        decision += kernels[k] * this.clf.coefficients[0][k];
    }
    decision += this.clf.intercepts[0];
    if (decision > 0) {
        return 0;
    }
    return 1;
}
...
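The equivalence of the two versions can be checked with a small sketch. It assumes the ranges are contiguous, i.e. starts = {0, n0} and ends = {n0, n0 + n1}, where n0 and n1 are the per-class support vector counts; that layout is an assumption here, not something confirmed by the template. The class and method names are hypothetical, and the coefficient row is passed as a plain 1D array for brevity.

```java
final class BinaryDecisionCheck {
    // Two-loop form: sums the class-one range first, then the class-zero range.
    static double splitLoops(double[] kernels, double[] coefficients,
                             int[] starts, int[] ends, double intercept) {
        double decision = 0.;
        for (int k = starts[1]; k < ends[1]; k++) {
            decision += kernels[k] * coefficients[k];
        }
        for (int k = starts[0]; k < ends[0]; k++) {
            decision += kernels[k] * coefficients[k];
        }
        return decision + intercept;
    }

    // Single-loop form: sums over every coefficient in index order.
    static double singleLoop(double[] kernels, double[] coefficients,
                             double intercept) {
        double decision = 0.;
        for (int k = 0; k < coefficients.length; k++) {
            decision += kernels[k] * coefficients[k];
        }
        return decision + intercept;
    }
}
```

Both forms add up the same terms, only in a different order, so the results can differ in the last floating-point bits. That could matter only when the decision value is extremely close to zero.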

I guess you could also improve the case of more than two classes by merging the decisions, votes and amounts structures.

Best Regards,

Charles

Thanks @gobber. I ported the business logic from libsvm, which is written in C, to the other programming languages. I will evaluate your suggestions and, if they work out, add your improvements to all templates.