predicting a single sample
topepo opened this issue
Using version 0.2, I have issues when predicting on a new data set with a single row:
library(caret)
set.seed(1)
dat <- twoClassSim(101)
trn <- dat[1:100,]
tst <- dat[101,]
library(sparsediscrim)
mod <- hdrda(x = as.matrix(trn[, -ncol(trn)]), y = trn$Class)
predict(mod, newdata = as.matrix(tst[, -ncol(tst)]))
For comparison, predicting on five rows of the training set works:
predict(mod, newdata = as.matrix(trn[1:5, -ncol(tst)]))
$class
[1] Class1 Class1 Class1 Class1 Class2
Levels: Class1 Class2
$scores
Class1 Class2
1 9.539882 13.34303
2 15.849269 27.26078
3 22.623988 27.86927
4 19.998993 22.87425
5 26.780945 12.71985
$posterior
Class1 Class2
1 1.000000e+00 2.230046e-02
2 1.000000e+00 1.106739e-05
3 1.000000e+00 5.272328e-03
4 1.000000e+00 5.640160e-02
5 7.822473e-07 1.000000e+00
The single-row example throws the error "Error in which.min(scores) : (list) object cannot be coerced to type 'double'".
In other cases (data not available), it gives posteriors that don't sum to one, or results with more than one dimension:
Browse[2]> predict(modelFit, newdata)
$class
[1] Class1
Levels: Class1 Class2
$scores
Class1 Class2
2.345889 2.427533
$posterior
Class1 Class2
1.0000 0.9216
and:
predict(modelFit, newdata)
$class
[1] Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2 Class2
Levels: Class1 Class2
$scores
Class1 Class2
[1,] 2.427533 2.345889
[2,] 1.427533 1.345889
[3,] 1.427533 1.345889
[4,] 1.427533 1.345889
[5,] 1.427533 1.345889
[6,] 2.427533 2.345889
[7,] 1.427533 1.345889
[8,] 1.427533 1.345889
[9,] 1.427533 1.345889
[10,] 1.427533 1.345889
[11,] 2.427533 2.345889
[12,] 1.427533 1.345889
[13,] 1.427533 1.345889
[14,] 1.427533 1.345889
[15,] 1.427533 1.345889
[16,] 2.427533 2.345889
$posterior
Class1 Class2
[1,] 0.9216 1
[2,] 0.9216 1
[3,] 0.9216 1
[4,] 0.9216 1
[5,] 0.9216 1
[6,] 0.9216 1
[7,] 0.9216 1
[8,] 0.9216 1
[9,] 0.9216 1
[10,] 0.9216 1
[11,] 0.9216 1
[12,] 0.9216 1
[13,] 0.9216 1
[14,] 0.9216 1
[15,] 0.9216 1
[16,] 0.9216 1
Thanks,
Max
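On the single-row error above: a frequent cause of this kind of failure in R is dimension dropping, where subsetting a matrix down to one row silently returns a plain vector unless drop = FALSE is given. The thread does not confirm that this is what happens inside predict() here, so the following is only a base-R illustration with made-up data.

```r
# Toy matrix standing in for a predictor matrix (values are made up).
x <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))

one_row_dropped <- x[1, ]                # dimensions are dropped: a named vector
one_row_kept    <- x[1, , drop = FALSE]  # still a 1 x 2 matrix

is.matrix(one_row_dropped)  # FALSE
dim(one_row_kept)           # 1 2
```

Code that later calls row-wise helpers on the result tends to behave differently for the two forms, which is why single-sample inputs are worth testing separately.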
Thanks for letting me know, @topepo. Missed your issue somehow. I'll take a look right now.
In the latest version (0.2.2) on master, the error with predicting a single sample is no longer present. A couple of issues still remain:
- The posterior probabilities do not sum to 1.
- The class names are altered when predicting a single sample (e.g., Class1 becomes Class1.1).
I'm looking into both issues.
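The posteriors printed earlier in the thread are numerically consistent with exp(-(score - min(score))) per row, without a final division by the row sum (for example, exp(-(13.34303 - 9.539882)) is about 0.0223). That is an inference from the printed numbers, not a statement about the package source. Under that assumption, the normalization could be sketched as follows; scores_to_posterior is a hypothetical helper and not part of sparsediscrim.

```r
# Hypothetical helper (not part of sparsediscrim): turn a matrix of
# discriminant scores (lower = better, one row per sample) into
# posteriors that sum to 1 in every row.
scores_to_posterior <- function(scores) {
  scores <- as.matrix(scores)
  # Subtract the row minimum before exponentiating for numerical stability.
  shifted <- scores - apply(scores, 1, min)
  unnorm <- exp(-shifted)
  unnorm / rowSums(unnorm)
}

# Scores for the first two training rows, copied from the output above.
scores <- matrix(c(9.539882, 13.34303,
                   15.849269, 27.26078),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(NULL, c("Class1", "Class2")))

post <- scores_to_posterior(scores)
rowSums(post)  # each row sums to 1

# A single sample keeps its 1 x 2 shape and its column names.
single <- scores_to_posterior(scores[1, , drop = FALSE])
```

Keeping everything as a matrix (note the drop = FALSE) means the single-sample case goes through exactly the same code path as the multi-row case.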
Side note: the latest version of sparsediscrim on CRAN is 0.2. I'll update it on CRAN after the fix.
> library(caret)
> set.seed(1)
> dat <- twoClassSim(101)
> trn <- dat[1:100,]
> tst <- dat[101,]
> mod <- hdrda(x = as.matrix(trn[, -ncol(trn)]), y = trn$Class)
> predict(mod, newdata=trn[1, -ncol(tst)])
$class
[1] Class1
Levels: Class1 Class2
$scores
Class1.1 Class2.1
9.539882 13.343029
$posterior
Class1.1 Class2.1
1.00000000 0.02230046
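As a quick regression check for the two remaining symptoms, the single-row output above can be rebuilt by hand and tested. check_prediction is a hypothetical helper written for this illustration; it assumes only the list layout (class, scores, posterior) visible in the printed output.

```r
# Hypothetical helper: report which of the two symptoms a prediction shows.
check_prediction <- function(pred) {
  lv <- levels(pred$class)
  post <- pred$posterior
  # A single-row result may come back as a named vector rather than a matrix.
  post_names <- if (is.matrix(post)) colnames(post) else names(post)
  post_sums  <- if (is.matrix(post)) rowSums(post) else sum(post)
  c(names_match = identical(post_names, lv),
    sums_to_one = isTRUE(all.equal(unname(post_sums),
                                   rep(1, length(post_sums)))))
}

# Values copied from the single-row output above.
pred <- list(
  class     = factor("Class1", levels = c("Class1", "Class2")),
  scores    = c(Class1.1 = 9.539882, Class2.1 = 13.343029),
  posterior = c(Class1.1 = 1.00000000, Class2.1 = 0.02230046)
)

res <- check_prediction(pred)  # both checks are FALSE for this output
```

Once a fix is in place, the same check on a fresh single-row prediction should return TRUE for both entries.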
Thanks for reporting the issue, @topepo. Resolved.
I'll push to CRAN soon.
Thanks for the fix.
Max