sujit-sahu / bmstdr

This is the repository for the R package bmstdr.

Home Page:https://www.sujitsahu.com

Error in predict function

Hallo951 opened this issue · comments

Calling predict() on a model fitted with bmstdr fails. The error is:

"Error in paste(call.f, sep = "")[[3]] : subscript out of bounds"

Here is an example:

library(bmstdr)
library(spTimer)
library(stringr)  # provides str_split_fixed()

set.seed(11)
s <- sort(sample(unique(nysptime$s.index), size = floor((length(unique(nysptime$s.index))/100)*20)))
DataFit <- spT.subset(data = nysptime, var.name = "s.index", s = s, reverse = T) 
DataValPred <- spT.subset(data = nysptime, var.name = "s.index", s = s, reverse = F) 

mod <- Bsptime(package = "spTimer", 
                 model = "GPP", 
                 formula = as.formula(y8hrmax ~ xmaxtemp + xwdsp + xrh), 
                 data = DataFit, 
                 n.report = 5, 
                 coordtype = "utm", 
                 coords = 4:5, 
                 scale.transform = "NONE", 
                 g_size = 4,
                 N = 2000, 
                 mchoice = F,
                 plotit = F)

# Spatial prediction using spT.Gibbs output
pred.gp <- predict(mod$fit, tol.dist=0.0, newdata = DataValPred, newcoords = ~ Longitude + Latitude)

# result: Error in paste(call.f, sep = "")[[3]] : subscript out of bounds

A temporary fix is:
mod$fit$call <- c(mod$fit$call,"", str_split_fixed(mod$fit$call, " ", n = 3)[3])

The problem is that the predict function cannot split the regression formula into its dependent and independent variables; the fix above does this split manually. The failing line (line 674) is in the file spGPP.r of the spTimer package on GitHub (https://github.com/cran/spTimer/blob/master/R/spGPP.r), in the function "spGPP.prediction". The line is call.f<-as.formula(paste("tmp~",paste(call.f,sep="")[[3]])).
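
To make the failing line easier to follow, here is a minimal sketch in base R of what it relies on. It assumes that no attached package changes how a formula is converted to character. The variable names `parts`, `new.f`, `one.string` and `res` are only illustrative, and the single-string case is simulated by hand rather than produced by any particular package.

```r
# Base R only; no extra packages attached.
call.f <- y8hrmax ~ xmaxtemp + xwdsp + xrh

# paste() converts the formula to character first. For a two-sided formula
# this gives three strings: the operator, the left-hand side, the right-hand side.
parts <- paste(call.f, sep = "")
print(parts)

# The spTimer line takes element 3 (the right-hand side) and
# builds a new formula with "tmp" as the response.
new.f <- as.formula(paste("tmp~", parts[[3]]))
print(new.f)

# Simulated failure: if the conversion returned only ONE string,
# there would be no element 3 to extract.
one.string <- "y8hrmax ~ xmaxtemp + xwdsp + xrh"
res <- tryCatch(one.string[[3]], error = function(e) conditionMessage(e))
print(res)
```

The last step reproduces the "subscript out of bounds" message with an ordinary character vector, which suggests that the error appears whenever `paste(call.f)` returns fewer than three elements.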

It is very important that you fix this in the program code, because this error occurs in all models. My temporary fix is only a stopgap.

Sujit said:

Thanks for this. But sorry, I have not been able to reproduce the error. I checked on both my Linux and Windows machines. I will ask some of my students to double-check this. Could you please give me code that results in the error?

My system:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Version bmstdr: 0.2.2

That's very strange.

I have tried your package on different computers and the error occurs on all of them. The line that fails for me is the following:

call.f<-as.formula(paste("tmp~",paste(call.f,sep="")[[3]]))

The error is:
Error in paste(call.f, sep = "")[[3]] : subscript out of bounds

Here call.f is the model formula. Does the following work for you?

call.f <- as.formula(y8hrmax ~ xmaxtemp + xwdsp + xrh)

call.f<-as.formula(paste("tmp~",paste(call.f,sep="")[[3]]))

After all, this line does nothing but replace "y8hrmax" with "tmp". And exactly this causes the error for me, no matter which computer I test it on...

Do you have an additional package installed or something?
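
For comparison, the same replacement of the response variable can be sketched without going through paste() at all. This is only an illustration using base R and the stats package, not the code spTimer uses; the names `f1` and `f2` are made up for the example.

```r
call.f <- y8hrmax ~ xmaxtemp + xwdsp + xrh

# Option 1: update() replaces the response; "." keeps the existing right-hand side.
f1 <- update(call.f, tmp ~ .)

# Option 2: overwrite the left-hand side of the formula object directly.
f2 <- call.f
f2[[2]] <- as.name("tmp")

print(f1)
print(f2)
```

Neither option depends on the character representation of the formula, so they should not be affected by how paste() treats formula objects.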

Thank you. But the file spGPP.r is not within bmstdr package. It belongs to spTimer. Please contact the spTimer maintainer.
However, there is an easier solution which worked for me. Just do not attach the library string. I believe that is messing up some function used in the spTimer code.

Please see below the code and the output I produced without error on my Windows computer.

Restarting R session...

Restart R

rm(list=ls())
library(bmstdr)
Loading required package: Rcpp
Registered S3 method overwritten by 'GGally':
method from
+.gg ggplot2
Warning messages:
1: package ‘bmstdr’ was built under R version 4.1.3
2: package ‘Rcpp’ was built under R version 4.1.3
library(spTimer)

spTimer version: 3.3.1

set.seed(11)
s <- sort(sample(unique(nysptime$s.index), size = floor((length(unique(nysptime$s.index))/100)*20)))
DataFit <- spT.subset(data = nysptime, var.name = "s.index", s = s, reverse = T)
DataValPred <- spT.subset(data = nysptime, var.name = "s.index", s = s, reverse = F)
mod <- Bsptime(package = "spTimer",
               model = "GPP",
               formula = as.formula(y8hrmax ~ xmaxtemp + xwdsp + xrh),
               data = DataFit,
               n.report = 5,
               coordtype = "utm",
               coords = 4:5,
               scale.transform = "NONE",
               g_size = 4,
               N = 2000,
               mchoice = F,
               plotit = F)

Output: GPP approximation models

Sampled: 400 of 2000, 20.00%.
Batch Acceptance Rate (phi): 47.62%
Checking Parameters:
phi: 0.0082, rho: 0.2140, sig2eps: 24.8181, sig2eta: 213.0410
beta[1]: -18.3156 beta[2]: 2.3864 beta[3]: 0.6344 beta[4]: -0.5185


Sampled: 800 of 2000, 40.00%.
Batch Acceptance Rate (phi): 46.18%
Checking Parameters:
phi: 0.0078, rho: 0.2410, sig2eps: 24.8995, sig2eta: 177.1832
beta[1]: -5.5031 beta[2]: 1.9212 beta[3]: 1.1077 beta[4]: -0.2455


Sampled: 1200 of 2000, 60.00%.
Batch Acceptance Rate (phi): 44.37%
Checking Parameters:
phi: 0.0088, rho: 0.3057, sig2eps: 26.5538, sig2eta: 178.2912
beta[1]: -8.7637 beta[2]: 1.9623 beta[3]: 1.2766 beta[4]: -0.8588


Sampled: 1600 of 2000, 80.00%.
Batch Acceptance Rate (phi): 44.97%
Checking Parameters:
phi: 0.0102, rho: 0.2496, sig2eps: 25.4951, sig2eta: 180.1180
beta[1]: -23.6339 beta[2]: 2.3384 beta[3]: 1.1674 beta[4]: 0.6791


Sampled: 2000 of 2000, 100.00%.
Batch Acceptance Rate (phi): 44.22%
Checking Parameters:
phi: 0.0090, rho: 0.1954, sig2eps: 26.7600, sig2eta: 194.2956
beta[1]: -19.1704 beta[2]: 2.3443 beta[3]: 0.7325 beta[4]: 0.2659

nBurn = 1000 . Iterations = 2000 .

Acceptance rate: (phi) = 44.2 %

Elapsed time: 1.42 Sec.

Model: GPP

Total time taken:: 1.46 - Sec.

# Spatial prediction using spT.Gibbs output

pred.gp <- predict(mod$fit, tol.dist=0.0, newdata = DataValPred, newcoords = ~ Longitude + Latitude)

Prediction: GPP models

Sampled: 100 of 1000, 10.00%


Sampled: 200 of 1000, 20.00%


Sampled: 300 of 1000, 30.00%


Sampled: 400 of 1000, 40.00%


Sampled: 500 of 1000, 50.00%


Sampled: 600 of 1000, 60.00%


Sampled: 700 of 1000, 70.00%


Sampled: 800 of 1000, 80.00%


Sampled: 900 of 1000, 90.00%


Sampled: 1000 of 1000, 100.00%

Predicted samples and summary statistics are given.

nBurn = 1000 . Iterations = 2000 .

Elapsed time: 0.56 Sec.

I have found the cause of the error. Your tip about the string package helped a lot. I removed the packages one at a time and so was able to identify the one that causes the error. It was the package "formula.tools", which overwrites the base paste() function. If I remove this package and restart R, the prediction works without error.

Here is the code again:

    rm(list=ls())
    library(bmstdr)
    library(spTimer)

    set.seed(11)
    s <- sort(sample(unique(nysptime$s.index), size = floor((length(unique(nysptime$s.index))/100)*20)))
    DataFit <- spT.subset(data = nysptime, var.name = "s.index", s = s, reverse = T)
    DataValPred <- spT.subset(data = nysptime, var.name = "s.index", s = s, reverse = F)
    mod <- Bsptime(package = "spTimer",
               model = "GPP", 
               formula = as.formula(y8hrmax ~ xmaxtemp + xwdsp + xrh), 
               data = DataFit, 
               n.report = 5, 
               coordtype = "utm", 
               coords = 4:5, 
               scale.transform = "NONE", 
               g_size = 4,
               N = 2000, 
               mchoice = F,
               plotit = F)

    # without error
    pred.gp <- predict(mod$fit, tol.dist=0.0, newdata = DataValPred, newcoords = ~ Longitude + Latitude)

    # with error if the package "formula.tools" is loaded
    if(!require(formula.tools)){install.packages("formula.tools");library(formula.tools)}
    pred.gp <- predict(mod$fit, tol.dist=0.0, newdata = DataValPred, newcoords = ~ Longitude + Latitude)
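
Removing packages one at a time works, but a quicker check might look like the sketch below. It uses only base R and utils functions. It rests on an assumption not confirmed in this thread: that a package can interfere either by masking paste() or by registering an as.character() method for formulas. The names `masked`, `paste.where`, `m` and `status` are only illustrative.

```r
# Objects that exist in more than one place on the search path (masking).
masked <- conflicts(detail = TRUE)

# Where is paste() found? Only "package:base" means it is not masked.
paste.where <- find("paste")
print(paste.where)

# Has any loaded package registered an as.character() method for formulas?
# optional = TRUE returns NULL instead of an error when there is none.
m <- getS3method("as.character", "formula", optional = TRUE)
status <- if (is.null(m)) {
  "no as.character method for formulas registered"
} else {
  paste("method found in:", environmentName(environment(m)))
}
print(status)
```

Running this once in a fresh session and once after attaching the suspect package should show which of the two mechanisms, if either, is involved.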

Great, so that is one error sorted out :=)

Thanks for the help. I will write to the maintainer about the other issues.