sparklyr / sparklyr

R interface for Apache Spark

Home Page: https://spark.rstudio.com/

spark_install gives cryptic error

mhamilton723 opened this issue · comments

When using spark_install(version = "3.2.4", hadoop_version = "3.2") with sparklyr 1.8.1 in our build system we are encountering the issue:

Error in validVersions[with(validVersions, order(hadoop, decreasing = TRUE)),  : 
  subscript out of bounds
Calls: spark_install ... spark_install_find -> spark_versions -> lapply -> FUN

It seems others encounter this as well:
https://stackoverflow.com/questions/76523973/error-in-validversionswithvalidversions-orderhadoop-decreasing-true

This started roughly a week ago without any code changes on our side. Thanks so much for your help and for your work on this lovely library.

Hi, yes, that's fixed in the dev version. So it'll be fixed in the next CRAN version.

If you'd like to try it now, feel free to install sparklyr from Github:

devtools::install_github("sparklyr/sparklyr")
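If the dev version still errors, it may help to check what versions sparklyr can actually resolve before installing. A diagnostic sketch (both functions are part of sparklyr's public API, but their output depends on your machine and network):

```r
library(sparklyr)

# Spark versions sparklyr knows how to download
spark_available_versions()

# Spark/Hadoop builds already installed locally
spark_installed_versions()
```

If Spark 3.2.4 with Hadoop 3.2 does not appear in the available list, that would suggest the failure is in the version lookup rather than in the download itself.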

Just to echo: I have the same problem, but it is not solved by the dev version.

I get a slightly different error when running spark_install(version = "3.2.4", hadoop_version = "3.2")

Error in rbind(deparse.level, ...) : 
  invalid list argument: all variables should have the same length

I've also found quite a few niggles with the latest versions which, if nothing else, might affect the documentation. For example, I need to explicitly set spark_home in spark_connect, otherwise I get the same rbind(deparse.level) error as above. It means the "Getting Started" advice won't actually get many people started.
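For anyone blocked by this, here is a sketch of the explicit spark_home workaround described above. The path is only an example; point it at wherever spark_install placed Spark on your machine (or wherever you unpacked a manual download):

```r
library(sparklyr)

# Example path only: substitute your actual Spark installation directory
sc <- spark_connect(
  master     = "local",
  spark_home = "/opt/spark/spark-3.2.4-bin-hadoop3.2"
)

# ... use the connection ...

spark_disconnect(sc)
```

Passing spark_home directly bypasses the version-lookup code path that raises the errors reported in this thread.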