Posit Workbench keeps crashing when connecting to Spark in local mode
tweakyTweeter opened this issue
Posit Workbench keeps crashing when I try to connect to Spark in local mode via the sparklyr package. The expected behavior is a successful connection to a Spark instance. When I run spark_connect with the method = "test" option, I get an error from the as_tibble function, as shown below. I tried downgrading various packages (sparklyr, tibble, dplyr, etc.), but nothing seems to work. I would really appreciate any suggestions for diagnosing this issue, as I'm drawing a blank and couldn't find anything on Stack Overflow.
library(sparklyr)
#>
#> Attaching package: 'sparklyr'
#> The following object is masked from 'package:stats':
#>
#> filter
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.3 (2020-02-29)
#> os Ubuntu 18.04.6 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Etc/GMT
#> date 2023-08-30
#> pandoc 2.19.2 @ /usr/lib/rstudio-server/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.3)
#> base64enc 0.1-3 2015-07-28 [1] CRAN (R 3.6.3)
#> cachem 1.0.8 2023-05-01 [1] CRAN (R 3.6.3)
#> callr 3.7.3 2022-11-02 [1] CRAN (R 3.6.3)
#> cli 3.6.1 2023-03-23 [1] CRAN (R 3.6.3)
#> crayon 1.5.2 2022-09-29 [1] CRAN (R 3.6.3)
#> DBI 1.1.3 2022-06-18 [1] CRAN (R 3.6.3)
#> dbplyr 2.2.1 2022-06-27 [1] CRAN (R 3.6.3)
#> devtools 2.4.5 2022-10-11 [1] CRAN (R 3.6.3)
#> digest 0.6.33 2023-07-07 [1] CRAN (R 3.6.3)
#> dplyr 1.1.2 2023-04-20 [1] CRAN (R 3.6.3)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 3.6.3)
#> evaluate 0.21 2023-05-05 [1] CRAN (R 3.6.3)
#> fansi 1.0.4 2023-01-22 [1] CRAN (R 3.6.3)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 3.6.3)
#> fs 1.6.3 2023-07-20 [1] CRAN (R 3.6.3)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 3.6.3)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 3.6.3)
#> htmltools 0.5.6 2023-08-10 [1] CRAN (R 3.6.3)
#> htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 3.6.3)
#> httpuv 1.6.11 2023-05-11 [1] CRAN (R 3.6.3)
#> httr 1.4.7 2023-08-15 [1] CRAN (R 3.6.3)
#> jsonlite 1.8.4 2022-12-06 [1] CRAN (R 3.6.3)
#> knitr 1.43 2023-05-25 [1] CRAN (R 3.6.3)
#> later 1.3.1 2023-05-02 [1] CRAN (R 3.6.3)
#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 3.6.3)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 3.6.3)
#> memoise 2.0.1 2021-11-26 [1] CRAN (R 3.6.3)
#> mime 0.12 2021-09-28 [1] CRAN (R 3.6.3)
#> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 3.6.3)
#> pillar 1.9.0 2023-03-22 [1] CRAN (R 3.6.3)
#> pkgbuild 1.4.2 2023-06-26 [1] CRAN (R 3.6.3)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.3)
#> pkgload 1.3.2.1 2023-07-08 [1] CRAN (R 3.6.3)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.3)
#> processx 3.8.2 2023-06-30 [1] CRAN (R 3.6.3)
#> profvis 0.3.8 2023-05-02 [1] CRAN (R 3.6.3)
#> promises 1.2.1 2023-08-10 [1] CRAN (R 3.6.3)
#> ps 1.7.5 2023-04-18 [1] CRAN (R 3.6.3)
#> purrr 1.0.2 2023-08-10 [1] CRAN (R 3.6.3)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 3.6.3)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 3.6.3)
#> R.utils 2.12.2 2022-11-11 [1] CRAN (R 3.6.3)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 3.6.3)
#> Rcpp 1.0.11 2023-07-06 [1] CRAN (R 3.6.3)
#> remotes 2.4.2.1 2023-07-18 [1] CRAN (R 3.6.3)
#> reprex 2.0.2 2022-08-17 [1] CRAN (R 3.6.3)
#> rlang 1.1.1 2023-04-28 [1] CRAN (R 3.6.3)
#> rmarkdown 2.24 2023-08-14 [1] CRAN (R 3.6.3)
#> rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 3.6.3)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 3.6.3)
#> shiny 1.7.5 2023-08-12 [1] CRAN (R 3.6.3)
#> sparklyr * 1.8.2 2023-07-01 [1] CRAN (R 3.6.3)
#> stringi 1.7.12 2023-01-11 [1] CRAN (R 3.6.3)
#> stringr 1.5.0 2022-12-02 [1] CRAN (R 3.6.3)
#> tibble 3.2.1 2023-03-20 [1] CRAN (R 3.6.3)
#> tidyr 1.2.1 2022-09-08 [1] CRAN (R 3.6.3)
#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 3.6.3)
#> urlchecker 1.0.1 2021-11-30 [1] CRAN (R 3.6.3)
#> usethis 2.2.2 2023-07-06 [1] CRAN (R 3.6.3)
#> utf8 1.2.3 2023-01-31 [1] CRAN (R 3.6.3)
#> vctrs 0.6.3 2023-06-14 [1] CRAN (R 3.6.3)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 3.6.3)
#> xfun 0.40 2023-08-09 [1] CRAN (R 3.6.3)
#> xtable 1.8-4 2019-04-21 [1] CRAN (R 3.6.3)
#> yaml 2.3.7 2023-01-23 [1] CRAN (R 3.6.3)
#>
#> [1] /usr/local/lib/remote_cran_repo/r_shared_libraries/R3.6
#> [2] /usr/local/lib/h2o/h2o-3.14.0.6
#> [3] /usr/local/lib/h2o/h2o-3.16.0.2
#> [4] /usr/local/lib/h2o/h2o-3.20.0.2
#> [5] /usr/local/lib/R/3.6.3/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
sc <- sparklyr::spark_connect(master = "local", method = "test")
rlang::last_trace(drop = FALSE)
#> <error/tibble_error_column_scalar_type>
#> Error in `as_tibble()`:
#> ! All columns in a tibble must be vectors.
#> ✖ Column `list(master = "local[40]", config = list(spark.env.SPARK_LOCAL_IP.local = "127.0.0.1", sparklyr.connect.csv.embedded = "^1.*", spark.sql.legacy.utcTimestampFunc.enabled = TRUE,
#> sparklyr.connect.cores.local = 40, spark.sql.shuffle.partitions.local = 40, sparklyr.shell.name = "sparklyr", \`sparklyr.shell.driver-memory\` = "2g"), state = <environment>)` is a
#> `spark_connection/test_connection/DBIConnection` object.
#> ---
#> Backtrace:
#> ▆
#> 1. └─.rs.connectionListObjects("Spark", "local - ")
#> 2. └─connection$listObjects(...)
#> 3. └─sparklyr:::connection_list_tables(scon, includeType = TRUE)
#> 4. ├─base::sort(dbListTables(sc))
#> 5. ├─DBI::dbListTables(sc)
#> 6. └─sparklyr (local) dbListTables(sc)
#> 7. └─sparklyr (local) .local(conn, ...)
#> 8. └─sparklyr:::df_from_sql(conn, query)
#> 9. └─sparklyr:::df_from_sdf(sc, sdf)
#> 10. └─sparklyr::sdf_collect(sdf)
#> 11. └─sparklyr:::sdf_collect_static(object, impl, ...)
#> 12. └─sparklyr:::sdf_collect_data_frame(sdf, collected)
#> 13. ├─tibble::as_tibble(fixed, stringsAsFactors = FALSE, optional = TRUE)
#> 14. └─tibble:::as_tibble.list(fixed, stringsAsFactors = FALSE, optional = TRUE)
#> 15. └─tibble:::lst_to_tibble(x, .rows, .name_repair, col_lengths(x))
#> 16. └─tibble:::check_valid_cols(x, call = call)
#> 17. └─tibble:::abort_column_scalar_type(...)
#> 18. └─tibble:::tibble_abort(...)
#> 19. └─rlang::abort(x, class, ..., call = call, parent = parent, use_cli_format = TRUE)
#>
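One observation on the trace: it starts at .rs.connectionListObjects, which is the IDE's Connections pane listing tables, rather than inside spark_connect itself. A minimal sketch for testing that hypothesis follows. It assumes sparklyr announces new connections through the connectionObserver option (the mechanism described in the RStudio connections contract). The name with_no_connection_observer is a hypothetical helper, not part of sparklyr, and whether this avoids the error on this setup is untested.

```r
# Hypothetical helper: evaluate `code` while the IDE's connection observer
# is hidden, then restore the previous option value on exit.
# Assumption: sparklyr looks up getOption("connectionObserver") when a
# connection is opened and skips the Connections pane if it is NULL.
with_no_connection_observer <- function(code) {
  old <- options(connectionObserver = NULL)  # returns the previous value
  on.exit(options(old), add = TRUE)          # restore it when we leave
  force(code)                                # evaluate the lazy argument now
}
```

For example, sc <- with_no_connection_observer(sparklyr::spark_connect(master = "local", method = "test")) would open the connection without registering it in the pane, if the assumption above holds.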
Hi, what is the reason to use method = "test" in your use case? Wouldn't simply using spark_connect("local") be sufficient?
If I use spark_connect("local"), RStudio instantly crashes and no error logs are generated for me to debug the issue. So I was trying with method = "test".
Ok, what kind of error message is Workbench displaying?
It just crashes the session without any error messages. Let me try with the sparklyr.log.console option and check whether I can get any error messages.
Even with the options(sparklyr.log.console = TRUE) command, the R session instantly crashes.
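Since the session dies before anything is logged, one option is to run the connection in a separate R process, so that a crash cannot take the interactive session with it and the exit status and console output can be inspected afterwards. The sketch below uses only base R. The name run_isolated is a hypothetical helper, and it assumes Rscript is available under R.home("bin").

```r
# Hypothetical helper: run a string of R code in a fresh Rscript process
# and return its exit status together with the combined stdout/stderr.
run_isolated <- function(code) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # stdout = TRUE / stderr = TRUE capture the child's output as a character
  # vector; a non-zero exit code is attached as the "status" attribute.
  out <- suppressWarnings(
    system2(rscript, c("-e", shQuote(code)), stdout = TRUE, stderr = TRUE)
  )
  status <- attr(out, "status")
  list(
    status = if (is.null(status)) 0L else status,
    output = as.character(out)  # drop attributes, keep the lines
  )
}
```

A call such as run_isolated('library(sparklyr); options(sparklyr.log.console = TRUE); sc <- spark_connect(master = "local"); spark_disconnect(sc)') would then show how far the connection gets outside the IDE. If it succeeds there, that would point at the IDE integration rather than at Spark itself.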