sparklyr / sparklyr

R interface for Apache Spark

Home Page: https://spark.rstudio.com/

Warnings with binary files

gregleleu opened this issue

Any manipulation of binary files produces warnings:

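library(sparklyr)
library(dplyr)  # provides %>% and select()

spark_version <- "3.4.0"
sc <- spark_connect(master = "local", version = spark_version)

# Read the binary test files (path is relative to the sparklyr repository root)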
binary_sdf <- spark_read_binary(sc, 
  recursive_file_lookup = TRUE, 
  dir = "./tests/testthat/data/test_spark_read_binary_recursive_file_lookup", 
  name = "test")
binary_sdf %>% select(content)

# # Source: spark<?> [?? x 1]
#   content  
#   <list>   
# 1 <raw [4]>
# 2 <raw [4]>
# 3 <raw [4]>
# 4 <raw [4]>
# Warning message:
# In FUN(X[[i]], ...) : out-of-range values treated as 0 in coercion to raw
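
For context, the warning text itself is what base R emits when a number outside 0..255 is coerced to raw. The sketch below reproduces it without Spark. The input vector is hypothetical data, and the idea that signed byte values (for example -1) are what reaches the coercion here is an assumption, not something confirmed in this thread. The `to_unsigned()` helper is likewise only an illustration, not part of sparklyr.

```r
# Base R only; no Spark connection needed.
# A raw vector can only hold values in 0..255.
signed_bytes <- c(72L, 105L, -1L, 127L)  # hypothetical data (assumed signed bytes)

# Coerce, catching and printing the warning instead of letting it propagate.
coerced <- withCallingHandlers(
  as.raw(signed_bytes),
  warning = function(w) {
    message("caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(coerced)  # the out-of-range element (-1) is stored as 00

# Hypothetical helper: map values into 0..255 before coercing,
# so -1 becomes 255 (ff) and no warning is raised.
to_unsigned <- function(x) as.raw(x %% 256L)
print(to_unsigned(signed_bytes))
```

If something like this is the cause, the affected bytes would silently come back as 00, so the warning may indicate altered content and not just noise.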

Thanks @gregleleu! I'm seeing the same in 3.4 and 3.0, but only on my laptop. I don't see it in the CI tests on GitHub Actions. Are you testing on a Mac?

I'm on a Mac, yes, but it also shows up in the Sedona CI: https://github.com/apache/sedona/actions/runs/5603748794/jobs/10250817504 (section "run tests").
I don't get an error either when I run the sparklyr tests, but they don't call select.