MSSQL - pivot_wider - BIT - ERROR
FlorianSchwendinger opened this issue · comments
Using pivot_wider
on a tbl
connected to a MSSQL database gives an error, since MAX
is not allowed on BIT.
Versions
R version 4.4.1
dbplyr_2.5.0
dplyr_1.1.4
DBI_1.2.3
tidyverse_2.0.0
Microsoft SQL Server 2014 (SP3-CU4-GDR) (KB5029185) - 12.0.6449.1 (X64)
Example
library("DBI")
library("dbplyr")
library("tidyverse")
uid <- c(1L, 1L, 2L, 3L, 3L, 3L)
idx <- c(1L, 2L, 1L, 1L, 2L, 3L)
num <- rnorm(length(idx))
bit <- rnorm(length(idx)) > 0
df <- tibble(uid = uid, idx = idx, num = num, bit = bit)
df
so far everything works as expected. We connect to the server and the tbl
db <- dbConnect(odbc::odbc(),
driver = "{SQL Server}",
server = "",
database = "",
trusted_connection = "yes",
encoding = "latin1")
writeLines(unlist(dbGetQuery(db, "SELECT @@VERSION")))
#R> Microsoft SQL Server 2014 (SP3-CU4-GDR) (KB5029185) - 12.0.6449.1 (X64)
#R> Jul 27 2023 21:55:46
dbWriteTable(conn = db, name = "temp_test_wide", value = df, overwrite = TRUE)
dbf <- tbl(db, "temp_test_wide")
Now executing pivot_wider
gives an error on the bit column.
values_from <- setdiff(colnames(dbf), c("uid", "idx"))
pivot_wider(dbf, id_cols = "uid", names_from = "idx", values_from=all_of(values_from))
#R> Error in `collect()`:
#R> ! Failed to collect lazy table.
#R> Caused by error in `<fn>`:
#R> ! ODBC failed with error 42000 from [Microsoft][ODBC SQL Server Driver][SQL Server].
#R> ✖ Operand data type bit is invalid for max operator.
#R> • Statement(s) could not be prepared.
#R> • <SQL> 'SELECT TOP 11
#R> • "uid",
#R> • MAX(IIF("idx" = 1, "num", NULL)) AS "num_1",
#R> • MAX(IIF("idx" = 2, "num", NULL)) AS "num_2",
#R> • MAX(IIF("idx" = 3, "num", NULL)) AS "num_3",
#R> • MAX(IIF("idx" = 1, "bit", NULL)) AS "bit_1",
#R> • MAX(IIF("idx" = 2, "bit", NULL)) AS "bit_2",
#R> • MAX(IIF("idx" = 3, "bit", NULL)) AS "bit_3"
#R> • FROM "temp_test_wide"
#R> • GROUP BY "uid"'
#R> ℹ From nanodbc/nanodbc.cpp:1783.
#R> Run `rlang::last_trace()` to see where the error occurred.
Using show_query
shows the source of the problem, the SQL generated looks like
SELECT
"uid",
MAX(IIF("idx" = 1, "num", NULL)) AS "num_1",
MAX(IIF("idx" = 2, "num", NULL)) AS "num_2",
MAX(IIF("idx" = 3, "num", NULL)) AS "num_3",
MAX(IIF("idx" = 1, "bit", NULL)) AS "bit_1",
MAX(IIF("idx" = 2, "bit", NULL)) AS "bit_2",
MAX(IIF("idx" = 3, "bit", NULL)) AS "bit_3"
FROM "temp_test_wide"
GROUP BY "uid"
but should be
SELECT
"uid",
MAX(IIF("idx" = 1, "num", NULL)) AS "num_1",
MAX(IIF("idx" = 2, "num", NULL)) AS "num_2",
MAX(IIF("idx" = 3, "num", NULL)) AS "num_3",
CAST(MAX(CAST(IIF("idx" = 1, "bit", NULL) as INT)) AS BIT) AS "bit_1",
CAST(MAX(CAST(IIF("idx" = 2, "bit", NULL) as INT)) AS BIT) AS "bit_2",
CAST(MAX(CAST(IIF("idx" = 3, "bit", NULL) as INT)) AS BIT) AS "bit_3"
FROM "temp_test_wide"
GROUP BY "uid"
Looking at you code in backend-mssql.R
I see you have a mssql_bit_int_bit
function for MAX
.
However for pivot_wider
it seams to be missed.
Changing the argument values_fn
to
values_fn <- list(
num = ~ max(.x, na.rm = TRUE),
bit = ~ as.logical(max(as.integer(.x), na.rm = TRUE))
)
resolves the problem.
Hope this helps.