prestodb / RPresto

DBI-based adapter for Presto for the statistical programming language R.

as.double(x) and as.numeric(x) translate into 'CAST(x AS NUMERIC)' where NUMERIC isn't a Presto data type

danielsday opened this issue

I am using RPresto and dbplyr to connect to a Presto SQL server. I am trying to cast a value extracted from a JSON string into a double on the backend. I tried using as.double(x) and as.numeric(x) in the expression in question so that dbplyr and RPresto would convert it into a CAST(...) on the SQL side. Both conversion functions produce the appropriate syntax, but not the right data type on the Presto side. A simplified example:

> dbplyr::translate_sql(as.numeric(x))
<SQL> CAST("x" AS NUMERIC)
> dbplyr::translate_sql(as.double(x))
<SQL> CAST("x" AS NUMERIC)

Is this a fixed behavior of the system? Is there a more appropriate R conversion function to use (since the obvious ones aren't making the conversion that I expected given this database)? I can just query the tables, pull down the values from the tables as strings and then convert them into doubles on my end in R, but it would be nice if I could do this on the backend.

The as.numeric etc. functions are actually defined in dbplyr; we don't override them in RPresto: https://github.com/tidyverse/dbplyr/blob/master/R/translate-sql-base.r#L135-L138.

Technically, we support using as() as follows:

(
  RPresto::src_presto(...)
  %>% tbl(sql('select 1 as column'))
  %>% mutate(cast_column=as(column, 1.0))
  %>% show_query()
)

where the second argument is an R quantity of the type you would like to receive. However, I have been using this form for a while now and I'm not happy with it; I find it unintuitive. The alternative would be to allow a character value for the second argument that spells out the data type to cast to. That would leak Presto data types into R space, though. So I would not rely on that function staying as it is.
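To make the idea of "an R quantity of the type you would like to receive" concrete, here is a minimal base R sketch. The helper presto_type_for() and its type mapping are hypothetical and chosen for illustration only; this is not how RPresto implements as().

```r
# Hypothetical helper: pick a Presto type name from an example R value.
# The mapping below is an assumption for illustration, not RPresto's code.
presto_type_for <- function(example) {
  switch(
    typeof(example),
    double    = "DOUBLE",
    integer   = "INTEGER",
    character = "VARCHAR",
    logical   = "BOOLEAN",
    # Unnamed last argument is the default branch of switch().
    stop("no mapping for R type: ", typeof(example))
  )
}

presto_type_for(1.0)  # example value is a double
presto_type_for(1L)   # example value is an integer
```

A character second argument would skip this lookup entirely, at the cost of requiring Presto type names in R code.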

I will implement the correct as.* functions after we release 1.3.1 to CRAN. In the meantime, you can pass the raw SQL for casting as follows:

(
  RPresto::src_presto(...)
  %>% tbl(sql('select 1 as column'))
  %>% mutate(cast_column=sql('cast(column as double)'))
  %>% show_query()
)
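If you need the raw-SQL workaround in several places, a small wrapper can keep the cast strings consistent. This is only a sketch in base R: presto_cast() is a hypothetical helper, and its allow-list of type names is an assumption, not a complete list of Presto types.

```r
# Hypothetical helper: build a CAST expression as a plain string.
# The allow-list is illustrative only, not an exhaustive set of Presto types.
presto_cast <- function(column, type) {
  allowed <- c("DOUBLE", "BIGINT", "VARCHAR", "BOOLEAN")
  type <- toupper(type)
  if (!type %in% allowed) stop("unsupported type: ", type)
  # Quote the identifier, doubling any embedded double quotes.
  quoted <- paste0('"', gsub('"', '""', column, fixed = TRUE), '"')
  sprintf("CAST(%s AS %s)", quoted, type)
}

cat(presto_cast("x", "double"), "\n")
```

Inside a pipeline the string would still need to be wrapped in sql(), for example mutate(cast_column = sql(presto_cast('column', 'double'))).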