atorus-research / xportr

Tools to build CDISC compliant data sets and check for CDISC compliance.

Home Page: https://atorus-research.github.io/xportr/

Question: Should `metadata` be stored in output of `xportr_*`?

averissimo opened this issue · comments

Just as domain is currently stored under the "xportr.df_arg" attribute.

Take the example below, which shows the current behavior. My specific question is whether

  • xportr_type(.df, custom_metadata, domain = "adsl") |> xportr_order()

should work, or should we keep the current API, which requires repeating custom_metadata across calls unless it has been set with xportr_metadata():

  • xportr_type(.df, custom_metadata, domain = "adsl") |> xportr_order(custom_metadata)

Edit: I guess this boils down to how common it is to use multiple variables to hold "metadata" (and pass them to the xportr_* functions).

P.S. The same applies to verbose (given #151).

library(xportr)
custom_metadata <- data.frame(
  dataset = "adsl",
  variable = c("Subj", "Param", "Val", "NotUsed"),
  type = c("numeric", "character", "numeric", "character"),
  format = NA,
  order = 4:1
)

.df <- data.frame(
  Subj = as.character(123, 456, 789),
  Different = c("a", "b", "c"),
  Val = c("1", "2", "3"),
  Param = c("param1", "param2", "param3")
)

xportr_type(.df, custom_metadata, domain = "adsl") |>
  xportr_order()
#> 
#> ── Variable type mismatches found. ──
#> 
#> ✔ 2 variables coerced
#> Error in `xportr_order()`:
#> ! Metadata must be set with `metadata` or `xportr_metadata()`
#> Backtrace:
#>     ▆
#>  1. └─xportr::xportr_order(xportr_type(.df, custom_metadata, domain = "adsl"))
#>  2.   ├─metadata %||% attr(.df, "_xportr.df_custom_metadata") %||% ... at xportr/R/order.R:87:3
#>  3.   └─rlang::abort("Metadata must be set with `metadata` or `xportr_metadata()`") at xportr/R/order.R:87:3

xportr_metadata(.df, custom_metadata, domain = "adsl") |>
  xportr_type() |>
  xportr_order()
#> 
#> ── Variable type mismatches found. ──
#> 
#> ✔ 2 variables coerced
#> 
#> ── 1 variables not in spec and moved to end ──
#> 
#> ── 4 reordered in dataset ──
#> 
#>   Val  Param Subj Different
#> 1   1 param1  123         a
#> 2   2 param2  123         b
#> 3   3 param3  123         c

Created on 2023-12-15 with reprex v2.0.2

I think the idea behind xportr_metadata() was to reduce some of the repetition across our function calls, which partly addresses your concern.

It would be quite cool if the functions could take the inputs from the other functions. Can we do that? If so, it would sort of make xportr_metadata() obsolete.

@atorus-research/xportr-development-team

We are currently doing this with domain, and I wouldn't say it would make that function obsolete, just redundant for the sake of being explicit.

The concern is about inconsistent behavior: domain is invisibly passed forward, while metadata and verbose are not.

What do the rest of the folks think?

From an implementation perspective, I think this would be pretty simple: port the work that xportr_metadata() is doing into each of the other functions, so that they hold on to the argument values. I think it would be worth passing all the argument values on
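
To make the idea concrete, here is a minimal base-R sketch of what "holding on to the argument values" could look like. It is not xportr's actual implementation: `sketch_type()`, `sketch_order()`, `resolve_arg()`, `remember_args()` and the `"sketch.args"` attribute name are all hypothetical and exist only for illustration. Each function resolves its arguments (explicit value first, then whatever an earlier call stored), does its work, and re-attaches the resolved values before returning.

```r
# Hypothetical helper: fall back to `y` when `x` is NULL.
`%||%` <- function(x, y) if (is.null(x)) y else x

# Hypothetical attribute name, used only in this sketch.
ARGS_ATTR <- "sketch.args"

# Explicit value wins; otherwise reuse what an earlier call stored.
resolve_arg <- function(.df, name, value = NULL) {
  value %||% attr(.df, ARGS_ATTR)[[name]]
}

# Re-attach the resolved values so the next call in the pipe can see them.
remember_args <- function(.df, ...) {
  stored <- attr(.df, ARGS_ATTR) %||% list()
  new <- list(...)
  stored[names(new)] <- new
  attr(.df, ARGS_ATTR) <- stored
  .df
}

# Rows of the spec that apply to this domain (all rows if no domain is set).
spec_for <- function(metadata, domain) {
  if (is.null(metadata)) stop("Metadata must be set", call. = FALSE)
  if (is.null(domain)) metadata else metadata[metadata$dataset == domain, ]
}

# Stand-in for a type-coercing step.
sketch_type <- function(.df, metadata = NULL, domain = NULL) {
  metadata <- resolve_arg(.df, "metadata", metadata)
  domain <- resolve_arg(.df, "domain", domain)
  spec <- spec_for(metadata, domain)
  for (i in seq_len(nrow(spec))) {
    var <- spec$variable[i]
    if (!var %in% names(.df)) next
    .df[[var]] <- if (spec$type[i] == "numeric") {
      as.numeric(.df[[var]])
    } else {
      as.character(.df[[var]])
    }
  }
  remember_args(.df, metadata = metadata, domain = domain)
}

# Stand-in for a column-ordering step.
sketch_order <- function(.df, metadata = NULL, domain = NULL) {
  metadata <- resolve_arg(.df, "metadata", metadata)
  domain <- resolve_arg(.df, "domain", domain)
  spec <- spec_for(metadata, domain)
  spec <- spec[order(spec$order), ]
  in_spec <- intersect(spec$variable, names(.df))
  rest <- setdiff(names(.df), in_spec)
  # Column subsetting drops custom attributes, hence the re-attach below.
  .df <- .df[, c(in_spec, rest), drop = FALSE]
  remember_args(.df, metadata = metadata, domain = domain)
}

custom_metadata <- data.frame(
  dataset = "adsl",
  variable = c("Subj", "Param", "Val", "NotUsed"),
  type = c("numeric", "character", "numeric", "character"),
  order = 4:1,
  stringsAsFactors = FALSE
)

.df <- data.frame(
  Subj = c("123", "456", "789"),
  Different = c("a", "b", "c"),
  Val = c("1", "2", "3"),
  Param = c("param1", "param2", "param3"),
  stringsAsFactors = FALSE
)

# Metadata and domain are given once and picked up by the second call.
out <- sketch_type(.df, custom_metadata, domain = "adsl") |>
  sketch_order()
```

Because every function re-attaches what it resolved, the values survive steps that would otherwise drop custom attributes, and an explicit argument in a later call still overrides the stored one.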