Incorrect group names with multiple groups in frq()
jmbarajas opened this issue · comments
I'm trying to calculate frequency tables for a grouped tibble with two grouping variables that have two levels each. The group labels in the output of frq() are both wrong and incomplete. See the following example:
library(tibble)
library(dplyr)
library(sjmisc)
set.seed(1001)
df <- tibble(group1 = factor(c(rep(1, 50), rep(2, 50)),
                             labels = c("Group A", "Group B")),
             group2 = factor(rep(1:2, 50), labels = c("Group X", "Group Y")),
             values = factor(as.integer(runif(100, 1, 6)),
                             labels = c("Never", "Once per month",
                                        "Twice per month",
                                        "Once per week", "Once per day")))
df %>% group_by(group1, group2) %>% frq(values)
#>
#> Grouped by:
#> group1: Group A
#> group2: Group X
#>
#> # values <categorical>
#> # total N=25 valid N=25 mean=3.16 sd=1.46
#>
#> val frq raw.prc valid.prc cum.prc
#> Never 5 20 20 20
#> Once per month 3 12 12 32
#> Twice per month 6 24 24 56
#> Once per week 5 20 20 76
#> Once per day 6 24 24 100
#> <NA> 0 0 NA NA
#>
#> Grouped by:
#> group1: Group B
#> group2: Group Y
#>
#> # values <categorical>
#> # total N=25 valid N=25 mean=2.68 sd=1.52
#>
#> val frq raw.prc valid.prc cum.prc
#> Never 8 32 32 32
#> Once per month 5 20 20 52
#> Twice per month 3 12 12 64
#> Once per week 5 20 20 84
#> Once per day 4 16 16 100
#> <NA> 0 0 NA NA
#>
#> Grouped by:
#> group1: NA
#> group2: NA
#>
#> # values <categorical>
#> # total N=25 valid N=25 mean=2.96 sd=1.37
#>
#> val frq raw.prc valid.prc cum.prc
#> Never 5 20 20 20
#> Once per month 4 16 16 36
#> Twice per month 7 28 28 64
#> Once per week 5 20 20 84
#> Once per day 4 16 16 100
#> <NA> 0 0 NA NA
#>
#> Grouped by:
#> group1: NA
#> group2: NA
#>
#> # values <categorical>
#> # total N=25 valid N=25 mean=3.04 sd=1.43
#>
#> val frq raw.prc valid.prc cum.prc
#> Never 5 20 20 20
#> Once per month 4 16 16 36
#> Twice per month 6 24 24 60
#> Once per week 5 20 20 80
#> Once per day 5 20 20 100
#> <NA> 0 0 NA NA
Note that the grouping labels only ever pair the first levels of group1 and group2 or the second levels (i.e. the diagonal of a 2x2 crosstab); the remaining two tables are labeled NA. Comparing the output to xtabs shows that the second frequency table is also mislabeled: it is labeled Group B / Group Y, but it is actually the summary of Group A and Group Y.
xtabs(~values + group2 + group1, df)
#> , , group1 = Group A
#>
#> group2
#> values Group X Group Y
#> Never 5 8
#> Once per month 3 5
#> Twice per month 6 3
#> Once per week 5 5
#> Once per day 6 4
#>
#> , , group1 = Group B
#>
#> group2
#> values Group X Group Y
#> Never 5 5
#> Once per month 4 4
#> Twice per month 7 6
#> Once per week 5 5
#> Once per day 4 5
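As a further cross-check, the same counts can be produced with dplyr alone, so that each row carries its own group labels. This is only a minimal sketch: it rebuilds the df from the example above so it runs on its own, and the names counts and group_sizes are purely illustrative.

```r
library(tibble)
library(dplyr)

# Same data as in the example above
set.seed(1001)
df <- tibble(
  group1 = factor(c(rep(1, 50), rep(2, 50)), labels = c("Group A", "Group B")),
  group2 = factor(rep(1:2, 50), labels = c("Group X", "Group Y")),
  values = factor(as.integer(runif(100, 1, 6)),
                  labels = c("Never", "Once per month", "Twice per month",
                             "Once per week", "Once per day"))
)

# One row per (group1, group2, values) combination, labeled explicitly;
# .drop = FALSE keeps combinations that have a zero count
counts <- df %>% count(group1, group2, values, .drop = FALSE)

# Size of each of the four groups, to compare with "total N" in the frq() output
group_sizes <- df %>% count(group1, group2, name = "total")

print(counts, n = Inf)
print(group_sizes)
```

Because the labels come straight from the grouping columns, the rows for Group A / Group Y can be compared directly with the second table printed by frq().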
I'm using the latest CRAN release of sjmisc (v 2.7.9).
Oops: I see this was solved in the latest development build. Closing the issue.