strengejacke / sjmisc

Data transformation and utility functions for R

Home Page: https://strengejacke.github.io/sjmisc

sjmisc::rec() - Recode char to numeric when char contains spaces

konstruktur opened this issue

Hi!

I'd like to recode character vectors to numeric labelled variables via sjmisc::rec(), but I could not make it work for character vectors whose elements contain spaces. Did I miss something here, or is this not possible (yet)?

Thanks for your time!
Best, G

# R version 3.6.1 (2019-07-05)
library(sjmisc)  # v2.8.2
library(sjlabelled)  # v1.1.1
library(magrittr)  # provides the %T>% tee pipe used below


### Some data: Two character vectors in a data frame

x.char <- c("Category 1", "Category 2", "Category 3", NA)  # with spaces
y.char <- c("Category1", "Category2", "Category3", NA)     # w/o spaces

df <- data.frame(x.char, y.char, stringsAsFactors = FALSE)  # make a data frame
str(df)


### Now recoding to numeric labelled variables via sjmisc::rec()

### Without spaces (see y.char)? Works!
df %>% sjmisc::rec(y.char, suffix = "_r", as.num = TRUE,
            rec = "
            Category1 = 1 [Label 1]; 
            Category2 = 2 [Label 2]; 
            Category3 = 3 [Label 3];") %T>% str() %>% frq(y.char_r)


# With spaces (1): Does not work
df %>% sjmisc::rec(x.char, suffix = "_r", as.num = TRUE, 
            rec = "
            Category 1 = 1 [Label 1]; 
            Category 2 = 2 [Label 2]; 
            Category 3 = 3 [Label 3];") %T>% str() %>% frq(x.char_r)


# With spaces (2): in quotation marks? Does not work
df %>% sjmisc::rec(x.char, suffix = "_r", as.num = TRUE, 
            rec = "
            'Category 1' = 1 [Label 1]; 
            'Category 2' = 2 [Label 2]; 
            'Category 3' = 3 [Label 3];") %T>% str() %>% frq(x.char_r)


# With spaces (3): in backticks? Does not work
df %>% sjmisc::rec(x.char, suffix = "_r", as.num = TRUE, 
            rec = "
            `Category 1` = 1 [Label 1]; 
            `Category 2` = 2 [Label 2]; 
            `Category 3` = 3 [Label 3];") %T>% str() %>% frq(x.char_r)


# With spaces (4): in quotation marks but without Labels? Does not work
df %>% sjmisc::rec(x.char, suffix = "_r", as.num = TRUE, 
            rec = "
            'Category 1' = 1; 
            'Category 2' = 2; 
            'Category 3' = 3;") %T>% str() %>% frq(x.char_r)

Hi,

thanks for your effort, @iago-pssjd! I think it would be more explicit and clearer if we put strings in quotation marks. For example, I sometimes recode long open-ended answers to numbers, so a call could look like this:

df %>% sjmisc::rec(x.char, suffix = "_r", as.num = TRUE, 
            rec = "
            'I think we should ban SUV cars, since they are bad on fuel, dangerous for pedestrians,    and they take up way too much of the valuable city space...   but that won't be easy' = 1 [ban SUV]; 
            'If they stopped commercials for SUV Cars, they would sell less - I think they should not be advertised' = 2 [ban SUV commercials]; 
            'Further remarks in open questions' = 3 [Label 3];")

In this case, it would be confusing not to use quotation marks (which also give you syntax highlighting of the quoted text). At the moment I do this kind of recoding via dplyr::recode(), but it would be great to use sjmisc::rec() for this (where it's possible to attach a [label] in the same step). What do you think?

Thanks a lot for your time!
Best, G
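
For readers who need a stopgap, the dplyr::recode() route mentioned above might look roughly like the following. This is a minimal sketch, not the sjmisc way of doing it: it assumes dplyr and sjlabelled are installed, reuses x.char from the original report, and the variable name x.num is made up for illustration. The labels are attached in a second step, which is exactly the extra work that rec() would save.

```r
library(dplyr)       # recode()
library(sjlabelled)  # set_labels()

x.char <- c("Category 1", "Category 2", "Category 3", NA)

# Step 1: map the quoted strings (spaces included) to numbers.
# Values not listed, and NA, come out as NA.
x.num <- dplyr::recode(x.char,
                       "Category 1" = 1,
                       "Category 2" = 2,
                       "Category 3" = 3)

# Step 2: attach value labels in a separate call
# (one label per value in the range 1 to 3, in order).
x.num <- sjlabelled::set_labels(
  x.num,
  labels = c("Label 1", "Label 2", "Label 3")
)

str(x.num)
```

Because the old values are ordinary quoted argument names here, spaces and punctuation inside them are not a problem.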

The requirement that the input values be written without single quotes or backticks also makes it impossible to recode character variables that contain empty strings. For example:

test_vec = c("a","b","c","")

Using sjmisc::rec to recode the "" element is currently not possible.

Is this a bug or just not implemented yet?
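
Until that is settled, one possible workaround in base R is a plain lookup with match(), which handles the empty string like any other value. This is only a sketch: the names from, to and test_num, and the code 99 for the empty string, are made up for illustration and are not part of sjmisc.

```r
test_vec <- c("a", "b", "c", "")

# Hypothetical lookup table: `from` holds the original strings
# (including the empty string), `to` the numeric codes they map to.
from <- c("a", "b", "c", "")
to   <- c(1, 2, 3, 99)

# match() returns the position of each element of test_vec in `from`;
# indexing `to` with those positions yields the recoded values.
# Strings that are not in `from` become NA.
test_num <- to[match(test_vec, from)]
test_num
```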

Thanks a lot, @strengejacke, and sorry for my late response! I just tested with my code above and it works perfectly (version "With spaces (1)"). I'm still a bit wary of strings without " ", ' ' or ` ` though, but this works great! Thanks!