mattroumaya / doubleheadr

R package for cleaning column + sub-column/second row names from SurveyMonkey, or other similarly structured data. + helper functions for SurveyMonkey exports.

Home Page: https://mattroumaya.github.io/doubleheadr/

doubleheadr 🎭

This package provides helper functions for cleaning data exports from SurveyMonkey.

The Main Issue

SurveyMonkey exports split column-name information across two places: the header row holds the question text, and the first data row holds the sub-column labels. The two have to be combined and cleaned before you can start your analysis.
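As a rough illustration of that structure, here is a tiny hand-built data frame. The values are made up and this is not the demo data shipped with the package; it only mimics the shape of an export, where unnamed columns show up as "..." placeholders.

```r
# Hypothetical miniature of a SurveyMonkey-style export (values are invented).
raw <- data.frame(
  c(NA, "1001", "1002"),
  c("Name", "Ada", "Grace"),
  c("Company", "Analytical Engines", "Navy"),
  stringsAsFactors = FALSE
)
names(raw) <- c(
  "Respondent ID",
  "Please provide your contact information:",
  "...3"
)

# The header row carries the question text plus "..." placeholders.
colnames(raw)

# The first data row carries the sub-column labels.
unlist(raw[1, ], use.names = FALSE)
```

Real responses only begin on the second row, which is why the first row has to be folded into the column names and then dropped.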

Why doubleheadr?

  • Adopting clean_headr + trim_headr will (hopefully) make your workflow more efficient.

  • Quick and simple approach when working with downloaded or inherited .csv or .xlsx files, or when there are too many responses to pull via API. (Another highly recommended solution for Advantage and Premier-level users is pulling data via the SurveyMonkey API with the surveymonkey package.)

Lifecycle

This package is in the early stages of development. Issues of any kind are welcome. Please report them, and I will be happy to troubleshoot anything that comes up.

Overview

  • clean_headr concatenates each column name with the value in the first row, producing a single descriptive name per column.

  • trim_headr removes long, repeated strings from column names.

  • flag_mins flags respondents who did not meet a minimum duration, in minutes, to complete the survey.

Install doubleheadr

# install.packages("devtools")
# library(devtools)
devtools::install_github('mattroumaya/doubleheadr')

Usage

Your downloaded or inherited .csv/.xlsx file will look something like the demo included in doubleheadr.

No worries though! Cleaning your column names is as easy as 1,2… that’s it! Two steps.

Start with unhelpful column names:

colnames(doubleheadr::demo)
##  [1] "Respondent ID"                                
##  [2] "Please provide your contact information:"     
##  [3] "...3"                                         
##  [4] "...4"                                         
##  [5] "...5"                                         
##  [6] "...6"                                         
##  [7] "...7"                                         
##  [8] "...8"                                         
##  [9] "...9"                                         
## [10] "...10"                                        
## [11] "...11"                                        
## [12] "I wish it would have snowed more this winter."
1. clean_headr
  • After this step the column names are descriptive, but some are really long.

  • You can make them easier to read by setting clean_names = FALSE.

demo %>% 
  clean_headr(., "...") %>% 
  colnames(.)
##  [1] "respondent_id"                                          
##  [2] "please_provide_your_contact_information_name"           
##  [3] "please_provide_your_contact_information_company"        
##  [4] "please_provide_your_contact_information_address"        
##  [5] "please_provide_your_contact_information_address_2"      
##  [6] "please_provide_your_contact_information_city_town"      
##  [7] "please_provide_your_contact_information_state_province" 
##  [8] "please_provide_your_contact_information_zip_postal_code"
##  [9] "please_provide_your_contact_information_country"        
## [10] "please_provide_your_contact_information_email_address"  
## [11] "please_provide_your_contact_information_phone_number"   
## [12] "i_wish_it_would_have_snowed_more_this_winter_response"
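To make the concatenation step more concrete, the idea can be sketched in base R. This is only an approximation of the behaviour shown above, not the package's actual implementation, and the small `raw` data frame is invented for the example.

```r
# Base-R sketch of the idea only; NOT doubleheadr's actual implementation.
raw <- data.frame(
  c(NA, "1001", "1002"),
  c("Name", "Ada", "Grace"),
  c("Company", "Analytical Engines", "Navy"),
  stringsAsFactors = FALSE
)
names(raw) <- c(
  "Respondent ID",
  "Please provide your contact information:",
  "...3"
)

header <- names(raw)
sub    <- unlist(raw[1, ], use.names = FALSE)

# Carry the last real question forward over "..." placeholder names.
is_placeholder <- startsWith(header, "...")
for (i in seq_along(header)) {
  if (is_placeholder[i] && i > 1) header[i] <- header[i - 1]
}

# Append the sub-column label wherever the first row supplies one.
combined <- ifelse(is.na(sub), header, paste(header, sub))

# Convert to snake_case: non-alphanumeric runs become "_", then lowercase.
snake <- tolower(gsub("[^A-Za-z0-9]+", "_", combined))
snake <- gsub("^_|_$", "", snake)

# Drop the label row and apply the combined names.
cleaned <- raw[-1, , drop = FALSE]
names(cleaned) <- snake
names(cleaned)
```

The placeholder check mirrors the "..." value passed to clean_headr above; everything else (the loop, the regular expressions) is an assumption about how such a step could be written.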
2. trim_headr
demo %>% 
  clean_headr(., "...") %>% 
  trim_headr(., c("please_provide_your_contact_information_",
                  "i_wish_it_would_have_",
                  "_response")) %>% 
  colnames(.)
##  [1] "respondent_id"           "name"                   
##  [3] "company"                 "address"                
##  [5] "address_2"               "city_town"              
##  [7] "state_province"          "zip_postal_code"        
##  [9] "country"                 "email_address"          
## [11] "phone_number"            "snowed_more_this_winter"
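The trimming step amounts to deleting fixed substrings from the names. A minimal base-R sketch of that idea, using three of the names from the output above, might look like this; it is an illustration and not the code trim_headr runs internally.

```r
# Base-R sketch of stripping substrings from column names; not the package's own code.
nms <- c(
  "respondent_id",
  "please_provide_your_contact_information_name",
  "i_wish_it_would_have_snowed_more_this_winter_response"
)
to_trim <- c(
  "please_provide_your_contact_information_",
  "i_wish_it_would_have_",
  "_response"
)

# Remove each substring literally (fixed = TRUE avoids regex interpretation).
for (pattern in to_trim) {
  nms <- gsub(pattern, "", nms, fixed = TRUE)
}
nms
```

Matching literally rather than by regular expression keeps characters such as "." in a pattern from being treated as wildcards.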
flag_mins
  • Coming to the demo soon…
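Until the demo is available, the general idea of flagging fast respondents can be sketched in base R. Everything here is an assumption made for illustration: the column names `start_date` and `end_date`, the 5-minute threshold, and the `too_fast` flag are hypothetical, and flag_mins itself may take different arguments or return something different.

```r
# Hypothetical data and column names; flag_mins may work differently.
responses <- data.frame(
  respondent_id = c("1001", "1002", "1003"),
  start_date = as.POSIXct(
    c("2021-03-01 09:00:00", "2021-03-01 09:05:00", "2021-03-01 09:10:00"),
    tz = "UTC"
  ),
  end_date = as.POSIXct(
    c("2021-03-01 09:12:00", "2021-03-01 09:06:30", "2021-03-01 09:15:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

min_minutes <- 5  # assumed threshold

# Time spent on the survey, in minutes.
duration <- as.numeric(
  difftime(responses$end_date, responses$start_date, units = "mins")
)

# TRUE marks respondents who finished faster than the threshold.
responses$too_fast <- duration < min_minutes
responses[, c("respondent_id", "too_fast")]
```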

License:Other

