IALSA / HRS

Shaping data from the Health and Retirement Study.

Developing data documentation

andkov opened this issue · comments

  • ./manipulation/0-ellis-island.R reads in .sav files and saves them as .rds files. This step is necessary to speed up processing, as large SPSS files may take too much time to load.

We are using RAND files (a preprocessed version of the HRS source files). In their raw form, HRS files store each questionnaire in an individual file, and information is gathered at the household level. RAND restructures the files to organize data at the person level and joins most of the files from one wave into a single SPSS file.

The task of the 1-scale-assembly.R script is to select only the needed columns from the large files for each wave, and rename the variables of interest in a consistent and explanatory manner.

To accomplish this, we created a meta-file, renaming-rules.xlsx. Each tab contains instructions on how to rename the old variables, whether the values of an item need to be reverse-coded, and whether a variable is longitudinal.
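The select, rename, and reverse-code steps can be sketched as follows. This is a minimal illustration in Python (the project's own scripts are R), not the code in 1-scale-assembly.R. The rule fields `old_name`, `new_name`, `reverse`, `scale_min`, and `scale_max` are hypothetical stand-ins for the columns of a renaming-rules.xlsx tab, and reverse-coding is assumed to map a value v on a scale from min to max onto min + max - v.

```python
def apply_rules(rows, rules):
    """Keep only the columns named in the rules, rename them,
    and reverse-code the items flagged for it.

    rows  -- list of dicts, one per respondent (old variable name -> value)
    rules -- list of dicts with hypothetical keys:
             old_name, new_name, reverse, scale_min, scale_max
    """
    shaped = []
    for row in rows:
        out = {}
        for rule in rules:
            value = row.get(rule["old_name"])  # None if the column is absent
            if rule["reverse"] and value is not None:
                # Assumed convention: flip the value around the scale midpoint.
                value = rule["scale_min"] + rule["scale_max"] - value
            out[rule["new_name"]] = value
        shaped.append(out)
    return shaped


# Hypothetical rules for two items; the new names are made up for illustration.
rules = [
    {"old_name": "mlb001a", "new_name": "item_a_2010",
     "reverse": False, "scale_min": 1, "scale_max": 6},
    {"old_name": "mlb001b", "new_name": "item_b_2010",
     "reverse": True, "scale_min": 1, "scale_max": 6},
]

# One made-up respondent; "unused" is dropped because no rule names it.
rows = [{"mlb001a": 2, "mlb001b": 6, "unused": 99}]

shaped = apply_rules(rows, rules)
```

Keeping the rules in a table rather than in code means a new wave or a corrected name only requires editing the spreadsheet.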

In a meeting with @casslbrown:

In the renaming scheme, the original variable names encode the year and the questionnaire section.
For example: mlb001a

  • m represents year, 2010
  • lb represents section, Leave Behind
  • 001a is the item name. Item names tend to be the same across the years. Letters typically indicate that the items belong to the same questionnaire.
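The naming convention above can be unpacked programmatically. The sketch below is in Python for illustration (the project's scripts are R) and assumes that every name is one wave letter, then a section code made of letters, then an item name that starts with digits. The lookup tables contain only the two mappings mentioned in this thread; `parse_name`, `WAVE_YEARS`, and `SECTIONS` are hypothetical names.

```python
import re

# Only the mappings stated in the discussion are filled in; extend as needed.
WAVE_YEARS = {"m": 2010}
SECTIONS = {"lb": "Leave Behind"}

# Assumed layout: one wave letter, a section code of letters,
# then an item name that starts with digits and may end in letters.
NAME_PATTERN = re.compile(r"^(?P<wave>[a-z])(?P<section>[a-z]+)(?P<item>\d+[a-z]*)$")


def parse_name(name):
    """Split an original variable name into wave, section, and item."""
    match = NAME_PATTERN.match(name.lower())
    if match is None:
        raise ValueError(f"name does not follow the assumed pattern: {name!r}")
    wave = match.group("wave")
    section = match.group("section")
    return {
        "wave_letter": wave,
        "year": WAVE_YEARS.get(wave),      # None when the letter is not in the table
        "section_code": section,
        "section": SECTIONS.get(section),  # None when the code is not in the table
        "item": match.group("item"),
    }


parsed = parse_name("mlb001a")
```

Names that do not fit the assumed layout raise an error instead of being split incorrectly, which makes exceptions to the convention easy to spot.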

To review the items in the original context:

Note: RAND files do not have their own documentation; therefore, stick to the original codebooks.

HRS consists of multiple questionnaires. To view the scope of questions in each section and their temporal spread, visit Documentation -> Questionnaires.

Reports on particular topic areas are presented in User Guides. For example, the Leave Behind (LB) questionnaire is discussed in Psychosocial and Lifestyle Questionnaire 2006 - 2010: Documentation Report.