solmos / EXPLICA

Power calculations for EXPLICA project

Introduction

We are interested in comparing the levels of three biomarkers across expotype categories. I simulated data sets under different data-generating scenarios and fitted a linear regression model to each one. Power was estimated as the proportion of simulated data sets in which the model detected a statistically significant difference in at least one expotype category.

The following summary statistics were used for generating the simulated data sets:

| Biomarker | Units | Male mean (SD) | Female mean (SD) | Size | Reference |
|---|---|---|---|---|---|
| Homocysteine | µmol/L | 14.6 (6.1) | 13.1 (4.6) | 3,025 | |
| ApoB | mg/dL | 113.9 (31.0) | 107.0 (32.1) | 1,501 | |
| hs-CRP | mg/L | 3.19 (5.28) | 3.35 (5.37) | 5,072 | |

Simulated datasets

Simulated data were generated from the following linear model:

$$y_i \sim \text{Normal}(\mu_i, \sigma^2)$$

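$$\mu_i = \beta_0 + \beta_1 E_{1i} + \beta_2 E_{2i} + \ldots + \beta_{K-1} E_{K-1,i}$$

where $E_{ki}$ is an indicator variable that equals 1 if individual $i$ belongs to expotype category $k \in \{1, 2, \ldots, K - 1\}$ and 0 otherwise. $K$ is the total number of expotype categories; the remaining category is the reference, with mean $\beta_0$.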
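The data-generating model above can be sketched in a few lines of R. This is a minimal illustration, not the code in functions.R: the function name `simulate_dataset`, its arguments, and the choice to assign individuals to categories with equal probability are all assumptions made for the example. The values passed in the usage lines are taken from the homocysteine row (male mean and SD) purely for illustration.

```r
# Hypothetical helper: simulate one dataset from the linear model above.
#   n     : number of individuals
#   beta0 : mean in the reference expotype category
#   betas : differences from the reference for categories 1..K-1
#   sigma : residual standard deviation
simulate_dataset <- function(n, beta0, betas, sigma) {
  K <- length(betas) + 1
  # Assumption: each individual falls in any category with equal probability.
  # Category 0 is the reference; categories 1..K-1 get an indicator each.
  expotype <- factor(sample(0:(K - 1), n, replace = TRUE), levels = 0:(K - 1))
  # Design matrix: intercept column plus K-1 indicator columns
  X <- model.matrix(~ expotype)
  mu <- as.vector(X %*% c(beta0, betas))
  data.frame(expotype = expotype, y = rnorm(n, mean = mu, sd = sigma))
}

set.seed(1)
d <- simulate_dataset(n = 1000, beta0 = 14.6, betas = c(1, 0.5, 0, 0), sigma = 6.1)
head(d)
```

Here `betas` has four elements, so the simulated data contain five expotype categories including the reference.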

Power calculation

Statistical power was estimated by simulating 1,000 datasets for each combination of the following parameters:

  • Sample size: 1,000, 900, 800

  • Number of expotype categories: 5-8

  • Maximum true difference in mean biomarker level

The maximum true difference is set by fixing one of the coefficients $\beta_1, \ldots, \beta_{K-1}$ at a given value and assigning the remaining coefficients values that are equal to or smaller than it.
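One way to build such a coefficient vector is sketched below. The helper `make_betas` and its `fraction` argument are hypothetical: the source does not say which values the other coefficients take, so here they are set to a fixed fraction (between 0 and 1) of the maximum difference.

```r
# Hypothetical helper: coefficients for K categories (K-1 non-reference ones).
# One coefficient equals max_diff; the others are set to fraction * max_diff,
# which is an assumption about how "equal or smaller" values are chosen.
make_betas <- function(K, max_diff, fraction = 0) {
  stopifnot(K >= 2, fraction >= 0, fraction <= 1)
  betas <- rep(fraction * max_diff, K - 1)
  betas[K - 1] <- max_diff
  betas
}

b_null_others <- make_betas(K = 5, max_diff = 2)                  # others are 0
b_half_others <- make_betas(K = 5, max_diff = 2, fraction = 0.5)  # others are 1
```

With `fraction = 0` only one category differs from the reference, which is the hardest case for the overall test at a given maximum difference among these choices.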

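The power to detect a significant difference in at least one of the expotype categories is estimated by fitting the data-generating model to each data set and testing the null hypothesis that all category coefficients are zero, so that every individual has the same mean:

$$H_0: \beta_1 = \beta_2 = \ldots = \beta_{K-1} = 0 \quad \text{(equivalently, } \mu_i = \beta_0 \text{ for all } i\text{)}$$

This is the overall F-test reported by summary() for a model fitted with R's lm() function.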
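The p-value of this overall F-test is printed by `summary()` but not stored directly in the fitted object, so it has to be recomputed from the stored F statistic. The sketch below shows one way to do that; the helper name `overall_f_pvalue`, the toy data, and the 5% significance level are assumptions for illustration.

```r
# Hypothetical helper: p-value of the overall F-test for an lm fit.
overall_f_pvalue <- function(fit) {
  # summary.lm stores the F statistic as c(value, numdf, dendf)
  f <- summary(fit)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

# Toy data with no true differences between the five categories
set.seed(2)
toy <- data.frame(
  expotype = factor(rep(0:4, each = 200)),
  y = rnorm(1000, mean = 14.6, sd = 6.1)
)

fit <- lm(y ~ expotype, data = toy)
p <- overall_f_pvalue(fit)
reject <- p < 0.05  # assumed significance level of 5%
```

For a model with a single categorical predictor, the same p-value appears in the first row of `anova(fit)`.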

The proportion of rejected null hypotheses for each sample size and maximum true difference is reported for each of the three biomarkers.
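Putting the steps together, the power estimate for a grid of scenarios might look like the sketch below. It is not the pipeline in power-calc.R: the function `estimate_power`, the equal-probability category assignment, the choice to let only one category differ from the reference, the 5% significance level, and the maximum differences of 1 and 2 are all assumptions. The mean and SD are the male homocysteine values from the table, and the number of simulations is reduced to 200 to keep the example quick.

```r
# Hypothetical helper: proportion of simulated datasets in which the
# overall F-test rejects H0.
estimate_power <- function(n, K, max_diff, beta0, sigma,
                           n_sims = 1000, alpha = 0.05) {
  # Assumption: one category differs by max_diff, the others by 0
  betas <- c(rep(0, K - 2), max_diff)
  rejected <- replicate(n_sims, {
    # Category 0 is the reference; equal assignment probabilities assumed
    expotype <- factor(sample(0:(K - 1), n, replace = TRUE),
                       levels = 0:(K - 1))
    mu <- beta0 + c(0, betas)[as.integer(expotype)]
    y <- rnorm(n, mean = mu, sd = sigma)
    f <- summary(lm(y ~ expotype))$fstatistic
    pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE) < alpha
  })
  mean(rejected)
}

set.seed(3)
# Sample sizes and category counts from the text; max_diff values are illustrative
grid <- expand.grid(n = c(800, 900, 1000), K = 5:8, max_diff = c(1, 2))
grid$power <- mapply(
  estimate_power,
  n = grid$n, K = grid$K, max_diff = grid$max_diff,
  MoreArgs = list(beta0 = 14.6, sigma = 6.1, n_sims = 200)
)
grid
```

Each row of `grid` then holds the estimated power for one combination of sample size, number of categories, and maximum true difference.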

Code

  • functions.R contains the necessary functions to perform the simulation and calculate and plot power.

  • power-calc.R contains the pipeline for running the simulation and power calculations.
