Lakens / detect_effect

How good are we at detecting effects?

Guessing the Presence or Absence of Effects

Cohen (1962) was the first to give benchmarks to interpret effect size estimates. For a t-test he suggested differences expressed in terms of the standard deviation (or, in other words, standardized mean differences, now known as Cohen’s d). In his 1962 paper he suggested values for small, medium, and large effect sizes of 0.25, 0.5, and 1. As he mentions, “These values are necessarily somewhat arbitrary, but were chosen so as to seem reasonable” (p. 146). In his classic book on power analysis for the behavioral sciences, Cohen proposed effect size benchmarks for small, medium, and large effects of 0.2, 0.5, and 0.8. He writes: “In the face of this relativity, there is a certain risk inherent in offering conventional operational definitions for these terms for use in power analysis in as diverse a field of inquiry as behavioral science. This risk is nevertheless accepted in the belief that more is to be gained than lost by supplying a common conventional frame of reference which is recommended for use only when no better basis for estimating the ES index is available.”

His reasoning for these three benchmarks is as follows:

“These values seem reasonable. For example, an 8-point mean IQ difference is large enough to be noticeable; this is the order of magnitude of the difference between people in professional and managerial occupations and also between clerical and semiskilled workers (Super, 1949, p. 98). Differences half this size (small) would not be readily perceptible; e.g., the mean IQ difference between twins and nontwins (Rusen, 1959); differences twice this size (large) would be so obvious as to virtually render a statistical test superfluous, e.g., the mean IQ difference between college graduates and those with only a 50-50 chance of passing in an academic high school curriculum (Cronbach, 1960, p. 174).”

According to Cohen (1988), for a small effect size the “signal” is difficult to detect, a medium effect size is conceived as one ‘large enough to be visible to the naked eye’, and large effects are ‘grossly perceptible’.

In this assignment we will explore what it means for signals to be noticeable and not noticeable. You will sample randomly generated datapoints from two groups (represented by a circle and a square). An example of a single sampled datapoint is visible below: a datapoint for the square group is sampled, and its value is -1 on a scale from -7 to 7.

The two groups either have the same population mean, in which case the difference is d = 0, or there is a true difference of d = 0.2, 0.5, or 0.8. The real difference between the two groups will be randomly decided by the app (and shown after you make your decision). If there is an effect, it can be positive or negative (i.e., squares can have a higher or lower mean than circles).
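To make the setup concrete, here is a minimal Python sketch of how such a trial could be simulated. It is not the code behind the app: the normal distribution with a standard deviation of 1, the choice to draw one datapoint per group at a time, and the helper names `new_trial`, `sample_pair`, and `observed_difference` are all assumptions made for illustration.

```python
import random

# True standardized differences named in the text.
EFFECT_SIZES = [0.0, 0.2, 0.5, 0.8]


def new_trial(rng):
    """Pick a true effect and its direction at random (hypothetical helper)."""
    d = rng.choice(EFFECT_SIZES)
    sign = rng.choice([-1, 1])  # squares can be higher or lower than circles
    return sign * d


def sample_pair(true_d, rng):
    """Draw one datapoint per group. Assumes normal data with SD = 1."""
    circle = rng.gauss(0.0, 1.0)
    square = rng.gauss(true_d, 1.0)
    return circle, square


def observed_difference(true_d, n, rng):
    """Mean of squares minus mean of circles after n datapoints per group."""
    pairs = [sample_pair(true_d, rng) for _ in range(n)]
    circles = [c for c, _ in pairs]
    squares = [s for _, s in pairs]
    return sum(squares) / n - sum(circles) / n


if __name__ == "__main__":
    rng = random.Random(1)
    true_d = new_trial(rng)
    for n in (5, 20, 100):
        diff = observed_difference(true_d, n, rng)
        print(f"n = {n:3d} per group, observed difference = {diff:+.2f}")
    print(f"true difference was d = {true_d:+.1f}")
```

With only a few datapoints the observed difference can be far from the true value, which is why a single sample says very little about whether an effect is present.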

Your task is to sample one datapoint at a time, and judge when you feel 80% confident in either the presence of an effect, or the absence of an effect. You will do this task 30 times. So in order to get 80% accuracy, you should guess correctly 24 of the 30 times. Doing this task 30 times will take a while, but it is necessary to meet the educational goals. First of all, due to random variation, trials will differ slightly from each other. Therefore, we have to repeat each situation a couple of times. Second, the goal is to experience the differences between effect sizes, and thus you will have to do the task multiple times for each effect size (for a standardized mean difference of 0, 0.2, 0.5, and 0.8). Nevertheless, I think it is worth it to experience what random variation in data looks like, which effect sizes are easy or more difficult to notice, and how often you make a mistake.
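As a rough point of comparison for how many datapoints each effect size demands, the sketch below computes the sample size per group that a conventional fixed-sample test would need for 80% power. This is a different question from the intuitive, sequential judgment the task asks for, and the two-sided alpha of 0.05, the normal approximation, and the function name `n_per_group` are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison.

    Uses the normal approximation n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2,
    which slightly underestimates the n needed for a t-test.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} datapoints per group")
```

Under these assumptions the required sample size grows with 1/d², so a small effect (d = 0.2) needs roughly 16 times as many datapoints as a large one (d = 0.8).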

The online app will inform you about whether you were right or wrong after each trial. It will also keep track of how often you did the task, and how often you guessed the effect correctly. You will have to enter your student ID (a student ID at the TU Eindhoven has 7 digits and most likely starts with 08, 09, 10, 11, 12 or 13). Read the instructions in the app carefully before starting. If you have done 30 trials, you will have passed this assignment.

You can find the app online: http://shiny.ieis.tue.nl/guess_the_effect/
