Observational Studies

9. Internal Validity

Causal Inference: Internal Validity

Modern discussions of causal inference are based on how statisticians have come to think about cause and effect. The statistical framework was developed by Neyman in 1923 and later extended by Rubin (1974) and Holland (1986). It is sometimes called the "Rubin Causal Model."

Example 4

Rubin defines a causal effect:

Intuitively, the causal effect of one treatment, E, over another, C, for a particular unit and an interval of time from t1 to t2 is the difference between what would have happened at time t2 if the unit had been exposed to E initiated at t1 and what would have happened at t2 if the unit had been exposed to C initiated at t1: 'If an hour ago I had taken two aspirins instead of just a glass of water, my headache would now be gone,' or 'because an hour ago I took two aspirins instead of just a glass of water, my headache is now gone.' Our definition of the causal effect of the E versus C treatment will reflect this intuitive meaning.

According to the Rubin Causal Model (RCM), the causal effect of taking aspirin one hour ago is the difference between how your head would feel now in case 1 (having taken the aspirin) and in case 2 (not having taken it). If your headache would persist without aspirin but disappear with it, then the causal effect of taking aspirin is headache relief (Rubin, 1974: 689).
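
The definition can also be written in symbols. The notation below is an assumption introduced for illustration and is not taken from Rubin's text: Y_i^E(t_2) stands for what would have happened to unit i at time t_2 under exposure to E initiated at t_1, Y_i^C(t_2) for what would have happened under exposure to C, and tau_i for the resulting unit-level causal effect.

```latex
\tau_i = Y_i^{E}(t_2) - Y_i^{C}(t_2)
```

In the aspirin example, if a headache is coded as 1 when present and 0 when gone (a hypothetical coding), then Y^E = 0 and Y^C = 1, so the effect is 0 - 1 = -1: taking aspirin removes the headache.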

The framework starts with observational units: people, neighborhoods, police departments, schools, business establishments, or other entities. In the simplest case, the intervention is binary: some units are exposed to the intervention, and the remaining units are exposed to an alternative. The intent is to estimate the intervention's causal effect. The classic examples come from research on the impact of job training, housing vouchers, or instructional innovations in schools. But equally interesting studies use larger observational units. For example, one might want to learn how sanctions applied to employers who hire undocumented workers affect the flow of immigrants into and out of a given state. Or one could examine whether a city's educational campaigns during a drought affect water conservation.
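
The binary-intervention setup can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a method from the sources cited here: the units, their outcome values, the constant effect of 5.0, and the function names are all hypothetical, and exposure is assigned at random, which an observational study generally cannot assume. Each simulated unit carries two potential outcomes, but only the one matching its actual exposure is used in the estimate.

```python
import random
from statistics import mean


def simulate_units(n_units, true_effect, seed=0):
    """Create hypothetical units as (outcome under C, outcome under E) pairs."""
    rng = random.Random(seed)
    units = []
    for _ in range(n_units):
        y_control = rng.gauss(50.0, 10.0)    # outcome if exposed to the alternative
        y_treated = y_control + true_effect  # outcome if exposed to the intervention
        units.append((y_control, y_treated))
    return units


def unit_effects(units):
    """Unit-level causal effects; computable only because the data are simulated."""
    return [y_treated - y_control for y_control, y_treated in units]


def difference_in_means(units, seed=1):
    """Expose a random half of the units to the intervention (an assumption),
    keep one observed outcome per unit, and compare the two group means."""
    rng = random.Random(seed)
    indices = list(range(len(units)))
    rng.shuffle(indices)
    exposed = set(indices[: len(units) // 2])
    observed_e = [units[i][1] for i in exposed]
    observed_c = [units[i][0] for i in range(len(units)) if i not in exposed]
    return mean(observed_e) - mean(observed_c)


units = simulate_units(n_units=10_000, true_effect=5.0)
average_effect = mean(unit_effects(units))  # average of the unit-level effects
estimate = difference_in_means(units)       # estimate from observed outcomes only
print(f"average unit-level effect: {average_effect:.2f}")
print(f"difference-in-means estimate: {estimate:.2f}")
```

In real data the list returned by `unit_effects` is never available, because each unit is observed under only one exposure; the simulation exposes it only so the estimate can be compared with the quantity it targets.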

References

Neyman, J. (1923 [1990]). On the application of probability theory to agricultural experiments. Essay on principles. Statistical Science 5(4): 465-472. Translated by Dorota M. Dabrowska and Terence P. Speed.
Rubin, D. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66: 688-701.
Holland, P. (1986). Statistics and causal inference. Journal of the American Statistical Association 81: 945-960.