
CM7.{5,8} | Study Designs, Association and Bias: SDL Guide (Part 2)

Measures of Association and Causation

Measures of association quantify the strength of relationship between exposure and outcome. Beyond relative measures (RR, OR), absolute measures (attributable risk) are important for public health decision-making.

Relative Risk (RR): as defined in the cohort study section, the ratio of the incidence in the exposed to the incidence in the unexposed. RR = 3 means exposed persons have 3× the risk of unexposed persons. It measures strength of association and is useful for aetiological inference: a high RR makes a true cause-effect relationship more likely, whereas a low RR in a large study may be statistically significant yet still reflect confounding.

Odds Ratio (OR): as defined in the case-control section, the ratio of the odds of exposure in cases to the odds of exposure in controls. It approximates the RR when the disease is rare.
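
The two relative measures can be compared side by side with a short Python sketch. The cell labels a/b/c/d follow the usual 2x2 layout (rows = exposure, columns = disease), and all counts below are invented purely for illustration; they are not taken from any study.

```python
# Assumed 2x2 layout (hypothetical counts, for illustration only):
#                 disease   no disease
# exposed            a          b
# unexposed          c          d

def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d) / (b*c)."""
    return (a * d) / (b * c)

# Rare outcome (3% vs 1%): the OR stays close to the RR.
rare = dict(a=30, b=970, c=10, d=990)
# Common outcome (60% vs 20%): the OR drifts well above the RR.
common = dict(a=600, b=400, c=200, d=800)

for label, table in (("rare", rare), ("common", common)):
    rr = round(relative_risk(**table), 2)
    oratio = round(odds_ratio(**table), 2)
    print(label, "RR =", rr, "OR =", oratio)
```

In both hypothetical tables the RR is 3, but the OR is about 3.06 for the rare outcome and 6 for the common one, which is the rare disease approximation in action.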

Attributable Risk (AR) (also called Risk Difference or Excess Risk) = Incidence in exposed − Incidence in unexposed. It measures the absolute excess risk attributable to the exposure. If the incidence of MI is 12 per 1,000/year in smokers and 4 per 1,000/year in non-smokers, the AR is 8 per 1,000/year: assuming the association is causal, 8 MI events per 1,000 smokers per year would be prevented if smoking were eliminated.
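
As a minimal sketch of why absolute and relative measures answer different questions, the snippet below reuses the MI figures from the text and adds a second, hypothetical scenario (a rarer outcome with the same RR) that is an assumption made only for contrast.

```python
def attributable_risk(incidence_exposed, incidence_unexposed):
    """Risk difference: incidence in exposed minus incidence in unexposed."""
    return incidence_exposed - incidence_unexposed

def relative_risk(incidence_exposed, incidence_unexposed):
    """Ratio of incidence in exposed to incidence in unexposed."""
    return incidence_exposed / incidence_unexposed

# Figures from the text: MI incidence per 1,000 persons per year.
mi_smokers, mi_nonsmokers = 12, 4
ar_mi = attributable_risk(mi_smokers, mi_nonsmokers)   # 8 per 1,000/year
rr_mi = relative_risk(mi_smokers, mi_nonsmokers)       # 3.0

# Hypothetical rarer outcome with the same RR (assumed numbers).
rare_exposed, rare_unexposed = 1.2, 0.4
ar_rare = round(attributable_risk(rare_exposed, rare_unexposed), 2)  # 0.8
rr_rare = round(relative_risk(rare_exposed, rare_unexposed), 2)      # 3.0

print("MI:   RR =", rr_mi, "AR =", ar_mi, "per 1,000/year")
print("Rare: RR =", rr_rare, "AR =", ar_rare, "per 1,000/year")
```

Both scenarios have the same RR, yet the excess number of events per 1,000 exposed persons differs tenfold, which is why the AR matters for public health planning.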

Population Attributable Risk (PAR) = Incidence in total population − Incidence in unexposed. It reflects the excess incidence in the total population (not just the exposed subgroup) that is attributable to the exposure, and so takes account of how prevalent the exposure is. A risk factor with a modest RR but very high prevalence (e.g. physical inactivity and cardiovascular disease) can have a large PAR, so preventing it can have a large population-level impact.
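
The dependence of PAR on exposure prevalence can be sketched as follows. The total-population incidence is computed as a prevalence-weighted average of the exposed and unexposed incidences; the two risk factors, their prevalences, and the baseline incidence of 10 per 1,000/year are all hypothetical values chosen for illustration.

```python
def population_incidence(prevalence_exposed, incidence_exposed, incidence_unexposed):
    """Weighted average of incidence across exposed and unexposed groups."""
    return (prevalence_exposed * incidence_exposed
            + (1 - prevalence_exposed) * incidence_unexposed)

def population_attributable_risk(prevalence_exposed, incidence_exposed, incidence_unexposed):
    """PAR = incidence in total population - incidence in unexposed."""
    total = population_incidence(prevalence_exposed, incidence_exposed, incidence_unexposed)
    return total - incidence_unexposed

baseline = 10  # hypothetical incidence in the unexposed, per 1,000/year

# Common exposure, modest RR of 1.5 (assumed: 60% of the population exposed).
par_common = population_attributable_risk(0.60, 1.5 * baseline, baseline)

# Uncommon exposure, strong RR of 3.0 (assumed: 5% of the population exposed).
par_uncommon = population_attributable_risk(0.05, 3.0 * baseline, baseline)

print("Common exposure, RR 1.5: PAR =", round(par_common, 2), "per 1,000/year")
print("Uncommon exposure, RR 3.0: PAR =", round(par_uncommon, 2), "per 1,000/year")
```

Under these assumptions the weaker but more prevalent risk factor accounts for about 3 excess cases per 1,000/year in the population, against about 1 for the stronger but rarer one.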

Bradford Hill Criteria for Causation (1965): nine criteria for judging whether an observed statistical association reflects a causal relationship:
1. Strength of association (high RR/OR more likely to be causal)
2. Consistency (replicated across different populations, designs, investigators)
3. Specificity (one cause → one effect)
4. Temporality (exposure precedes outcome; the only necessary criterion)
5. Biological gradient (dose-response: more exposure → more disease)
6. Plausibility (biologically plausible mechanism)
7. Coherence (consistent with known biology and natural history)
8. Experiment (does reducing exposure reduce disease? If so, this is the strongest evidence)
9. Analogy (similar exposures are already known to produce similar effects)

[Figure: 2x2 contingency table with cells labelled a/b/c/d, with formulas for relative risk and odds ratio annotated below]

Of these, temporality is the only necessary (not merely supportive) criterion — all others can be absent and causation can still exist, but causation cannot exist without the exposure preceding the outcome.

Bias and Confounding: Threats to Valid Inference

Even when a study finds a statistically significant association, the association may be spurious — produced by systematic errors in design or measurement. These systematic errors are classified as bias and confounding.

Bias is a systematic (non-random) error in the design, conduct, or analysis of a study that causes a deviation of the measured association from the true association. Bias does not decrease with larger sample sizes (unlike random error).

Selection bias occurs when the study participants are not representative of the source population:
- Berkson's bias: in hospital-based case-control studies, hospitalised controls may differ from the general population in ways that are related to both the exposure and the disease (e.g. selecting appendicitis patients as controls for a study of diet and colorectal cancer — appendicitis patients may have different diets than the general population).
- Healthy worker effect: occupational cohorts tend to show lower mortality than the general population they are compared with, because employed persons must be healthy enough to work. This biases results toward underestimating occupational hazards.
- Loss to follow-up bias: if participants who are lost to follow-up differ systematically from those retained (e.g. sicker patients or those with adverse exposures are more likely to drop out), the observed association is distorted.

Information bias (measurement error) occurs when exposure or outcome is incorrectly classified:
- Recall bias: cases tend to remember past exposures more thoroughly than controls (because they are motivated to understand why they got sick). This is a systematic overestimation of exposure in cases relative to controls — a particular threat in case-control studies of exposures that are difficult to document objectively (diet, stress, physical activity).
- Observer bias (interviewer bias): the interviewer, knowing the subject's disease status, may probe more deeply for exposure history in cases than in controls. Blinding interviewers to case-control status mitigates this.
- Hawthorne effect: study participants modify their behaviour because they know they are being observed — a threat in intervention studies.
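
To see how recall bias alone can manufacture an association, the sketch below assumes a hypothetical case-control study in which cases and controls have exactly the same true exposure prevalence (true OR = 1), but cases report a larger share of their real exposures than controls do. All numbers, and the helper `reported_table`, are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio for a 2x2 table: (a*d) / (b*c)."""
    return (a * d) / (b * c)

def reported_table(n_cases, n_controls, true_prevalence,
                   recall_cases, recall_controls):
    """Expected 2x2 counts when only a fraction of truly exposed
    subjects report their exposure (no false positives assumed)."""
    a = n_cases * true_prevalence * recall_cases        # cases reporting exposure
    b = n_controls * true_prevalence * recall_controls  # controls reporting exposure
    c = n_cases - a                                     # cases reporting no exposure
    d = n_controls - b                                  # controls reporting no exposure
    return a, b, c, d

# Same recall in both groups: the null association is preserved.
or_equal_recall = odds_ratio(*reported_table(1000, 1000, 0.40, 0.90, 0.90))

# Differential recall: cases report 90% of exposures, controls only 60%.
or_biased = odds_ratio(*reported_table(1000, 1000, 0.40, 0.90, 0.60))

print("Equal recall OR:", round(or_equal_recall, 2))
print("Differential recall OR:", round(or_biased, 2))
```

With these assumed recall fractions the observed OR rises to roughly 1.8 even though the exposure has no effect, and enlarging the sample would not shrink this error.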

Confounding is fundamentally different from bias. A confounder is a third variable that: (1) is associated with the exposure of interest; (2) is associated with the outcome independently of the exposure (i.e. it is a risk factor for the outcome even among the unexposed); and (3) is NOT an intermediate step in the causal pathway from exposure to outcome. Confounding can create a spurious association or mask a real one. Classic example: a study finds that coffee drinking is positively associated with lung cancer. But coffee drinking is associated with cigarette smoking (the confounder), and smoking causes lung cancer. When smoking is adjusted for, the coffee-cancer association disappears because it was entirely explained by confounding.

Methods to control confounding:
- At design stage: Randomisation (RCTs); Restriction (limit study to one stratum of the confounder, e.g. only non-smokers); Matching (match cases and controls on the confounder, e.g. age and sex)
- At analysis stage: Stratification (compute the association separately within each stratum of the confounder, then compare or pool the stratum-specific estimates); Multivariable analysis (statistical regression to adjust for multiple confounders simultaneously)
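
Stratification can be illustrated with the coffee, smoking, and lung cancer example. The counts below are hypothetical and constructed so that coffee has no effect within either smoking stratum; the pooled estimate uses the Mantel-Haenszel odds ratio, one common way to combine strata (the text itself does not name a specific pooling method, so this choice is an assumption).

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio for a 2x2 table: (a*d) / (b*c)."""
    return (a * d) / (b * c)

def collapse(strata):
    """Add the strata cell by cell to get the crude (unstratified) table."""
    return tuple(sum(cells) for cells in zip(*strata))

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical tables: (coffee & cancer, coffee & no cancer,
#                       no coffee & cancer, no coffee & no cancer)
smokers = (80, 720, 20, 180)      # 10% cancer risk with or without coffee
non_smokers = (2, 198, 8, 792)    # 1% cancer risk with or without coffee
strata = [smokers, non_smokers]

crude_or = odds_ratio(*collapse(strata))
stratum_ors = [odds_ratio(*table) for table in strata]
adjusted_or = mantel_haenszel_or(strata)

print("Crude OR:", round(crude_or, 2))
print("Stratum-specific ORs:", [round(x, 2) for x in stratum_ors])
print("Mantel-Haenszel OR:", round(adjusted_or, 2))
```

In this constructed data set the crude OR is about 3.1 while both stratum-specific ORs and the pooled OR equal 1, which is the pattern expected when an association is produced entirely by confounding. If the stratum-specific ORs had differed from each other, that would instead point toward effect modification.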

Distinguishing effect modification (interaction) from confounding: if the association between exposure and outcome differs across strata of a third variable (e.g. the effect of a drug is stronger in women than men), the third variable is an effect modifier, not a confounder. Effect modification should be reported and interpreted, not 'controlled away'.

CLINICAL PEARL

OR ≠ RR, and the 'rare disease assumption' is routinely violated in the medical literature. Odds Ratios from case-control studies are frequently reported and discussed as if they were Relative Risks, an error that inflates the perceived strength of association. If the risk of disease in the unexposed group is 30% (not rare), an OR of 3.0 corresponds to an RR of only approximately 1.9, so reading the OR as an RR is a major overestimate. The rare disease assumption holds well when disease incidence is <5–10%. For common diseases (hypertension, obesity, depression, all with prevalence >10–20%), the OR substantially overestimates the RR. When reading research papers, check whether the authors correctly label their measure: a prevalence-based cross-sectional study reporting an 'OR' is technically reporting a prevalence odds ratio, and the rare disease assumption cannot rescue it if the disease is prevalent.
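
The figure of approximately 1.9 quoted in the pearl can be reproduced with the algebraic identity RR = OR / (1 − p0 + p0 × OR), where p0 is the risk of disease in the unexposed group. The sketch below assumes that the 30% in the pearl refers to this baseline risk; the function name `or_to_rr` is illustrative.

```python
def or_to_rr(odds_ratio, baseline_risk):
    """Convert an OR to the RR it implies, given the risk in the unexposed.

    Derived from: odds_exposed = OR * p0 / (1 - p0), then
    risk_exposed = odds_exposed / (1 + odds_exposed), RR = risk_exposed / p0.
    """
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)

# Common disease: baseline risk of 30% (assumed interpretation of the pearl).
rr_common = or_to_rr(3.0, 0.30)

# Rare disease: baseline risk of 1%, where OR and RR nearly coincide.
rr_rare = or_to_rr(3.0, 0.01)

print("OR 3.0 at 30% baseline risk -> RR", round(rr_common, 2))
print("OR 3.0 at 1% baseline risk  -> RR", round(rr_rare, 2))
```

At a 30% baseline risk the implied RR is 1.875, while at a 1% baseline risk it is about 2.94, close to the OR of 3.0.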

SELF-CHECK

A case-control study finds an association between passive smoking and asthma in children (OR=2.5). The authors worry that children exposed to passive smoking also tend to live in damp housing, which independently causes asthma. What is MOST accurately described as the role of damp housing in this study?

A. It is a selection bias

B. It is a confounder if it is associated with both passive smoking exposure and asthma

C. It is a mediator (intermediate variable) in the causal pathway

D. It is recall bias because parents may incorrectly recall housing conditions


Answer: B. It is a confounder if it is associated with both passive smoking exposure and asthma

Damp housing is a confounder if it meets all three criteria: (1) associated with the exposure (passive smoking exposure — children of smokers may be more likely to live in damp housing); (2) associated with the outcome (asthma); and (3) not an intermediate in the causal pathway from passive smoking to asthma (damp housing causes asthma independently, not AS A RESULT OF passive smoking exposure). A mediator would be a step through which passive smoking causes asthma (e.g. airway inflammation — that is part of the mechanism, not confounding). Selection bias and recall bias are different systematic error categories.
