CM6.1-6 | Biostatistics for Community Medicine — Practice Quiz
A community survey records blood group (A, B, AB, O) for each participant. The blood group variable is best classified as:
Correct. Blood group is nominal — categories have no inherent order or magnitude.
Scales of measurement: nominal (unordered categories) → ordinal (ordered categories) → interval (equal intervals, no true zero) → ratio (equal intervals, true zero). This hierarchy determines which summary statistics and tests are valid.
Incorrect. Blood group categories (A, B, AB, O) have no rank or numerical meaning, making the variable nominal, not ordinal or quantitative.
A nutritionist measures the weight of 64 children under five in a rural block and calculates a mean weight of 12.4 kg, standard deviation (SD) = 2.0 kg. The standard error of the mean (SE) is:
Correct. SE = SD / √n = 2.0 / √64 = 2.0 / 8 = 0.25 kg.
SD measures variability among individuals in the sample. SE = SD/√n measures precision of the sample mean as an estimate of the population mean. As n increases, SE decreases (more precision), but SD remains relatively stable.
Incorrect. SE = SD / √n. With SD = 2.0 and n = 64, SE = 2.0/8 = 0.25 kg.
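The arithmetic above can be checked with a minimal Python sketch. The function name `standard_error` is illustrative and not part of the quiz; the second call holds SD fixed purely to show how SE shrinks as n grows.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sd / math.sqrt(n)


# Values from the question: SD = 2.0 kg, n = 64 children.
se = standard_error(2.0, 64)
print(f"SE = {se:.2f} kg")  # SE = 0.25 kg

# Quadrupling the sample size halves the SE (SD held fixed for illustration).
print(f"SE at n = 256: {standard_error(2.0, 256):.3f} kg")  # 0.125 kg
```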
A district health officer compares the proportion of children with stunting in an urban slum (45/200) versus a rural village (60/200). Both samples are independent. Which statistical test is most appropriate?
Correct. Comparing two proportions (categorical outcome) in independent groups → chi-square test is appropriate, provided expected frequencies are ≥5 in each cell.
Test selection algorithm: (1) What type of data? Categorical → chi-square (or Fisher's exact if any expected cell count is <5). Continuous → parametric (t-test/ANOVA) if normally distributed, otherwise non-parametric (Mann-Whitney U/Kruskal-Wallis). (2) How many groups? (3) Paired or independent? Here: categorical data, 2 independent groups → chi-square.
Incorrect. The outcome here is categorical (stunted vs not stunted), not a continuous measurement. The chi-square test is used for categorical data in independent groups.
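As a rough illustration, the stunting comparison can be worked through with the standard library alone. This is a sketch of the uncorrected Pearson chi-square for a 2×2 table; the function name `chi_square_2x2` is an assumption for this example, and the p-value uses the identity that, with 1 degree of freedom, the chi-square upper tail equals erfc(√(x/2)).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value, expected_counts).
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    # Expected count for each cell = row total * column total / grand total.
    expected = tuple(tuple(r * col / n for col in cols) for r in rows)
    stat = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value, expected


# Urban slum: 45 stunted, 155 not stunted; rural village: 60 stunted, 140 not stunted.
stat, p, expected = chi_square_2x2(45, 155, 60, 140)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
print("smallest expected count:", min(min(row) for row in expected))
```

Every expected count here is well above 5 (the smallest is 52.5), so the chi-square assumption holds. The statistic comes out at about 2.91, below the 3.84 critical value for α = 0.05 with 1 degree of freedom.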
A randomised trial of a new oral rehydration formula vs standard ORS reports a reduction in duration of diarrhoea of 0.3 hours (95% CI: 0.1–0.5 hours, p = 0.002). Which statement is MOST accurate?
Correct. p = 0.002 means: assuming H₀ (no difference) is true, the probability of observing this result (or more extreme) by chance is 0.2%. It does NOT directly equal 'probability that H₀ is true'.
p-value: the probability of results as extreme as, or more extreme than, those observed, GIVEN that H₀ is true. It is NOT: (a) the probability that H₀ is true, (b) the probability that a replication would give the same result, or (c) proof of clinical importance. A 0.3-hour reduction in diarrhoea duration, while statistically significant, may not be clinically meaningful. Always report effect size + CI alongside p.
Incorrect. The p-value is NOT the probability that the null hypothesis is true, nor does statistical significance guarantee clinical significance.
A researcher compares mean haemoglobin (g/dL) across three age groups of women (18-25 yr, 26-35 yr, 36-45 yr) in a single survey. Normality and homogeneity of variance are confirmed. The appropriate test is:
Correct. Comparing means of a continuous, normally distributed variable across 3 independent groups → one-way ANOVA.
One-way ANOVA compares means of a continuous, normally distributed outcome across ≥3 independent groups while controlling the Type I error rate. Running multiple t-tests inflates the family-wise error rate (e.g., 3 tests at α=0.05 → actual error ≈14%). Kruskal-Wallis is the non-parametric equivalent when normality is violated.
Incorrect. When comparing a normally distributed continuous outcome across ≥3 independent groups, one-way ANOVA is the correct test.
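The ≈14% figure for three t-tests can be reproduced with a short sketch. It assumes the tests are independent, which is a simplification (pairwise comparisons among the same three groups are not strictly independent), and the function name `family_wise_error` is illustrative.

```python
def family_wise_error(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Three pairwise t-tests among three groups, each at alpha = 0.05.
fwer = family_wise_error(0.05, 3)
print(f"{fwer:.3f}")  # 0.143
```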
Daily income data from 200 households in a slum colony are highly right-skewed. The MOST appropriate measure of central tendency to report for this dataset is:
Correct. The median is resistant to extreme values and is the preferred measure of central tendency for skewed distributions.
Mean = sum/n; sensitive to outliers. Median = middle value; robust to skew. Mode = most frequent value. For right-skewed data (positive skew, e.g., income, hospital stay): Mean > Median > Mode. Use median. Geometric mean is useful for multiplicative data (e.g., antibody titres, growth rates), not routine income data.
Incorrect. The arithmetic mean is pulled by outliers (a few very high incomes inflate it). For skewed data, the median best represents the typical value.
A field epidemiologist selects every 10th household from a village list of 500 households, starting with household #7 (chosen at random). This sampling method is:
Correct. Selecting every kth element from a list, after a random start, defines systematic random sampling.
Systematic sampling: sampling interval k = N/n (500/50 = 10 here); random start between 1 and k. Advantage: simple, evenly spread. Disadvantage: if periodicity in the list coincides with k, estimates can be biased. Compare with simple random (each unit independently chosen by lottery/random numbers), stratified (divide into strata, then random within), and cluster (divide into clusters, randomly select whole clusters).
Incorrect. In simple random sampling every element has an equal independent chance of selection. Selecting every kth unit after a random start is systematic random sampling.
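The selection rule in this question can be sketched in a few lines of Python. The function `systematic_sample` and its parameter names are hypothetical; households are identified by their 1-based position on the village list.

```python
import random
from typing import List, Optional


def systematic_sample(
    population_size: int, interval: int, start: Optional[int] = None
) -> List[int]:
    """Return 1-based unit numbers: a random start in 1..interval, then every interval-th unit."""
    if start is None:
        start = random.randint(1, interval)  # random start between 1 and k
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    return list(range(start, population_size + 1, interval))


# N = 500 households, k = 10, random start happened to be household 7.
households = systematic_sample(population_size=500, interval=10, start=7)
print(households[:5], "...", households[-1])  # [7, 17, 27, 37, 47] ... 497
print(len(households))  # 50
```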
A 2×2 contingency table is constructed to assess the association between tobacco use (yes/no) and oral cancer (yes/no) in a case-control study. One cell has an expected frequency of 3. The recommended modification is:
Correct. When any expected cell frequency is <5 in a 2×2 table, chi-square assumptions are violated and Fisher's exact test should be used.
Chi-square validity requirement: ALL expected frequencies ≥5. Some texts relax this for larger tables (all expected frequencies ≥1 and no more than 20% of cells <5), but the ≥5 rule is standard for NMC exams. When the rule is violated in a 2×2 table → Fisher's exact test. Yates' continuity correction is an older approach; Fisher's exact is preferred. ANOVA is for continuous outcomes; Bonferroni corrects for multiple comparisons.
Incorrect. Chi-square requires all expected frequencies ≥5. With an expected cell of 3, Fisher's exact test is the appropriate alternative.
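The quiz does not give the cell counts, so the sketch below uses a small hypothetical 2×2 table purely to show the two steps: checking the smallest expected frequency, then computing a two-sided Fisher's exact p-value from the hypergeometric distribution. Both function names are assumptions for this example, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def min_expected_count(a: int, b: int, c: int, d: int) -> float:
    """Smallest expected frequency (row total * column total / N) in a 2x2 table."""
    n = a + b + c + d
    return min(r * col / n for r in (a + b, c + d) for col in (a + c, b + d))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so floating-point ties count as "equally probable".
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Hypothetical counts (NOT from the quiz): exposed cases, exposed controls,
# unexposed cases, unexposed controls.
table = (3, 1, 1, 3)
print(min_expected_count(*table))                 # 2.0, below 5, so use Fisher's exact
print(round(fisher_exact_two_sided(*table), 4))   # 0.4857
```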
Statement 1 (Assertion):
The Wilcoxon signed-rank test is preferred over the paired t-test when the differences between paired observations are not normally distributed.
BECAUSE
Statement 2 (Reason):
Non-parametric tests make no assumption about the underlying population distribution and are therefore valid when normality cannot be assumed.
Select the correct relationship:
Correct. Both are true. Non-parametric tests (e.g., Wilcoxon signed-rank) do not assume normal distribution of the differences, making them valid when the paired t-test's normality assumption is violated — and this is exactly why they are preferred in that context.
Parametric tests (t-test, ANOVA) assume normally distributed data (or large n via CLT). Non-parametric equivalents: paired t-test → Wilcoxon signed-rank; independent t-test → Mann-Whitney U; ANOVA (one-way) → Kruskal-Wallis. Non-parametric tests use ranks instead of raw values, making them robust to outliers and non-normality.
Incorrect. The assertion is correct: Wilcoxon signed-rank is appropriate for non-normal paired differences. The reason is also correct: non-parametric tests do not assume normality. Furthermore, the reason directly explains the assertion.
CLINICAL SCENARIO
A medical student records fasting blood glucose (mg/dL) for 10 participants in a community camp: 82, 88, 90, 92, 95, 98, 100, 105, 110, 140. She is asked to summarise the dataset and assess its distribution before applying any statistical test.
Answer the following questions based on the scenario above.
The value 140 mg/dL is substantially higher than the rest. Which measure of central tendency is LEAST affected by this outlier?
Correct. The median (average of the 5th and 6th values = (95+98)/2 = 96.5 mg/dL) is not influenced by the outlier 140 mg/dL.
Incorrect. The arithmetic mean would be pulled up by 140 mg/dL. The median is resistant to outliers.
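One way to see the difference is to recompute both measures with and without the 140 mg/dL reading. This sketch uses Python's standard `statistics` module on the scenario's ten values; dropping the outlier is done here only for comparison, not as a recommended analysis step.

```python
import statistics

glucose = [82, 88, 90, 92, 95, 98, 100, 105, 110, 140]  # mg/dL, from the scenario
without_outlier = [x for x in glucose if x != 140]

for label, values in (("all 10 values", glucose), ("140 excluded", without_outlier)):
    mean = statistics.mean(values)
    median = statistics.median(values)
    print(f"{label}: mean = {mean:.1f}, median = {median:.1f}")
# all 10 values: mean = 100.0, median = 96.5
# 140 excluded: mean = 95.6, median = 95.0
```

The single extreme value moves the mean by about 4.4 mg/dL but the median by only 1.5 mg/dL.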
The student calculates SD = 16.3 mg/dL and IQR = 16.75 mg/dL (Q1 = 89.5, Q3 = 106.25, using the (n+1) position method). Given the presence of an outlier, which pair of summary statistics should she report?
Correct. For skewed data with outliers, the median (robust centre) and IQR (robust spread, Q1–Q3) are the preferred summary pair.
Incorrect. SD is sensitive to outliers (each deviation is squared, amplifying the effect). Median + IQR together resist outlier distortion.
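The sensitivity of each spread measure can be compared directly on the scenario data. The sketch below uses the sample SD (n − 1 denominator) and the default "exclusive" quartile method of `statistics.quantiles`, which places quartiles at (n+1)-based positions. Quartile conventions vary between textbooks and software, so hand-calculated quartiles may differ slightly from these. The helper name `spread` is illustrative.

```python
import statistics


def spread(values):
    """Return (sample SD, IQR) using the default 'exclusive' quartile method."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return statistics.stdev(values), q3 - q1


glucose = [82, 88, 90, 92, 95, 98, 100, 105, 110, 140]  # mg/dL, from the scenario
without_outlier = [x for x in glucose if x != 140]

sd_all, iqr_all = spread(glucose)
sd_wo, iqr_wo = spread(without_outlier)
print(f"all 10 values: SD = {sd_all:.2f}, IQR = {iqr_all:.2f}")
print(f"140 excluded:  SD = {sd_wo:.2f}, IQR = {iqr_wo:.2f}")
```

Removing the one extreme reading roughly halves the SD, while the IQR changes far less, which is why the IQR is paired with the median for data like these.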