Difference of Means Tests: The Basics – Texas Political Science

A DIFFERENCE OF MEANS TEST (also called a TWO-SAMPLE T-TEST) is used to compare the average values of a variable across two different groups (samples) to see whether they statistically differ from each other. The process of conducting a difference of means test mirrors the hypothesis testing process. Specifically, it involves:

1. Formulating null and alternative hypotheses, for example:
   - Null hypothesis (H₀): there is no difference in the average values of a variable across two groups
   - Alternative (research) hypothesis (Hₐ): there is a difference in the average values of a variable across two groups
2. Choosing a significance level (usually α = 0.10, 0.05, or 0.01)
3. Checking assumptions regarding the relationship between the groups and the nature of the data
4. Choosing the appropriate difference of means test (two-sample t-test)
5. Calculating the test statistic, determining the degrees of freedom (df), and, based on α and df, identifying the:
   - Critical value(s) that define the rejection region for the null hypothesis
   - P-value: the probability of obtaining the observed results, or more extreme results, if the null hypothesis is true
6. Analyzing data to make a decision about the hypotheses, by either:
   - Comparing the test statistic to the critical value(s): if the test statistic falls into the rejection region, we would reject the null hypothesis
   - Comparing the p-value to α: if the p-value is less than or equal to α, we would reject the null hypothesis; otherwise, we would fail to reject it
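
As a rough illustration of the final decision step, the Python sketch below approximates a two-tailed p-value for a given test statistic and df, then compares it to α. The function names (t_pdf, two_tailed_p, decide) and the use of Simpson's rule for the numerical integration are illustrative assumptions, not part of any particular statistics package; in practice, statistical software would normally report the p-value directly.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) if the null hypothesis is true.

    Integrates the t density from 0 to |t_stat| with Simpson's rule
    (steps must be even) and subtracts the central area from 1.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


def decide(t_stat, df, alpha=0.05):
    """Apply the p-value decision rule: reject H0 when p <= alpha."""
    p = two_tailed_p(t_stat, df)
    return p, ("reject H0" if p <= alpha else "fail to reject H0")


# Hypothetical test statistic and degrees of freedom, for illustration only
p_value, decision = decide(2.5, 18, alpha=0.05)
print(round(p_value, 4), decision)
```

With these hypothetical inputs, the p-value falls between 0.02 and 0.05, so the null hypothesis would be rejected at α = 0.05 but not at α = 0.01.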

Assumptions Underlying Difference of Means Tests

Samples: Related, or Unrelated?

“INDEPENDENT SAMPLES are those in which cases across the two samples are not ‘paired’ or matched in any way” (Meier, Brudney, and Bohte, 2011, p. 223). In other words, independent samples involve between-group comparisons of two unrelated groups of different individuals or cases. Observations across samples are independent: the observations in one group have no effect on the observations in the other group. A treatment group and a control group are a typical example of two independent samples.

“DEPENDENT SAMPLES exist when each item in one sample is paired with an item in the second sample” (p. 223). In other words, dependent samples involve within-group comparisons of two closely related groups, which contain the same, or extremely similar, individuals or cases. Observations across samples are dependent within pairs: each pair of observations is related. Examples of dependent samples include:

- Two samples containing the same individuals or cases, with data collected before and after a treatment, intervention, policy change, etc.
- Two samples containing different individuals, with individuals in one sample matched/paired to individuals in the other sample who have similar characteristics (age, gender, race, income, education, etc.)
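
To make the idea of pairing concrete, the short Python sketch below reduces a before/after design to a single list of within-pair differences, which is the quantity a paired analysis works with. The scores and the helper name paired_differences are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev


def paired_differences(before, after):
    """Return after - before for each matched pair of observations."""
    if len(before) != len(after):
        raise ValueError("dependent samples must contain the same number of cases")
    return [a - b for b, a in zip(before, after)]


# Hypothetical scores for the same five cases, before and after a program
before = [72, 65, 80, 58, 90]
after = [75, 70, 78, 64, 93]

diffs = paired_differences(before, after)
print(diffs)         # one difference per pair
print(mean(diffs))   # average within-pair change
print(stdev(diffs))  # sample standard deviation of the differences
```

Because each difference comes from one matched pair, the order of the observations matters: shuffling one list independently of the other would break the pairing.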

Nature of the Data

Are the Data Normally Distributed?

Many statistical methods, including difference of means tests, are based on the ASSUMPTION OF NORMALITY: the distribution of the data — in this context, within each sample (or the differences in paired samples) — should be approximately normal (i.e., bell-curve shaped, symmetrical around the mean). This assumption is particularly important for small sample sizes. Researchers or administrators can check for normality by using graphical methods, such as histograms and quantile-quantile (Q-Q) plots. There are also statistical tests that can check for normality, such as the Shapiro-Wilk test.
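
A Q-Q plot compares the ordered sample values with the values a normal distribution would be expected to produce. The Python sketch below computes those coordinate pairs using only the standard library. The plotting position (i − 0.5)/n is one of several conventions and is an assumption here, as are the function name qq_points and the sample data.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot.

    Theoretical quantiles come from a normal distribution fitted with the
    sample mean and sample standard deviation.
    """
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    ordered = sorted(sample)
    # (i + 0.5) / n with zero-based i equals (rank - 0.5) / n
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


# Hypothetical sample, for illustration only
sample = [4.1, 5.0, 5.2, 4.8, 6.3, 5.5, 4.6, 5.1]
for theoretical, observed in qq_points(sample):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the data are approximately normal, the pairs fall close to a straight line where the theoretical and observed values are equal; marked curvature or outlying points suggest a departure from normality.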

Are Variances Equal, or Unequal?

The ASSUMPTION OF EQUAL VARIANCES (i.e., homogeneity of variances) applies when the variances of two groups are assumed to be approximately equal. At times, however, the variances of two groups cannot be assumed to be equal; in such situations, we proceed with the ASSUMPTION OF UNEQUAL VARIANCES (i.e., heterogeneity of variances). Determining which assumption applies is important, regardless of whether the data correspond to samples or populations.

Whether variances are equal or unequal affects how the t-test is calculated, including its standard error and degrees of freedom. Researchers or administrators can check for homogeneity of variances (i.e., equal variances) using statistical tests such as the F-TEST.
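
As a minimal sketch of how an F-test statistic can be obtained, the Python code below divides the larger sample variance by the smaller one and reports the matching degrees of freedom. Placing the larger variance in the numerator is one common form of the test and is an assumption here; the function name f_ratio and the data are hypothetical. The resulting ratio would then be compared with a critical value from an F table (or a p-value from statistical software) at the chosen α.

```python
from statistics import variance


def f_ratio(sample_a, sample_b):
    """Return (F, numerator df, denominator df), larger variance on top.

    Raises ZeroDivisionError if the smaller sample variance is zero.
    """
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical groups, for illustration only
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]

f_stat, df_num, df_den = f_ratio(group_1, group_2)
print(f_stat, df_num, df_den)
```

A ratio close to 1 is consistent with equal variances; the further the ratio rises above 1, the less plausible the assumption of equal variances becomes.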

Types of Difference of Means Tests

There are three difference of means tests that you may use to examine the difference between two groups, depending on the samples and the nature of the data: the INDEPENDENT SAMPLES T-TEST, the DEPENDENT/PAIRED SAMPLES T-TEST, and WELCH’S UNEQUAL VARIANCES T-TEST. These tests are summarized in the table below.

Types of Difference of Means Tests (t-Tests)

Type of t-Test	Samples	Nature of Data
Independent Samples t-Test	Independent samples and observations	Data in each group are normally distributed; equal variances in both groups
Dependent/Paired Samples t-Test	Dependent/paired samples; observations are dependent within pairs and independent between pairs	Differences in paired samples are normally distributed
Welch’s Unequal Variances t-Test	Independent samples and observations	Data in each group are normally distributed; unequal variances across groups
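
The three tests differ mainly in how the standard error and the degrees of freedom are computed. The Python sketch below uses the standard textbook formulas for each test, which this section does not spell out, so treat it as an illustration under that assumption rather than a definitive implementation. The function names and data are hypothetical.

```python
import math
from statistics import mean, stdev, variance


def pooled_t(a, b):
    """Independent samples t-test (equal variances): returns (t, df)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(a, b):
    """Welch's unequal variances t-test: returns (t, approximate df)."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; usually not a whole number
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def paired_t(a, b):
    """Dependent/paired samples t-test on within-pair differences: returns (t, df)."""
    diffs = [x - y for x, y in zip(a, b)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical data, for illustration only
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]

print(pooled_t(group_1, group_2))
print(welch_t(group_1, group_2))
print(paired_t(group_1, group_2))  # only meaningful if the cases are truly paired
```

With equal group sizes, the pooled and Welch statistics coincide and only the degrees of freedom differ; the paired test can give a very different result because it works with the variability of the differences rather than of the groups.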