ANOVA

1. Definition

ANOVA (Analysis of Variance) is a statistical method used to test whether
the means of several groups are significantly different from each other.

In Korean, it is called 분산분석 (“variance analysis”).

Important: ANOVA is not about comparing “variances between groups” themselves,
but about using variance to test differences in group means.

ANOVA Made Simple (ANOVA & F-test)

1. What is ANOVA?

ANOVA (Analysis of Variance) = variance analysis
One-line summary:

ANOVA tests whether the means of multiple groups are statistically different.
Examples:
- Are the average taste scores of three cookie recipes different?
- Are the mean treatment effects of four medical groups the same?

2. Why is it called “Analysis of Variance”?

What we really care about is the difference in means,
but ANOVA looks at this through variance (spread).

ANOVA compares two types of variance:

Between-group variance
- How far apart the group means are from each other
- If mean differences are large → between-group variance ↑
Within-group variance
- How much scores vary inside each group
- If scores in a group are very spread out → within-group variance ↑

Put simply:

“Are the differences in group means real,
or could they just be random noise from highly variable data?”

3. Relationship with the F-test

The key test statistic in ANOVA is the F-value.

\[F = \frac{\text{Between-group variance}}{\text{Within-group variance}}\]

When F is large 👉
- Group means are far apart, and/or
- Variation within each group is relatively small
  → Suggests real differences in group means
When F is small 👉
- Differences between group means are small compared to random variation within groups
  → Mean differences may be due to chance

So:

ANOVA = using the F-test to decide whether group means are equal or not.
We use the variance (spread) within and between groups to test whether
the observed mean differences are unlikely to be due to chance.

In other words:

ANOVA is a method to use variance to provide evidence that
group means really differ.

Suppose we have three cookie recipes:

Recipe A (Baek Jong-won)
Recipe B (Lee Yeon-bok)
Recipe C (Choi Hyun-seok)

Friends rate each cookie from 0 to 10.

We compute the average score for each recipe.
- Maybe Recipe A has the highest mean score.
However:
- Recipe B might get either 0 or 10 (very polarized opinions).
- In that case, the mean alone can be misleading.

So instead of only looking at means, we:

Compare the differences in recipe means (between-group variance), and
Consider how spread out the scores are within each recipe (within-group variance),
Then compute the F-value.

Next, we compare this F-value to the F distribution to decide:

If the p-value is small →

“At least one recipe has a significantly different mean score.” (significant difference)
If the p-value is large →

“Any differences in mean scores could just be due to random variation.” (no significant difference)

Saying we have “found the best recipe” means:

There is a real difference between recipe means.

But if, for example, Recipe B has very large variation within the group
(some give 0, some give 10), then:

The average can be distorted, and
Our claim “Recipe A is the best” becomes less convincing.

So:

If between-group variance is large
and within-group variance is small →
F is large → strong evidence of real differences between recipes.
If between-group variance is small
or within-group variance is very large →
F is small → differences might just be noise.

In words:

A large F-value means that the difference between group means is large enough
(relative to the within-group variation) that we can say
“there is a meaningful difference between recipes.”

To formally support the statement:

“Baek Jong-won’s recipe is truly the best,”

we need:

A large F statistic, and
A small p-value that allows us to reject the null hypothesis
(“all recipe means are equal”).

Only then does our claim gain strong statistical support.

PREVIOUSNull hypothesis

NEXTLinear Regression