You probably know the mean, a measure of central tendency in a dataset. But the mean doesn’t tell the whole story.
For example, a person’s average yearly spending says nothing about overspending on certain days or running short of money at the end of some months.
In the natural and social sciences, it’s important to know how much the data vary around the mean. The less they vary, the better the mean represents the whole dataset.
When the Mean Misleads
Imagine two situations:
· Ages of children in a preschool class:
3; 4; 3; 5; 5
The mean is 4, which represents the group well.
· Ages of students in an adult literacy class:
45; 19; 83; 55; 43
The mean is 49, but here it does not represent the group well, because the ages are very spread out.
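The two means can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the "farthest age from the mean" is a rough first look at spread that I added for the comparison, not a measure proposed in this post.

```python
# Ages taken from the two classes above.
preschool_ages = [3, 4, 3, 5, 5]
literacy_ages = [45, 19, 83, 55, 43]

# Mean = sum of the values divided by how many values there are.
preschool_mean = sum(preschool_ages) / len(preschool_ages)
literacy_mean = sum(literacy_ages) / len(literacy_ages)

# Illustrative only: how far the most distant age sits from the mean.
preschool_farthest = max(abs(age - preschool_mean) for age in preschool_ages)
literacy_farthest = max(abs(age - literacy_mean) for age in literacy_ages)

print(preschool_mean, literacy_mean)          # 4.0 49.0
print(preschool_farthest, literacy_farthest)  # 1.0 34.0
```

In the preschool class nobody is more than 1 year away from the mean; in the literacy class one student is 34 years away from it.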
So, data can be tightly clustered or widely spread around the mean, and we need a measure to capture that.
First Attempt: Mean of the Deviations
One idea is to compute the mean of the deviations from the mean.
But this doesn’t work: positive and negative deviations cancel out, so their sum, and therefore their mean, is always zero.
Example:
Data = 14; 14; 6; 6
Mean = 10
Deviations = +4; +4; –4; –4
Sum = 0, so the mean of the deviations = 0/4 = 0
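The cancellation is easy to see in code. A minimal sketch, assuming a helper I have called `mean_deviation` (a made-up name, not a standard function):

```python
def mean_deviation(data):
    """Mean of the signed deviations from the mean."""
    center = sum(data) / len(data)
    deviations = [x - center for x in data]  # positive above, negative below
    return sum(deviations) / len(deviations)

first = [14, 14, 6, 6]    # deviations: +4, +4, -4, -4
second = [17, 11, 4, 8]   # deviations: +7, +1, -6, -2

print(mean_deviation(first))   # 0.0
print(mean_deviation(second))  # 0.0
```

Both datasets give zero, although they are clearly not the same. With data that are not whole numbers, floating-point rounding may leave a tiny nonzero residue instead of an exact 0.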
Second Attempt: Absolute Deviations
What if we use absolute values of the deviations?
Example 1:
14; 14; 6; 6 → mean = 10
Sum of absolute deviations = |4|+|4|+|–4|+|–4| = 16
Mean absolute deviation = 4
Example 2:
17; 11; 4; 8 → mean = 10
Sum of absolute deviations = |7|+|1|+|–6|+|–2| = 16
Mean absolute deviation = 4
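But notice: the second set of numbers is more spread out (its values run from 4 to 17, against 6 to 14 in the first set), yet the value is the same. So the mean absolute deviation does not tell these two sets apart.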
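The same comparison can be sketched in Python. The function name `mean_absolute_deviation` is my own choice for this illustration:

```python
def mean_absolute_deviation(data):
    """Mean of the absolute deviations from the mean."""
    center = sum(data) / len(data)
    return sum(abs(x - center) for x in data) / len(data)

first = [14, 14, 6, 6]    # |4| + |4| + |-4| + |-4| = 16
second = [17, 11, 4, 8]   # |7| + |1| + |-6| + |-2| = 16

print(mean_absolute_deviation(first))   # 4.0
print(mean_absolute_deviation(second))  # 4.0
```

Both calls return 4.0, matching the hand calculation above.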
The Solution: Standard Deviation
The next step is to square each deviation from the mean. That way:
1. Negative values disappear (no square is negative).
2. Larger deviations carry more weight.
3. The math becomes easier, allowing smooth algebraic manipulation later.
Then, we compute the mean of these squared deviations and take the square root. Taking the square root brings the result back to the same units as the data.
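Written as a formula, the recipe above looks like this. The symbols are my own notation, not the post's: n is the number of values, x_i is each value, μ is the mean, and σ is the standard deviation.

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(x_i - \mu\right)^2}
```

This version divides by n, as the text describes. When the data are a sample from a larger population, a common variant divides by n − 1 instead.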
Example 1
14; 14; 6; 6 → mean = 10
Squared deviations = 16; 16; 16; 16
Average = 64/4 = 16
Square root = 4
Example 2
17; 11; 4; 8 → mean = 10
Squared deviations = 49; 1; 36; 4
Average = 90/4 = 22.5
Square root ≈ 4.74
Now it works! The second set shows greater spread, and the number reflects that: about 4.74 against 4.
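Both examples can be reproduced with a short Python sketch. The function `standard_deviation` is written here only for illustration; `pstdev` from Python’s standard `statistics` module computes the same divide-by-n quantity and serves as a cross-check.

```python
from math import isclose, sqrt
from statistics import pstdev

def standard_deviation(data):
    """Square root of the mean of the squared deviations (divides by n)."""
    center = sum(data) / len(data)
    squared_deviations = [(x - center) ** 2 for x in data]
    return sqrt(sum(squared_deviations) / len(data))

first = [14, 14, 6, 6]    # squared deviations: 16, 16, 16, 16
second = [17, 11, 4, 8]   # squared deviations: 49, 1, 36, 4

print(standard_deviation(first))             # 4.0
print(round(standard_deviation(second), 2))  # 4.74

# Cross-check against the standard library's population standard deviation.
matches_library = all(
    isclose(standard_deviation(d), pstdev(d)) for d in (first, second)
)
print(matches_library)  # True
```

Note that `statistics.stdev` (without the `p`) divides by n − 1 and would give slightly larger values for these data.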
What We Learned
· The standard deviation is never negative.
· It is larger when data are more spread out.
· It gives more importance to values far from the mean.
· It’s a well-established measure, widely used in statistics, social sciences, economics, health, and many other fields.
📌 In short: the standard deviation is the ruler that measures how far data stray from the center. It shows when the mean is reliable and when it can be misleading.