Thursday, July 17, 2025

Kurtosis: what the tails of distributions tell us

 


When we think of a distribution in general, it’s common to picture the normal distribution, also known as the Gaussian distribution — symmetric, bell-shaped, and known for its smooth and regular appearance (Figure 1). This distribution is referred to as mesokurtic, and it serves as the reference for studying kurtosis.

Figure 1

In another post, we discussed skewness. Now, let’s explore another important aspect: kurtosis, which relates to the tails of distributions.

In a distribution graph, the tails are the outer ends on both sides of the central peak. They represent how frequently extremely high or low values occur. Unlike skewness, which measures asymmetry, kurtosis focuses on outliers. In other words, it describes how much data lies in the tails. Kurtosis is often mistakenly linked to the ‘peakedness’ of a distribution, but it is primarily about tail behavior. Two distributions can have identical peaks but drastically different kurtosis due to their tails.

Using the normal distribution as a reference, we can compare other distributions to see whether they have more or less data in the tails.

Three main types of kurtosis


1. Mesokurtic: This is the case of the normal distribution itself, which has an average amount of data in the tails. Some distributions, like the binomial distribution with a probability near ½ and a large sample size, can also be considered mesokurtic.


2. Leptokurtic: These distributions have heavier tails than the normal. This means more extreme values occur, and outliers are more likely. A classic example is the Student's t-distribution.


3. Platykurtic: These distributions have lighter or thinner tails, or in some cases, nearly no tails at all. That indicates fewer extreme values. The uniform distribution is an example of platykurtic behavior.
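
For the distributions named above, kurtosis has well-known closed forms. The sketch below evaluates them as excess kurtosis, that is, measured relative to the normal distribution, which scores 0. The function names are illustrative and not taken from any library.

```python
# Closed-form excess kurtosis (normal = 0) for the families named above.
# Function names are illustrative, not from any library.

def excess_kurtosis_uniform() -> float:
    # Continuous uniform distribution on any interval [a, b].
    return -6 / 5

def excess_kurtosis_student_t(df: float) -> float:
    # Finite only for df > 4; the tails get heavier as df shrinks.
    if df <= 4:
        raise ValueError("excess kurtosis is infinite or undefined for df <= 4")
    return 6 / (df - 4)

def excess_kurtosis_binomial(n: int, p: float) -> float:
    # (1 - 6pq) / (npq), which tends to 0 as n grows.
    q = 1 - p
    return (1 - 6 * p * q) / (n * p * q)

print(excess_kurtosis_uniform())            # negative: platykurtic
print(excess_kurtosis_student_t(10))        # positive: leptokurtic
print(excess_kurtosis_binomial(100, 0.5))   # close to 0: roughly mesokurtic
```

With p = ½ the binomial value reduces to −2/n, which is why a large sample size brings it close to the normal reference.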

 

Real-world analogies:

  •     Leptokurtic: Stock market returns (frequent extreme crashes/booms).
  •     Platykurtic: Human height (few extreme outliers).

Figure 2

The origin of the concept


Karl Pearson introduced the concept of kurtosis and associated it with the "flatness" or "peakedness" of a distribution. In this view, flatter curves were platykurtic and sharply peaked curves were leptokurtic.
                   

However, this interpretation is not entirely accurate. Kurtosis is more about the weight of the tails — what happens at the extremes — than about the height or shape of the central peak. In fact, the peak contributes very little to the kurtosis value.
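
One way to see this is to split the sum of fourth-power deviations into the part that comes from points near the mean and the part that comes from points further out. The sketch below does that for the frequency table used in Example 3 of this post; the one-standard-deviation cutoff for "the peak" is an assumption made for illustration.

```python
from math import sqrt

# Frequency table from Example 3 of this post: value -> count.
freq = {4: 2, 5: 2, 6: 4, 7: 4, 8: 4, 9: 2, 10: 2}
data = [x for x, count in freq.items() for _ in range(count)]

n = len(data)
mean = sum(data) / n
sd = sqrt(sum((x - mean) ** 2 for x in data) / n)  # population SD

# Each point's contribution to the fourth-moment sum.
fourth = [(x - mean) ** 4 for x in data]
total = sum(fourth)

# Points within one SD of the mean stand in for "the peak" (an assumed cutoff).
center = sum(f for x, f in zip(data, fourth) if abs(x - mean) <= sd)
center_count = sum(1 for x in data if abs(x - mean) <= sd)

print(f"{center_count} of {n} points lie within one SD of the mean")
print(f"they contribute {center / total:.1%} of the fourth-moment sum")
```

Here 12 of the 20 points sit within one standard deviation of the mean, yet they account for only about 2% of the fourth-moment sum. The rest comes from the values further out.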

How is kurtosis measured?

Unlike the mean and standard deviation, which use the same units as the data, kurtosis is a dimensionless measure. There are two main ways to express it:

      1. Absolute kurtosis: Also known as the Pearson kurtosis coefficient, in which the normal distribution has a kurtosis of 3. It can be calculated by

         Kurtosis = μ4 / σ⁴

         Where

• μ4 is the fourth moment about the mean, that is, E[(X−μ)⁴], with μ the mean of X

• σ is the standard deviation (in the definition, it is the population standard deviation).


      2. Excess kurtosis: Also known as the Fisher kurtosis coefficient, it is the more commonly used version today, including in most statistical software. Here, the normal distribution has kurtosis equal to 0. It can be calculated by

         Excess kurtosis = Pearson kurtosis − 3
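
Both definitions can be sketched in a few lines of plain Python. This is a minimal version that uses population moments (dividing by n); the function names and the sample values are hypothetical and chosen only so that the arithmetic comes out exactly.

```python
def pearson_kurtosis(data):
    """Fourth central moment divided by the squared variance (normal = 3)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2

def excess_kurtosis(data):
    """Pearson kurtosis shifted so that the normal distribution scores 0."""
    return pearson_kurtosis(data) - 3

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, for illustration only
print(pearson_kurtosis(sample))  # 2.78125
print(excess_kurtosis(sample))   # -0.21875
```

For this sample the mean is 5, the variance is 4, and the fourth central moment is 44.5, so the Pearson value is 44.5 / 16.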

Kurtosis in small samples

In small samples, the calculated kurtosis is a biased and unstable estimate of the population kurtosis. This happens because the formula is based on fourth powers of deviations from the mean, which are very sensitive to extreme values. For example, in a sample of 4 values like [1, 3, 6, 10], the single largest value (10) has a strong influence on the result. Statistical software reduces this bias using adjusted formulas (e.g., a bias-corrected version of the excess kurtosis).



Examples

    1. Data: 1, 3, 6, 10

    Goal: Compute mean, variance, standard deviation, and kurtosis.

                 Values for kurtosis calculation

 



The kurtosis for this dataset is negative, which characterizes a platykurtic distribution, with lighter tails than the normal.
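
The intermediate values for this dataset can be tabulated with a short script. This is a sketch that assumes the population form of the moments (dividing by n); the variable names are illustrative.

```python
data = [1, 3, 6, 10]
n = len(data)
mean = sum(data) / n

# One row per value: deviation, squared deviation, fourth-power deviation.
rows = [(x, x - mean, (x - mean) ** 2, (x - mean) ** 4) for x in data]
for x, d, d2, d4 in rows:
    print(f"{x:>3} {d:>6.1f} {d2:>6.1f} {d4:>7.1f}")

m2 = sum(r[2] for r in rows) / n   # population variance
m4 = sum(r[3] for r in rows) / n   # fourth central moment
sd = m2 ** 0.5
excess = m4 / m2 ** 2 - 3          # excess kurtosis (normal = 0)

print(f"mean={mean}, variance={m2}, sd={sd:.3f}, excess kurtosis={excess:.3f}")
```

Under these assumptions the mean is 5, the variance is 11.5, and the excess kurtosis is about −1.30, which agrees with the negative sign reported above.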

Step 1: Let's calculate kurtosis applying the bias correction.
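
One common bias adjustment, used by Excel's KURT and by many statistics packages, rescales the uncorrected excess kurtosis g2 as (n−1)/((n−2)(n−3)) · [(n+1)·g2 + 6]. Assuming that is the correction intended here, a minimal sketch for the dataset 1, 3, 6, 10 could look like this (the function name is illustrative):

```python
def adjusted_excess_kurtosis(data):
    """Bias-adjusted sample excess kurtosis; needs at least 4 values."""
    n = len(data)
    if n < 4:
        raise ValueError("need at least 4 values")
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    g2 = m4 / m2 ** 2 - 3                        # uncorrected excess kurtosis
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

print(round(adjusted_excess_kurtosis([1, 3, 6, 10]), 3))
```

For this dataset the adjusted value is about −0.768, still negative but closer to zero than the uncorrected figure of about −1.30.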






             2. Data:                 

             Goal: Compute mean, variance, standard deviation, and kurtosis (EXCEL).



               Figure 3

Most data cluster at 7, but the few extreme values (4, 5, 6, 8, 9, 10) create "heavy tails".

The distribution is leptokurtic. It has a high positive kurtosis, which reflects the relatively large number of outliers around a tightly clustered center.


3. Data:

Data        4   5   6   7   8   9   10
Frequency   2   2   4   4   4   2   2


               
            Goal: Compute mean, variance, standard deviation, and kurtosis (EXCEL).

Mean: 7
Standard deviation: 1.777
Variance: 3.158
Kurtosis: -0.671


The data are spread fairly evenly across the range, with no sharp peak or extreme values.

The distribution is platykurtic. It has a negative kurtosis, indicating that it is relatively flat and has fewer outliers than a normal distribution.
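
The figures for Example 3 can be cross-checked outside a spreadsheet. The sketch below assumes they were produced with Excel's sample-based functions (AVERAGE, STDEV, VAR and KURT) and uses the sample excess kurtosis formula that Excel documents for KURT; the variable names are illustrative.

```python
import statistics

# Frequency table from Example 3: value -> count.
freq = {4: 2, 5: 2, 6: 4, 7: 4, 8: 4, 9: 2, 10: 2}
data = [x for x, count in freq.items() for _ in range(count)]

n = len(data)
mean = statistics.mean(data)
variance = statistics.variance(data)   # sample variance (divides by n - 1)
sd = statistics.stdev(data)            # sample standard deviation

# Sample excess kurtosis in the form documented for Excel's KURT.
z4 = sum(((x - mean) / sd) ** 4 for x in data)
kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

print(mean, round(sd, 3), round(variance, 3), round(kurt, 3))
```

Under these assumptions the script gives 7, 1.777, 3.158 and −0.671, matching the values listed for this example.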






