Wednesday, July 02, 2025

Statistical Analysis of Experimental Data: Tukey’s HSD Test

 

The first step in the statistical analysis of experimental data is usually an Analysis of Variance (ANOVA), provided its assumptions are met. The hypothesis tested by ANOVA is the equality of population means across several groups:

                                H₀: μ₁ = μ₂ = ... = μₖ
                                H₁: At least two means differ.

However, ANOVA does not identify which specific groups have significantly different means.

When ANOVA yields a significant result, researchers typically turn to a multiple comparisons test to evaluate pairwise differences among group means. In this post, we will discuss one such test—Tukey’s test, likely the most commonly used in this context.

Key Features of Tukey’s HSD Test

Tukey’s HSD test allows for all possible pairwise comparisons (unplanned comparisons), meaning researchers do not need to pre-specify which comparisons will be made during the experimental design phase. For this reason, Tukey’s test is considered a post hoc test.

Procedure for Tukey’s HSD Test

To apply Tukey’s HSD test, we calculate the Honestly Significant Difference (HSD) between two means—the smallest difference required for them to be considered statistically different at a given significance level. The HSD is given by:

                                HSD = q(k, df_res, α) × √(MSE / r)

Where:

  • q(k, df_res, α) = Studentized range statistic (from tables, based on the number of groups k, the residual degrees of freedom df_res, and the significance level α).

  • MSE = Mean Square Error from ANOVA.

  • r = Number of replicates per group.

Two means are considered significantly different (at the chosen significance level) if their absolute difference is greater than or equal to the HSD.
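
For readers who prefer software to tables, the sketch below shows one way to compute the HSD in Python. It is a minimal example, assuming SciPy 1.7 or later is available; the function name tukey_hsd_threshold is ours, not a library API.

    import math

    from scipy.stats import studentized_range  # distribution of the studentized range (SciPy >= 1.7)


    def tukey_hsd_threshold(k, df_res, mse, r, alpha=0.05):
        """Smallest absolute difference between two group means that
        Tukey's HSD test declares significant at level alpha."""
        # Critical value q(k, df_res, alpha) of the studentized range
        q = studentized_range.ppf(1 - alpha, k, df_res)
        return q * math.sqrt(mse / r)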

How to Use the Studentized Range (q) Table

Below is an excerpt from the q-table. The bolded value corresponds to a comparison involving six treatments (k=6) and 24 residual degrees of freedom (df=24) at a 5% significance level (α=0.05).

    [Table excerpt: critical values of q for α = 5%]

💡 Note:

  • In statistical software and English literature, the term HSD (Honestly Significant Difference) is commonly used.

  • In Brazil, this is often called the minimum significant difference (represented by the Greek letter Δ).

  • The term Least Significant Difference (LSD) refers specifically to Fisher’s test, whereas Tukey’s test uses HSD, a name coined by its creator, John W. Tukey.

Example: Blood Pressure Reduction Study

Consider the blood pressure reduction data in Table 1, analyzed using ANOVA (Table 2). The F-test was significant at the 5% level, indicating that at least one mean differs from the others. Group means are shown in Table 3.

Table 1: Blood pressure reduction (mmHg)

Table 2: ANOVA

Table 3: Means of blood pressure reduction by group

Our goal is to identify which means differ significantly using Tukey’s HSD test.

HSD Calculation

  • q = 4.3727 (from q-table, k=6, df=24, α=5%)

  • MSE = 36.00 (Mean Square Error from ANOVA)

  • r = 5 (replicates per group)

HSD = 4.3727 × √(36.00 / 5) ≈ 11.72 mmHg
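
Plugging the same numbers into the tukey_hsd_threshold sketch given earlier reproduces this result (any small discrepancy comes from rounding the tabulated q):

    tukey_hsd_threshold(k=6, df_res=24, mse=36.00, r=5)   # ≈ 11.7 mmHg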

Pairwise Comparisons

We compare the means pairwise, marking significant differences (at α=5%) with an asterisk (*).
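
The screening itself is mechanical: every pair whose absolute difference reaches the HSD gets an asterisk. The sketch below illustrates the rule with made-up group means; they are placeholders for illustration only, not the values from Table 3.

    from itertools import combinations

    # Hypothetical means (mmHg), for illustration only -- not the Table 3 values
    means = {"A": 30.0, "B": 15.0, "Control": 12.0}
    hsd = 11.72  # threshold computed above

    for (g1, m1), (g2, m2) in combinations(means.items(), 2):
        diff = abs(m1 - m2)
        flag = "*" if diff >= hsd else ""
        print(f"{g1} vs {g2}: difference = {diff:.2f} {flag}")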


Conclusion

According to Tukey’s HSD test (α=5%):

  • The mean of Treatment A was significantly higher than those of B and the Control.

  • The mean of Treatment D was significantly higher than those of B, C, E, and the Control.

Sunday, June 01, 2025

Accuracy and precision in measurements

 

Accuracy refers to the degree of agreement between the result of a measurement and the true value of the quantity being measured. The more accurate the system, the closer the result is to the true value.

In practice, the true value is not known. Accuracy therefore describes how close the measurement result is to a reference value, a standard, or a recognized measurement technique.

Precision refers to the degree to which repeated measurements of the same quantity yield similar results. These values differ due to random errors. The greater the precision, the smaller the dispersion of the data.

It is important to understand that accuracy and precision are different concepts. The classic target-shooting illustrations of the distinction can be summarized as follows:


a. Points scattered and far from the center indicate neither accuracy nor precision.

b. Points scattered but centered on average indicate accuracy without precision.

c. Points clustered together but far from the center indicate precision without accuracy.

d. Points clustered tightly at the center of the target indicate both accuracy and precision.

 

Assessing accuracy and precision

To assess accuracy and precision, the same quantity must be measured repeatedly:

🔺 Accuracy is assessed by comparing the mean of the measurement results to the true value.
🔺 Precision is assessed by the standard deviation of the measurements.

                                                Example

You measured a part three times and obtained:
                                    15.0 in; 15.1 in; 14.9 in
There is precision, and you can write: (15.0 ± 0.1) in
But if the true value is 40.0 cm, is it accurate?
Since 1 inch = 2.54 cm, then 15.0 × 2.54 = 38.1 cm → Lacks accuracy.


                            Accuracy and precision are distinct attributes

🔺 Accuracy indicates how close the measurements are to the true (or reference) value.
🔺 Precision indicates how close the measurements are to each other, even if they are far from the true value.

                                             Example 

Let’s consider another case of experimental measurements. A test sample with a true mass of 100 mg is measured.

 If the results are:

                                         98.5; 98.6; 98.7; 98.5

the measurements are precise, but not accurate.


If the results are:

                                       99.6; 101.6; 99.6; 100.5

the measurements are accurate, but not precise.
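
To make the classification concrete, the mean (accuracy) and the sample standard deviation (precision) of each set can be computed. A minimal sketch using Python's statistics module, with our own labels:

    from statistics import mean, stdev

    true_mass = 100.0  # mg
    datasets = {
        "precise, not accurate": [98.5, 98.6, 98.7, 98.5],
        "accurate, not precise": [99.6, 101.6, 99.6, 100.5],
    }

    for label, data in datasets.items():
        print(f"{label}: mean = {mean(data):.2f} mg "
              f"(true value = {true_mass} mg), s = {stdev(data):.2f} mg")

The first set has a tiny standard deviation but a mean well below 100 mg; the second has a mean close to 100 mg but a much larger spread.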


Important note


Instruments are usually precise, but due to wear or mishandling, they may no longer be calibrated. Therefore, don’t assume a result is correct just because it is precise – it also needs to be accurate.

To assess accuracy and precision in measurement results, two statistics are used: bias and standard deviation.

🔺 Bias is the difference between the reference value and the mean of the obtained measurements (under the same conditions).
🔺 Standard deviation, denoted by s, is a measure of data dispersion within a sample.

                                        Example (revisited)

     You measured a part three times and obtained:

                                    15.0 in; 15.1 in; 14.9 in

The mean is 15.0 in. The true value is 40.0 cm = 15.7 in
                                       Bias = 15.7 – 15.0 = 0.7 in

To calculate the standard deviation, which measures the spread of the values, you can use Excel or compute it by hand, remembering that a deviation is the difference between each observed value and the mean. With a mean of 15.0 in, the deviations are 0.0, +0.1, and -0.1 in; their squares sum to 0.02, so the sample standard deviation is s = √(0.02 / 2) = 0.1 in.
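
The same arithmetic can be done with Python's standard library; a minimal sketch with the three readings above (no external packages required):

    from statistics import mean, stdev

    measurements_in = [15.0, 15.1, 14.9]   # readings, in inches
    true_value_in = 40.0 / 2.54            # 40.0 cm expressed in inches (about 15.7 in)

    x_bar = mean(measurements_in)          # 15.0 in
    bias = true_value_in - x_bar           # about 0.7 in (reference minus mean, as defined above)
    s = stdev(measurements_in)             # sample standard deviation, 0.1 in

    print(f"mean = {x_bar:.1f} in, bias = {bias:.1f} in, s = {s:.1f} in")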


Theoretical Illustration

      Now, look at the theoretical illustrations below, which represent measurement errors:


🔺 In the first figure, you see the results of an infinite number of measurements made by two operators. Both obtained accurate results (bias equals zero), but the one represented by the red curve showed greater precision than the one represented by the black curve.


🔺 In the second figure, you also see results from two operators. Both achieved equal precision, but the accuracy of the operator represented by the right-hand curve is higher — assuming the true value is marked in red.

 


Saturday, May 31, 2025

The Importance of Resolution in Evaluating Measurement Uncertainty

Resolution is the smallest variation that can be detected by a measuring instrument. For example, a measuring tape with markings every 1 cm has a resolution of 1 cm — meaning it registers changes in 1 cm increments.

Resolution represents the smallest measurable increment: it is the minimum difference that can produce a detectable change in the reading.

If you try to measure the diameter of a keyhole using a measuring tape, you will get the same result every time — even after 100 repetitions — because the resolution of the tape is not sufficient to detect such small variations.

Examples:

       ·        If the instrument measures in 1-unit steps, any value between 6.5 and 7.5 will be recorded as 7.                                                                                 

                                                                              

        ·        If it measures in 2-unit steps, values between 7 and 9 will be recorded as 8.
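
In effect, a resolution-limited instrument rounds every reading to the nearest step. The short sketch below mimics that behaviour; round_to_resolution is our own helper, not a standard function.

    def round_to_resolution(value, resolution):
        """Simulate an instrument that can only report multiples of its resolution."""
        return round(value / resolution) * resolution

    print(round_to_resolution(6.8, 1))   # 7 -> 1-unit steps: anything between 6.5 and 7.5 reads as 7
    print(round_to_resolution(7.9, 2))   # 8 -> 2-unit steps: anything between 7 and 9 reads as 8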

  

🔎 Measurement Uncertainty

In a previous post (Type B Evaluation of Measurement Uncertainty), we explained that a measurement should be reported as:

                                                         (X ± ΔX) unit

Where:

             ·        X is the best estimate of the measured value;

             ·        ΔX is the associated uncertainty.

This means that if the measurement is repeated under the same conditions, the result is expected to fall within the range:

                                                 (X - ΔX) to (X + ΔX)

To evaluate the uncertainty of a single measurement, it is essential to consider the resolution of the instrument used.


👶 Example: length of a newborn

Suppose you know only that a newborn measured 50 cm in length. That information alone is not enough to estimate the uncertainty.

However, if you also know that the measurement was taken using a ruler graduated in centimeters, you can state that the actual length was between 49.5 cm and 50.5 cm. The result should be expressed as:

                                                          (50.0 ± 0.5) cm

In this case, the uncertainty arises solely from the instrument’s limitation, assuming it is properly calibrated. Because only one measurement was taken, the uncertainty is not statistical and is classified as Type B.

 


⚖️ Example: mass of the newborn

Now suppose the scale used has divisions every 10 g (0.01 kg). If the reading is 3.54 kg, the actual mass is between 3.535 kg and 3.545 kg.

This measurement should be written as:

                                                           (3.54 ± 0.005) kg
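
Both examples apply the same rule: for a single reading, the Type B uncertainty is taken as half of the instrument’s smallest division. A minimal sketch of that rule (the helper name is ours):

    def type_b_from_resolution(reading, resolution):
        """Return a single reading together with half the resolution as its uncertainty."""
        return reading, resolution / 2

    print(type_b_from_resolution(50.0, 1.0))    # ruler graduated in cm -> (50.0, 0.5), i.e. (50.0 ± 0.5) cm
    print(type_b_from_resolution(3.54, 0.01))   # scale with 10 g steps -> (3.54, 0.005), i.e. (3.54 ± 0.005) kg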


📐 Absolute and Relative Uncertainty

In the expression X ± ΔX, the term ΔX is called the absolute uncertainty (formerly known as the absolute error).

Relative uncertainty tells us how significant the absolute uncertainty is compared to the measured value. It is calculated as the ratio of the absolute uncertainty to the measured value:

                                         Relative uncertainty = ΔX / X (often expressed as a percentage, ΔX / X × 100%)

🚗 Example: speedometer reading

Let’s say a car speedometer is marked in increments of 2 km/h. If you read 60 km/h:

             ·        The absolute uncertainty is half the smallest division: 1 km/h

             ·        The relative uncertainty is 1 km/h ÷ 60 km/h ≈ 0.017, or about 1.7%.

Relative uncertainty is dimensionless, because the units cancel out. This makes it especially useful for comparing the precision of different physical quantities.


📊 Example: mass or length?

Back to the newborn: which measurement had greater uncertainty?

     Length: 0.5 cm / 50.0 cm = 0.01 = 1%

     Mass: 0.005 kg / 3.54 kg ≈ 0.0014 = 0.14%

➡️ The length measurement had greater relative uncertainty.
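
The comparison is easy to reproduce in code; a minimal sketch (the helper name relative_uncertainty is ours):

    def relative_uncertainty(absolute_uncertainty, value):
        """Dimensionless ratio of the absolute uncertainty to the measured value."""
        return absolute_uncertainty / value

    print(f"speedometer: {relative_uncertainty(1, 60):.1%}")          # about 1.7%
    print(f"newborn length: {relative_uncertainty(0.5, 50.0):.1%}")   # 1.0%
    print(f"newborn mass: {relative_uncertainty(0.005, 3.54):.2%}")   # about 0.14%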


These definitions apply to Type B uncertainty, but are equally valid in the context of Type A uncertainty, which is based on statistical analysis.


✏️ Practice Exercises

          1.     You measured a child’s height with a ruler graduated in centimeters and got 80 cm. What is the uncertainty?

          2.     You measured a child’s temperature with a thermometer marked in 2 °C increments and got 38 °C. What is the uncertainty?

Answers

            1.     Height = (80 ± 0.5) cm
            Relative uncertainty = (0.5 / 80) × 100 = 0.625%

           2.     Temperature = (38 ± 1) °C
            Relative uncertainty = (1 / 38) × 100 = 2.63%
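
Both answers can be checked with the relative_uncertainty helper sketched earlier:

    print(f"height: (80 ± 0.5) cm, relative = {relative_uncertainty(0.5, 80):.3%}")    # 0.625%
    print(f"temperature: (38 ± 1) °C, relative = {relative_uncertainty(1, 38):.2%}")   # about 2.63%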