
4.3 – Calculating Sum of Squares

There are two ways of calculating a Sum of Squares (SS):

  1. The definitional formula
  2. The computational formula.

Sum of Squares – Definitional Formula

You have technically already learned the definitional formula. It is called the “definitional” formula because it fits the definition of Sum of Squares, which is “the sum of the squared deviations.”

[latex]\text{Sum of Squares (Definitional Formula)}=SS=\Sigma(X-\mu)^2[/latex]

Remember that the symbol Σ (the Greek letter, “Sigma”) means “sum of,” while (X-μ) is a deviation (the difference between a score, X, and the mean, μ). Thus, you have the sum of the squared deviations.

To calculate the Sum of Squares using the definitional formula, follow these steps:

  • Add up all the scores in the distribution (ΣX)
  • Calculate the mean for the distribution (μ)
  • Calculate the deviations for every score (X – μ)
  • Square each of the deviations (X – μ)²
  • Add up the squared deviations Σ(X – μ)²

Suppose we had the following set of scores:

3, 4, 5, 5, 8

It’s easiest to start by putting the scores in a table and adding them up (ΣX):

X
8
5
5
4
3
[latex]\Sigma X = 25[/latex]

We can then calculate the mean (μ) for this group of scores:

[latex]\mu = \frac{\Sigma X}{N} = \frac{25}{5} = 5[/latex]

Now that we know the mean is μ = 5, we can add on to the table to calculate the deviations, squared deviations, and Sum of Squares:

| X | X – μ | (X – μ)² |
| --- | --- | --- |
| 8 | 8 – 5 = +3 | (+3)² = 9 |
| 5 | 5 – 5 = 0 | (0)² = 0 |
| 5 | 5 – 5 = 0 | (0)² = 0 |
| 4 | 4 – 5 = -1 | (-1)² = 1 |
| 3 | 3 – 5 = -2 | (-2)² = 4 |
| [latex]\Sigma X = 25[/latex] | Sum = 0 | SS = 14 |
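The same steps can be sketched in a few lines of Python. This is only an illustration of the definitional formula: the function name `sum_of_squares_definitional` is made up for this example and is not part of any library.

```python
def sum_of_squares_definitional(scores):
    """Sum of squared deviations from the mean: SS = Σ(X - μ)²."""
    n = len(scores)
    mean = sum(scores) / n                    # μ = ΣX / N
    deviations = [x - mean for x in scores]   # X - μ for every score
    squared = [d ** 2 for d in deviations]    # (X - μ)² for every score
    return sum(squared)                       # Σ(X - μ)²


scores = [8, 5, 5, 4, 3]
print(sum_of_squares_definitional(scores))  # 14.0
```

The result matches the SS = 14 in the bottom row of the table.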

Sum of Squares – Computational Formula

Instead of using the definitional formula to calculate the Sum of Squares, we can use the computational formula of the Sum of Squares:

[latex]\text{Sum of Squares (Computational Formula)}=SS=\Sigma X^2-\frac{(\Sigma X)^2}{N}[/latex]
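Although it looks different, this formula can be obtained from the definitional formula by expanding the squared deviation and substituting μ = ΣX/N. The following is a sketch of that algebra, using only the symbols already defined:

[latex]\Sigma(X-\mu)^2=\Sigma(X^2-2\mu X+\mu^2)=\Sigma X^2-2\mu\Sigma X+N\mu^2[/latex]

[latex]=\Sigma X^2-\frac{2(\Sigma X)^2}{N}+\frac{(\Sigma X)^2}{N}=\Sigma X^2-\frac{(\Sigma X)^2}{N}[/latex]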

This formula is mathematically equivalent to the definitional formula, although it does not appear on its face to have anything to do with deviations or squared deviations. Its advantage is that it makes the computations easier (which is why it is called the “computational formula”). Sometimes when calculating the Sum of Squares we encounter a mean that is not a whole number (e.g., 3.177777 or 5.125). When this happens, the definitional formula involves a lot of calculations with decimals, because we have to calculate each score’s deviation by subtracting the mean from that score. Each of those deviations is then a decimal as well, and so is each squared deviation. If you round the numbers to make the calculations more workable, the Sum of Squares will end up being less accurate.

The computational formula, on the other hand, doesn’t have this problem because it never uses the mean for the distribution in the calculation. As a result, it can avoid most of the calculations with decimals until the very end so that rounding won’t be an issue.
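To make the rounding issue concrete, here is a small Python sketch. The scores 1, 2, 4, 6 are made-up illustration data (they do not come from this chapter), chosen so that the mean is 3.25, and the helper name `ss_about` is hypothetical. Exact fractions are used so that the only error shown is the one caused by rounding the mean to 3.3.

```python
from fractions import Fraction


def ss_about(scores, center):
    """Sum of squared differences between each score and `center`."""
    return sum((Fraction(x) - center) ** 2 for x in scores)


scores = [1, 2, 4, 6]                             # hypothetical data, mean = 3.25
exact_mean = Fraction(sum(scores), len(scores))   # 13/4
rounded_mean = Fraction(33, 10)                   # the mean rounded to 3.3

ss_exact = ss_about(scores, exact_mean)           # definitional formula, exact mean
ss_rounded = ss_about(scores, rounded_mean)       # definitional formula, rounded mean
# Computational formula: ΣX² - (ΣX)² / N, which never uses the mean
ss_computational = sum(x * x for x in scores) - Fraction(sum(scores) ** 2, len(scores))

print(float(ss_exact))          # 14.75
print(float(ss_rounded))        # 14.76
print(float(ss_computational))  # 14.75
```

Rounding the mean by only 0.05 shifts the Sum of Squares from 14.75 to 14.76, while the computational formula gives 14.75 without ever calculating the mean.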

To calculate the Sum of Squares using the computational formula, follow these steps:

  • Add up all the scores in the distribution (ΣX)
  • Square each score in the distribution (X²)
  • Add up all the squared scores (ΣX²)
  • Plug the ΣX and ΣX2 results in the computational formula of Sum of Squares

Let’s use our same data, but this time use the computational formula:

| X | X² |
| --- | --- |
| 8 | 8² = 64 |
| 5 | 5² = 25 |
| 5 | 5² = 25 |
| 4 | 4² = 16 |
| 3 | 3² = 9 |
| [latex]\Sigma X = 25[/latex] | [latex]\Sigma X^2 = 139[/latex] |

Now we just plug the ΣX and ΣX² into the Sum of Squares computational formula:

[latex]\text{Sum of Squares (Computational Formula)}=SS=\Sigma X^2-\frac{(\Sigma X)^2}{N}[/latex]

[latex]=139 - \frac{(25)^2}{5} = 139 - \frac{625}{5} = 139 - 125 = 14[/latex]

As you can see, the Sum of Squares equals 14 with the computational formula, just like we got with the definitional formula.
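The computational steps can also be sketched in Python. As before, this is only an illustration, and the function name `sum_of_squares_computational` is made up for this example.

```python
def sum_of_squares_computational(scores):
    """Computational formula: SS = ΣX² - (ΣX)² / N."""
    n = len(scores)
    sum_x = sum(scores)                          # ΣX
    sum_x_squared = sum(x ** 2 for x in scores)  # ΣX²
    return sum_x_squared - sum_x ** 2 / n        # ΣX² - (ΣX)² / N


scores = [8, 5, 5, 4, 3]
print(sum_of_squares_computational(scores))  # 14.0
```

Note that the mean never appears in the function: only ΣX, ΣX², and N are needed.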

Which Version of Sum of Squares Should I Use?

It can be helpful to be comfortable calculating the Sum of Squares with either formula, but it is not necessary. They each have their own advantages and disadvantages:

| | Definitional Formula | Computational Formula |
| --- | --- | --- |
| Advantages | Clearer what the Sum of Squares is calculating | Easier to calculate because there are fewer calculations involving fractions |
| Disadvantages | Can be a mess to calculate if the mean is a fraction | Not clear what the Sum of Squares calculation means |

However, once you have a decent sense of what the Sum of Squares calculates, the computational formula will be the easier choice for most calculations. From this point on, this textbook will typically use the computational formula to calculate the Sum of Squares.

How is Sum of Squares Used?

For now, we are simply going to use the Sum of Squares as a step toward calculating the variance and then the standard deviation. Standard deviation is our best descriptive measure of spread, so that is the ultimate goal. However, the Sum of Squares is itself a decent measure of spread. In fact, later in the textbook, you will see that it is used in a number of our inferential statistics.

With the Sum of Squares in hand, we can now move on to the next steps: calculating the variance and then calculating the standard deviation.

License


Introduction to Statistics and Statistical Thinking Copyright © 2022 by Eric Haas is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.
