"

8.9 – Step 5: Make a Statistical Decision

5. Make a Statistical Decision

In this step, we will combine some of the previous steps. Specifically, we will take the result of Step 4, where we calculated the test statistic for our sample mean, which resulted in a z-score of z = -2.25, and see if that z-score falls in the critical region we determined in Step 2.

Essentially, in order to answer the question, “Is this sample mean substantially different from what we expect?” we will use the “line in the sand” that we drew in Step 2 and see if the test statistic for our sample mean crossed that line. In other words, is our test statistic in the critical region?

Let’s use the example from Step 2 of a directional hypothesis with an alpha of α = 0.05, where we determined that the “line in the sand” marking our critical region was the critical value of z = -1.65:

Distribution of sample means with a mean of 50, standard error of 1, and a shaded tail to the left depicting the 5% extremely low sample means, with an additional horizontal axis for the z-scores, with a tick mark at the cutoff for the 5% labeled -1.65

This graph shows us that any sample mean that has a z-score greater than -1.65 (to the right of -1.65) does not fall into the extreme 5% of low scores. As a result, a sample mean with a z-score greater than -1.65 would not be considered substantially different from what the null hypothesis would predict. Based on how that sample of participants performed, we would then make the educated guess that any difference between their sample mean and the predicted mean was probably due to sampling error. In other words, it seems to support the null hypothesis.

On the other hand, any sample mean that has a z-score less than or equal to -1.65 (to the left of -1.65) does fall into the extreme 5% of low scores. As a result, a sample mean with a z-score less than or equal to -1.65 would be considered substantially different from what the null hypothesis would predict. Based on how that sample of participants performed, we would then make the educated guess that their sample mean was probably not due to sampling error. In other words, it seems to conflict with the null hypothesis.
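The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `left_tailed_decision` is hypothetical, and the numbers are the z = -2.25 from Step 4 and the critical value of -1.65 from Step 2.

```python
def left_tailed_decision(z_statistic, z_critical=-1.65):
    """Return the decision for a left-tailed (directional) z-test.

    z_critical defaults to -1.65, the cutoff used in the text for alpha = 0.05.
    """
    # The critical region is everything at or below the critical value.
    if z_statistic <= z_critical:
        return "reject the null"
    return "fail to reject the null"


# Test statistic from Step 4.
print(left_tailed_decision(-2.25))
# A hypothetical sample mean closer to what the null hypothesis predicts.
print(left_tailed_decision(-1.00))
```

Because -2.25 lies to the left of -1.65, the sample mean from Step 4 falls in the critical region, while the hypothetical z-score of -1.00 does not.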

What is nice about this step (but also one of the concerns about NHST) is that the researcher’s decision is black-and-white. If the test statistic calculated in Step 4 does not cross the line(s) determined in Step 2, you make one decision, but if it does cross the line(s), you make another decision.

What is difficult about this step is that the wording of the decisions is somewhat confusing due to the use of double-negatives and triple-negatives. The two possible decisions that the researcher can make are:

  • Reject the Null Hypothesis
  • Fail to Reject the Null Hypothesis

Reject the Null Hypothesis

We will reject the null hypothesis (or “reject the null”) when the test statistic is in the critical region. In other words, the sample mean was substantially different from what the null hypothesis would predict, and thus, we conclude that the null hypothesis is probably wrong.

Because the null hypothesis usually argues that there is no effect, difference, or correlation, we are, in essence, saying that our data may support the idea of an effect, difference, or correlation.

The phrase “reject the null hypothesis” is a double-negative. It can be understood to be saying: “Say no (reject) to there being no effect, difference, or correlation (null hypothesis).” We call this a double-negative because there are two “no’s” in the sentence. Those two “no’s” can essentially be seen to cancel each other out, and we are left with: “Say there is an effect, difference, or correlation.”

However, we can’t actually argue that we’ve proved that there is an effect, difference, or correlation. At best, we could say that our data indicates that we shouldn’t dismiss the idea that there might be an effect, difference, or correlation. Further research on the alternative hypothesis would be warranted.

Fail to Reject the Null Hypothesis

We will fail to reject the null hypothesis (or “retain the null”) when the test statistic is not in the critical region. In other words, the sample mean was not substantially different from what the null hypothesis would predict, and thus, we conclude that the null hypothesis is probably correct.

Because the null hypothesis usually argues that there is no effect, difference, or correlation, we are, in essence, saying that our data may support the idea that there is no effect, difference, or correlation.

The phrase “fail to reject the null hypothesis” is a triple-negative. It can be understood to be saying: “Say no (fail) to saying no (reject) to there being no effect, difference, or correlation (null hypothesis).” We call this a triple-negative because there are three “no’s” in the sentence. Two of those “no’s” can essentially be seen to cancel each other out, and we are left with: “Say no to there being an effect, difference, or correlation.”

However, we can’t actually argue that we’ve proved that there is no effect, difference, or correlation. At best, we could say that our data indicates that there might not be an effect, difference, or correlation. Further research on the alternative hypothesis would be questionable unless there were flaws in the research methodology.

Directional or One-Tailed Hypothesis Decisions

If a researcher is using a directional or one-tailed hypothesis, there will only be one critical region. As discussed in Step 2, the location of that tail (to the left or to the right) will be determined by the alternative hypothesis.

For example, in our study exploring the impact of sleep deprivation on memory, the alternative hypothesis predicts that sleep deprivation will reduce memory, so the tail would be to the left. We would end up with the following regions:

Distribution of sample means with a mean of 50, standard error of 1, and a shaded tail to the left depicting the 5% extremely low sample means, with an additional horizontal axis for the z-scores, with a tick mark at the cutoff for the 5% labeled -1.65. The side to the left of -1.65 is labeled the "critical region" and indicates that you would conclude "reject the null" and "probably an effect, difference, or correlation." It also indicates you would report that the result is "statistically significant" and "p < alpha." The side to the right of -1.65 is labeled the "non-critical region" and indicates that you would conclude "fail to reject the null" and "probably not an effect, difference, or correlation." It also indicates you would report that the result is "not statistically significant" and "p > alpha."

As you can see in this picture, if you end up with a sample mean that has a z-score that is greater than -1.65 (did not cross “the line in the sand”), that would tell you that your sample is consistent with the null hypothesis, and you would “fail to reject the null hypothesis” or “retain the null.”

On the other hand, if you end up with a sample mean that has an extremely low z-score that is less than or equal to -1.65 (crossed “the line in the sand”), that would tell you that your sample is inconsistent with the null hypothesis, and you would “reject the null hypothesis” or “reject the null.”
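The "p < alpha" label in the figure can also be checked numerically. The sketch below is an illustration rather than part of the original procedure: it uses Python's standard `statistics.NormalDist` to find the proportion of the standard normal distribution at or below the z-score from Step 4, and the variable names are assumptions chosen for this example.

```python
from statistics import NormalDist

ALPHA = 0.05          # significance level chosen in Step 2
z_statistic = -2.25   # test statistic from Step 4

# For a left-tailed test, the p-value is the area under the standard
# normal curve at or below the test statistic.
p_value = NormalDist().cdf(z_statistic)

print(f"p = {p_value:.4f}")
if p_value < ALPHA:
    print("p < alpha: statistically significant, reject the null")
else:
    print("p >= alpha: not statistically significant, fail to reject the null")
```

Under these assumptions the p-value comes out to roughly 0.012, which is below 0.05, so this check agrees with comparing z = -2.25 to the critical value of -1.65.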

Remember that directional or one-tailed hypotheses specify the direction of an effect, difference, or correlation, so it is also possible to specify a positive effect, difference, or correlation, resulting in a tail to the right:

Distribution of sample means with a mean of 50, standard error of 1, and a shaded tail to the right depicting the 5% extremely high sample means, with an additional horizontal axis for the z-scores, with a tick mark at the cutoff for the 5% labeled +1.65. The side to the right of +1.65 is labeled the "critical region" and indicates that you would conclude "reject the null" and "probably an effect, difference, or correlation." It also indicates you would report that the result is "statistically significant" and "p < alpha." The side to the left of +1.65 is labeled the "non-critical region" and indicates that you would conclude "fail to reject the null" and "probably not an effect, difference, or correlation." It also indicates you would report that the result is "not statistically significant" and "p > alpha."

As you can see in this picture, if you end up with a sample mean that has a z-score that is less than +1.65 (did not cross “the line in the sand”), that would tell you that your sample is consistent with the null hypothesis, and you would “fail to reject the null hypothesis” or “retain the null.”

On the other hand, if you end up with a sample mean that has an extremely high z-score that is greater than or equal to +1.65 (crossed “the line in the sand”), that would tell you that your sample is inconsistent with the null hypothesis, and you would “reject the null hypothesis” or “reject the null.”

Non-Directional or Two-Tailed Hypothesis Decisions

If a researcher is using a non-directional or two-tailed hypothesis, there will be two critical regions. Those two tails will be at both extremes (the extreme low scores and the extreme high scores).

For example, in our study exploring the impact of sleep deprivation on memory, the non-directional or two-tailed alternative hypothesis predicts that sleep deprivation will impact memory. That means that it can either increase memory or decrease memory, so the tails will include both of those extremes. We would end up with the following regions:

Distribution of sample means with a mean of 50, standard error of 1, and two shaded tails to the left and right depicting the 2.5% extremely low sample means and the 2.5% extremely high sample means, with an additional horizontal axis for the z-scores, with tick marks at the cutoffs for the two tails labeled -1.96 and +1.96. The sides to the left of -1.96 and to the right of +1.96 are labeled the "critical region" and indicate that you would conclude "reject the null" and "probably an effect, difference, or correlation." They also indicate you would report that the result is "statistically significant" and "p < alpha." The middle between -1.96 and +1.96 is labeled the "non-critical region" and indicates that you would conclude "fail to reject the null" and "probably not an effect, difference, or correlation." It also indicates you would report that the result is "not statistically significant" and "p > alpha."

As you can see in this picture, if you end up with a sample mean that has a z-score that is greater than -1.96 and less than +1.96 (did not cross either of “the lines in the sand”), that would tell you that your sample is consistent with the null hypothesis, and you would “fail to reject the null hypothesis” or “retain the null.”

On the other hand, if you end up with a sample mean that has an extreme z-score that is less than or equal to -1.96 or greater than or equal to +1.96 (crossed one of “the lines in the sand”), that would tell you that your sample is inconsistent with the null hypothesis, and you would “reject the null hypothesis” or “reject the null.”
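The one-tailed and two-tailed decision rules can be gathered into a single sketch. This is a hedged illustration, not the author's procedure: the function name `decide` and the tail labels "left", "right", and "two" are hypothetical, and the critical values are computed with Python's standard `statistics.NormalDist` instead of being read from a table. The exact cutoffs are approximately -1.645 and ±1.960, which the text rounds to -1.65 and ±1.96.

```python
from statistics import NormalDist


def decide(z_statistic, alpha=0.05, tail="two"):
    """Return the NHST decision for a z-test.

    tail is "left", "right", or "two" (hypothetical labels for this sketch).
    """
    standard_normal = NormalDist()
    if tail == "left":
        # All of alpha sits in the low tail.
        in_critical_region = z_statistic <= standard_normal.inv_cdf(alpha)
    elif tail == "right":
        # All of alpha sits in the high tail.
        in_critical_region = z_statistic >= standard_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        # Alpha is split evenly between the two tails.
        cutoff = standard_normal.inv_cdf(1 - alpha / 2)
        in_critical_region = abs(z_statistic) >= cutoff
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return "reject the null" if in_critical_region else "fail to reject the null"


# The z-score from Step 4 is extreme enough under either kind of hypothesis.
print(decide(-2.25, tail="left"))
print(decide(-2.25, tail="two"))
# A hypothetical z-score of -1.80 crosses -1.65 but not -1.96.
print(decide(-1.80, tail="left"))
print(decide(-1.80, tail="two"))
```

The hypothetical z-score of -1.80 shows why the choice between a directional and a non-directional hypothesis matters: the same sample leads to "reject the null" with one tail but "fail to reject the null" with two.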

License

Introduction to Statistics and Statistical Thinking Copyright © 2022 by Eric Haas is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.
