Chapter 17: Linear Regression

Alisa Beyer

In chapter 14, we learned about ANOVA, which involves a new way of looking at how our data are structured and the inferences we can draw from that. In chapter 16, we learned about correlations, which analyze two continuous variables at the same time to see if they systematically relate in a linear fashion. In this chapter, we will combine these two techniques in an analysis called simple linear regression, or regression for short. Regression uses the technique of variance partitioning from ANOVA to more formally assess the types of relations looked at in correlations. Regression is the most general and most flexible analysis covered in this book, and we will only scratch the surface.

A major practical application of statistical methods is making predictions. Psychologists often call this kind of prediction regression. Regression literally means going back or returning. We use the term regression here because the predicted score on the criterion variable is closer (in terms of standard deviation units) to the mean of the criterion variable compared to the distance from the value of the predictor variable to the mean of the predictor variable. So we can think of this in terms of the predicted value of the criterion variable regressing, or going back, toward the mean of the criterion variable. Again, the concepts in this chapter are directly related to correlation. This is because if two variables are correlated it means that we can predict one from the other. So if sleep the night before is correlated with happiness the next day, this means that we should be able, to some extent, to predict how happy a person will be the next day from knowing how much sleep the person got the night before. The concepts in the chapter are also related to ANOVA as the goal of regression is the same as the goal of ANOVA: to take what we know about one variable (X) and use it to explain our observed differences in another variable (Y) – we are just using two continuous variables.

Line of Best Fit

In correlations, we referred to a linear trend in the data. That is, we assumed that there was a straight line we could draw through the middle of our scatterplot that would represent the relation between our two variables, X and Y. Regression involves solving for the equation of that line, which is called the Line of Best Fit.

The line of best fit can be thought of as the central tendency of our scatterplot. The term “best fit” means that the line is as close to all points (with each point representing both variables for a single person) in the scatterplot as possible, with a balance of scores above and below the line. This is the same idea as the mean, which has an equal weighting of scores above and below it and is the best singular descriptor of all our data points for a single variable.

We have already seen many scatterplots in chapters 3 and 16, so we know by now that no scatterplot has points that form a perfectly straight line. Because of this, when we put a straight line through a scatterplot, it will not touch all of the points, and it may not even touch any! This will result in some distance between the line and each of the points it is supposed to represent, just like a mean has some distance between it and all of the individual scores in the dataset.

The distances between the line of best fit and each individual data point go by two different names that mean the same thing: errors and residuals. The term “error” in regression is closely aligned with the meaning of error in statistics (think standard error or sampling error); it does not mean that we did anything wrong, it simply means that there was some discrepancy or difference between what our analysis produced and the true value we are trying to get at. The term “residual” is new to our study of statistics, and it takes on a very similar meaning in regression to what it means in everyday parlance: there is something left over. In regression, what is “left over” – that is, what makes up the residual – is an imperfection in our ability to predict values of the Y variable using our line. This definition brings us to one of the primary purposes of regression and the line of best fit: predicting scores.

Prediction

The goal of regression is the same as the goal of ANOVA: to take what we know about one variable (X) and use it to explain our observed differences in another variable (Y). In ANOVA, we talked about – and tested for – group mean differences, but in regression we do not have groups for our explanatory variable; we have a continuous variable, like in correlation. Because of this, our vocabulary will be a little bit different, but the process, logic, and end result are all the same.

In regression, we most frequently talk about prediction, specifically predicting our outcome variable Y from our explanatory variable X, and we use the line of best fit to make our predictions. Let’s take a look at the equation for the line, which is quite simple.

Regression equation

Ŷ = a + bX

The terms in the equation are defined as:

Ŷ: the predicted value of Y for an individual person

a: the intercept of the line

b: the slope of the line

X: the observed value of X for an individual person

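Additionally, we have formulas for a and b, where SP is the sum of products of the X and Y deviation scores and SS_X is the sum of squared deviations for X:

b = SP/SS_X = cov_XY/s_X²

a = ̅Y − b̅X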

What this shows us is that we will use our known value of X for each person to predict the value of Y for that person. The predicted value, Ŷ, is called “y-hat” and is our best guess for what a person’s score on the outcome is. Notice also that the form of the equation is very similar to very simple linear equations that you have likely encountered before and has only two parameter estimates: an intercept (where the line crosses the Y-axis) and a slope (how steep – and the direction, positive or negative – the line is). These are parameter estimates because, like everything else in statistics, we are interested in approximating the true value of the relation in the population but can only ever estimate it using sample data. We will soon see that one of these parameters, the slope, is the focus of our hypothesis tests (the intercept is only there to make the math work out properly and is rarely interpretable).

It is very important to point out that the Y values in the equations for a and b are our observed Y values in the dataset, NOT the predicted Y values (Ŷ) from our equation for the line of best fit. Thus, we will have 3 values for each person: the observed value of X (X), the observed value of Y (Y), and the predicted value of Y (Ŷ). You may be asking why we would try to predict Y if we have an observed value of Y, and that is a very reasonable question. The answer has two explanations: first, we need to use known values of Y to calculate the parameter estimates in our equation, and we use the difference between our observed values and predicted values (Y – Ŷ) to see how accurate our equation is; second, we often use regression to create a predictive model that we can then use to predict values of Y for other people for whom we only have information on X.
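As a rough illustration of how the three values for each person fit together, here is a minimal Python sketch. The data are made up for illustration (they are not from this chapter), the function name `fit_line` is hypothetical, and the slope and intercept are computed as the sum of products over the sum of squares of X and as the mean of Y minus the slope times the mean of X, the same forms used in the worked example later in this chapter.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line Y-hat = a + b*X (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations of X
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


# Made-up scores for five people (illustration only)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
predicted = [a + b * x for x in xs]                          # Y-hat for each person
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]   # Y minus Y-hat

for x, y, y_hat, e in zip(xs, ys, predicted, residuals):
    print(f"X={x}  Y={y}  Y-hat={y_hat:.2f}  residual={e:+.2f}")
```

With these made-up numbers the sketch gives b = 0.6 and a = 2.2, and the residuals sum to zero. The same `a` and `b` could then be applied to a new person for whom only X is known.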

Applied examples for using regression

Example 1: Businesses often have more applicants for a job than they have openings available, so they want to know who among the applicants is most likely to be the best employee. There are many criteria that can be used, but one is a personality test for conscientiousness, with the belief being that more conscientious (more responsible) employees are better than less conscientious employees. A business might give their employees a personality inventory to assess conscientiousness and compare the results with existing performance data to look for a relation. In this example, we have known values of the predictor (X, conscientiousness) and outcome (Y, job performance), so we can estimate an equation for a line of best fit and see how accurately conscientiousness predicts job performance, then use this equation to predict future job performance of applicants based only on their known values of conscientiousness from personality inventories given during the application process.

Example 2: Assume a researcher is interested in examining whether SAT scores can be an accurate predictor of college GPA. In this case, SAT scores would be the predictor variable or X and college GPA would be the criterion variable or Y.

The key to assessing whether a linear regression works well is the difference between our observed (known) Y values and our predicted Ŷ values. As mentioned in passing above, we use subtraction to find the difference between them (Y – Ŷ) in the same way we use subtraction for deviation scores and sums of squares. The value (Y – Ŷ) is our residual, which, as defined above, measures how far our line of best fit is from our actual values. We can visualize residuals to get a better sense of what they are by creating a scatterplot and overlaying a line of best fit on it, as shown in Figure 1.

Figure 1. Scatterplot with residuals

In figure 1, the triangular dots represent observations from each person on both X and Y and the dashed bright red line is the line of best fit estimated by the equation Ŷ = a + bX. For every person in the dataset, the line represents their predicted score. The dark red brackets between the triangular dots and the predicted scores on the line of best fit are our residuals (they are only drawn for four observations for ease of viewing, but in reality there is one for every observation); you can see that some residuals are positive and some are negative, and that some are very large and some are very small. This means that some predictions are very accurate and some are very inaccurate, and that some predictions overestimated values and some underestimated values. Across the entire dataset, the line of best fit is the one that minimizes the total (sum) of the squared residuals. That is, although predictions at an individual level might be somewhat inaccurate, across our full sample and (theoretically) in future samples our total amount of error is as small as possible.

We call this property of the line of best fit the Least Squares Error Solution. This term means that the solution – or equation – of the line is the one that provides the smallest possible value of the squared errors (squared so that they can be summed, just like in standard deviation) relative to any other straight line we could draw through the data.
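The least squares property can be checked numerically. The sketch below uses made-up data (not from this chapter) and a hypothetical helper, `sum_squared_errors`; it fits the line, then nudges the intercept and slope in several directions and confirms that every nudged line has a larger sum of squared errors.

```python
def sum_squared_errors(a, b, xs, ys):
    """Sum of squared residuals for the line Y-hat = a + b*X (hypothetical helper)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data (illustration only)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least squares estimates of the slope and intercept for these data
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
ss_x = sum((x - mean_x) ** 2 for x in xs)
b = sp / ss_x
a = mean_y - b * mean_x

best = sum_squared_errors(a, b, xs, ys)

# Shift the intercept and/or slope by small amounts and recompute the error
nudges = [0.0, -0.5, -0.1, 0.1, 0.5]
others = [
    sum_squared_errors(a + da, b + db, xs, ys)
    for da in nudges
    for db in nudges
    if (da, db) != (0.0, 0.0)
]

print(f"line of best fit: {best:.2f}")
print(f"smallest error among nudged lines: {min(others):.2f}")
```

A finite set of nudges is only a spot check, not a proof, but it shows the idea: any other straight line through these points does worse.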

Predicting Scores and Explaining Variance

We have now seen that the purpose of regression is twofold: we want to predict scores based on our line and, as stated earlier, explain variance in our observed Y variable just like in ANOVA. These two purposes go hand in hand, and our ability to predict scores is literally our ability to explain variance. That is, if we cannot account for the variance in Y based on X, then we have no reason to use X to predict future values of Y.

We know that the overall variance in Y is a function of each score deviating from the mean of Y (as in our calculation of variance and standard deviation). So, just like the red brackets in figure 1 representing residuals, given as (Y – Ŷ), we can visualize the overall variance as each score’s distance from the overall mean of Y, given as (Y – ̅Y), our normal deviation score. This is shown in figure 2.

Figure 2. Scatterplot with residuals and deviation scores.

In figure 2, the solid blue line is the mean of Y, and the blue brackets are the deviation scores between our observed values of Y and the mean of Y. This represents the overall variance that we are trying to explain. Thus, the residuals and the deviation scores are the same type of idea: the distance between an observed score and a given line, either the line of best fit that gives predictions or the line representing the mean that serves as a baseline. The difference between these two values, which is the distance between the lines themselves, is our model’s ability to predict scores above and beyond the baseline mean; that is, it is our model’s ability to explain the variance we observe in Y based on values of X. If we have no ability to explain variance, then our line will be flat (the slope will be 0.00) and will be the same as the line representing the mean, and the distance between the lines will be 0.00 as well.

We now have three pieces of information: the distance from the observed score to the mean, the distance from the observed score to the prediction line, and the distance from the prediction line to the mean. These are our three pieces of information needed to test our hypotheses about regression and to calculate effect sizes. They are our three Sums of Squares, just like in ANOVA. Our distance from the observed score to the mean is the Sum of Squares Total, which we are trying to explain. Our distance from the observed score to the prediction line is our Sum of Squares Error, or residual, which we are trying to minimize. Our distance from the prediction line to the mean is our Sum of Squares Model, which is our observed effect and our ability to explain variance. Each of these will go into the ANOVA table to calculate our test statistic.
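The way the three distances fit together can be written compactly. For a least squares line, the squared distances add up exactly, which is why the total can be split into a part the model explains and a part it does not (the subscripts T, M, and E are shorthand for Total, Model, and Error):

```latex
\[
\underbrace{\sum (Y - \bar{Y})^2}_{SS_T}
=
\underbrace{\sum (\hat{Y} - \bar{Y})^2}_{SS_M}
+
\underbrace{\sum (Y - \hat{Y})^2}_{SS_E}
\]
```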

ANOVA Table

Our ANOVA table in regression follows the exact same format as it did for ANOVA (hence the name). Our top row is our observed effect, our middle row is our error, and our bottom row is our total. The columns take on the same interpretations as well: from left to right, we have our sums of squares, our degrees of freedom, our mean squares, and our F statistic.

Source	SS	df	MS	F
Model	∑(Ŷ − ̅Y)²	1	SS_M/df_M	MS_M/MS_E
Error	∑(Y − Ŷ)²	N − 2	SS_E/df_E
Total	∑(Y − ̅Y)²	N − 1

As with ANOVA, getting the values for the SS column is a straightforward but somewhat arduous process. First, you take the raw scores of X and Y and calculate the means, variances, and covariance using the sum of products table introduced in our chapter on correlations. Next, you use the variance of X and the covariance of X and Y to calculate the slope of the line, b, the formula for which is given above. After that, you use the means and the slope to find the intercept, a, which is given alongside b. After that, you use the full prediction equation for the line of best fit to get predicted Y scores (Ŷ) for each person. Finally, you use the observed Y scores, predicted Y scores, and mean of Y to find the appropriate deviation scores for each person for each sum of squares source in the table and sum them to get the Sum of Squares Model, Sum of Squares Error, and Sum of Squares Total. As with ANOVA, you won’t be required to compute the SS values by hand, but you will need to know what they represent and how they fit together. The other columns in the ANOVA table are all familiar. The degrees of freedom column still has N – 1 for our total, but now we have N – 2 for our error degrees of freedom and 1 for our model degrees of freedom; this is because simple linear regression only has one predictor, so our degrees of freedom for the model is always 1 and does not change. The total degrees of freedom must still be the sum of the other two, so our degrees of freedom error will always be N – 2 for simple linear regression. The mean square columns are still the SS column divided by the df column, and the test statistic F is still the ratio of the mean squares. Based on this, it is now explicitly clear that not only do regression and ANOVA have the same goal but they are, in fact, the same analysis entirely. The only difference is the type of data we feed into the predictor side of the equations: continuous for regression and categorical for ANOVA.

Hypothesis Testing in Regression

Regression, like all other analyses, will test a null hypothesis in our data. In regression, we are interested in predicting Y scores and explaining variance using a line, the slope of which is what allows us to get closer to our observed scores than the mean of Y can. Thus, our hypotheses concern the slope of the line, which is estimated in the prediction equation by b. Specifically, we want to test that the slope is not zero:

H₀: There is no explanatory relation between our variables, H₀: β = 0

HA: There is an explanatory relation between our variables, HA: β ≠ 0

or if directional – specify direction for relation (positive or negative), HA: β > 0, HA: β < 0

A non-zero slope indicates that we can explain values in Y based on X and therefore predict future values of Y based on X. Our alternative hypotheses are analogous to those in correlation: positive relations have values above zero, negative relations have values below zero, and two-tailed tests are possible. Just like ANOVA, we will test the significance of this relation using the F statistic calculated in our ANOVA table compared to a critical value from the F distribution table. Let’s take a look at an example of regression in action.

Example: Happiness and Well-Being

Researchers are interested in explaining differences in how happy people are based on how healthy people are. They gather data on each of these variables from 18 people and fit a linear regression model to explain the variance. We will follow the four-step hypothesis testing procedure to see if there is a relation between these variables that is statistically significant.

Step 1: State the Hypotheses

The null hypothesis in regression states that there is no relation between our variables. The alternative states that there is a relation, but because our research description did not explicitly state a direction of the relation, we will use a non-directional hypothesis.

H₀: There is no explanatory relation between health and happiness, H₀: β = 0

HA: There is an explanatory relation between health and happiness, HA: β ≠ 0

Step 2: Find the Critical Value

Because regression and ANOVA are the same analysis, our critical value for regression will come from the same place: the F distribution table, which uses two types of degrees of freedom. We saw above that the degrees of freedom for our numerator – the Model line – is always 1 in simple linear regression, and that the denominator degrees of freedom – from the Error line – is N – 2. In this instance, we have 18 people so our degrees of freedom for the denominator is 16. Going to our F table, we find that the appropriate critical value for 1 and 16 degrees of freedom is F* = 4.49, shown below in figure 3.

Figure 3. Critical value from F distribution table

Step 3: Calculate the Test Statistic

The process of calculating the test statistic for regression first involves computing the parameter estimates for the line of best fit. To do this, we first calculate the means, standard deviations, and sum of products for our X and Y variables, as shown below.

X	(X − ̅X)	(X − ̅X)²	Y	(Y − ̅Y)	(Y − ̅Y)²	(X − ̅X)(Y − ̅Y)
17.65	-2.13	4.53	10.36	-7.10	50.37	15.10
16.99	-2.79	7.80	16.38	-1.08	1.16	3.01
18.30	-1.48	2.18	15.23	-2.23	4.97	3.29
18.28	-1.50	2.25	14.26	-3.19	10.18	4.79
21.89	2.11	4.47	17.71	0.26	0.07	0.55
22.61	2.83	8.01	16.47	-0.98	0.97	-2.79
17.42	-2.36	5.57	16.89	-0.56	0.32	1.33
20.35	0.57	0.32	18.74	1.29	1.66	0.73
18.89	-0.89	0.79	21.96	4.50	20.26	-4.00
18.63	-1.15	1.32	17.57	0.11	0.01	-0.13
19.67	-0.11	0.01	18.12	0.66	0.44	-0.08
18.39	-1.39	1.94	12.08	-5.37	28.87	7.48
22.48	2.71	7.32	17.11	-0.34	0.12	-0.93
23.25	3.47	12.07	21.66	4.21	17.73	14.63
19.91	0.13	0.02	17.86	0.40	0.16	0.05
18.21	-1.57	2.45	18.49	1.03	1.07	-1.62
23.65	3.87	14.99	22.13	4.67	21.82	18.08
19.45	-0.33	0.11	21.17	3.72	13.82	-1.22
totals/∑
356.02	0.00	76.14	314.18	0.00	173.99	58.29

From the raw data in our X and Y columns, we find that the means are ̅X = 19.78 and ̅Y = 17.45. The deviation scores for each variable sum to zero, so all is well there. The sums of squares for X and Y ultimately lead us to standard deviations of Sx = 2.12 and Sy = 3.20. Finally, our sum of products is 58.29, which gives us a covariance of cov_XY = 3.43, so we know our relation will be positive. This is all the information we need for our equations for the line of best fit.
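These summary values can be reproduced from the raw scores. The sketch below is one way to do so in Python; the variable names are illustrative, and because it starts from the scores as printed in the table (rounded to two decimals), its results may differ from the values above in the last decimal place.

```python
from math import sqrt

# Health (X) and happiness (Y) scores as printed in the table above
health = [17.65, 16.99, 18.30, 18.28, 21.89, 22.61, 17.42, 20.35, 18.89,
          18.63, 19.67, 18.39, 22.48, 23.25, 19.91, 18.21, 23.65, 19.45]
happiness = [10.36, 16.38, 15.23, 14.26, 17.71, 16.47, 16.89, 18.74, 21.96,
             17.57, 18.12, 12.08, 17.11, 21.66, 17.86, 18.49, 22.13, 21.17]

n = len(health)
mean_x = sum(health) / n
mean_y = sum(happiness) / n

# Sums of squared deviations and sum of products of deviations
ss_x = sum((x - mean_x) ** 2 for x in health)
ss_y = sum((y - mean_y) ** 2 for y in happiness)
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(health, happiness))

# Sample standard deviations and covariance divide by N - 1
s_x = sqrt(ss_x / (n - 1))
s_y = sqrt(ss_y / (n - 1))
cov_xy = sp / (n - 1)

print(f"means: X = {mean_x:.2f}, Y = {mean_y:.2f}")
print(f"SS_X = {ss_x:.2f}, SS_Y = {ss_y:.2f}, SP = {sp:.2f}")
print(f"s_X = {s_x:.2f}, s_Y = {s_y:.2f}, cov_XY = {cov_xy:.2f}")
```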

First, we must calculate the slope of the line:

b = SP/SS_X = 58.29/76.14 = 0.77

This means that as X changes by 1 unit, Y will change by 0.77. In terms of our problem, as health increases by 1, happiness goes up by 0.77, which is a positive relation. Next, we use the slope, along with the means of each variable, to compute the intercept:

a = ̅Y − b̅X = 17.45 − (0.77 ∗ 19.78) = 17.45 − 15.23 = 2.22

For this particular problem (and most regressions), the intercept is not an important or interpretable value, so we will not read into it further.

Now that we have all of our parameters estimated, we can give the full equation for our line of best fit:

Ŷ = 2.22 + 0.77X
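For readers who want to check the parameter estimates with software, here is a minimal Python sketch that starts from the totals row of the sum of products table. The helper `predict_happiness` is a hypothetical name, and the health score of 20 is an arbitrary value chosen only to show a prediction.

```python
# Totals taken from the sum of products table for this example
n = 18
sum_x, sum_y = 356.02, 314.18
ss_x = 76.14   # sum of squared deviations for X
sp = 58.29     # sum of products of deviations

mean_x = sum_x / n
mean_y = sum_y / n

b = sp / ss_x            # slope
a = mean_y - b * mean_x  # intercept


def predict_happiness(health_score):
    """Predicted happiness (Y-hat) for a given health score (hypothetical helper)."""
    return a + b * health_score


print(f"slope b = {b:.2f}")
print(f"intercept a = {a:.2f}")
print(f"predicted happiness at health = 20: {predict_happiness(20):.2f}")
```

Because this sketch carries the slope at full precision rather than rounding it to 0.77 first, its intercept comes out near 2.31; a hand calculation that rounds the slope before multiplying will land slightly elsewhere. Note also that the line always passes through the point (̅X, ̅Y), so the prediction at the mean of X is the mean of Y.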

We can plot this relation in a scatterplot and overlay our line onto it, as shown in figure 4.

Figure 4. Health and happiness data and line.

We can use the line equation to find predicted values for each observation and use them to calculate our sums of squares model and error, but this is tedious to do by hand, so we will let the computer software do the heavy lifting in that column of our ANOVA table:

Source	SS	df	MS	F
Model	44.62
Error	129.37
Total

Now that we have these, we can fill in the rest of the ANOVA table. We already found our degrees of freedom in Step 2:

Source	SS	df	MS	F
Model	44.62	1
Error	129.37	16
Total

Our total line is always the sum of the other two lines, giving us:

Source	SS	df	MS	F
Model	44.62	1
Error	129.37	16
Total	173.99	17

Our mean squares column is only calculated for the model and error lines and is always our SS divided by our df, which is:

Source	SS	df	MS	F
Model	44.62	1	44.62
Error	129.37	16	8.09
Total	173.99	17

Finally, our F statistic is the ratio of the mean squares:

Source	SS	df	MS	F
Model	44.62	1	44.62	5.52
Error	129.37	16	8.09
Total	173.99	17

This gives us an obtained F statistic of 5.52, which we will now use to test our hypothesis.
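The table-filling steps above can be sketched in a few lines of Python. The function name `anova_table` and the dictionary layout are illustrative choices, not part of any particular statistics package; the inputs are the two sums of squares and the sample size from this example.

```python
def anova_table(ss_model, ss_error, n):
    """Fill in df, MS, and F for a simple linear regression (hypothetical helper)."""
    df_model = 1        # one predictor in simple linear regression
    df_error = n - 2
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "Model": {"SS": ss_model, "df": df_model, "MS": ms_model,
                  "F": ms_model / ms_error},
        "Error": {"SS": ss_error, "df": df_error, "MS": ms_error},
        # The total row is the sum of the other two rows
        "Total": {"SS": ss_model + ss_error, "df": df_model + df_error},
    }


table = anova_table(ss_model=44.62, ss_error=129.37, n=18)

for source, row in table.items():
    cells = "  ".join(f"{name}={value:.2f}" for name, value in row.items())
    print(f"{source:<6} {cells}")
```

Rounded to two decimals, this reproduces the mean squares of 44.62 and 8.09 and the F statistic of 5.52 shown in the completed table.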

Step 4: Make the Decision

We now have everything we need to make our final decision. Our obtained test statistic was F = 5.52 and our critical value was F* = 4.49. Since our obtained test statistic is greater than our critical value, we can reject the null hypothesis.

Reject H₀. Based on our sample of 18 people, we can predict levels of happiness based on how healthy someone is, F(1,16) = 5.52, p < .05.

Effect Size

We know that, because we rejected the null hypothesis, we should calculate an effect size. In regression, our effect size is variance explained, just like it was in ANOVA. Instead of using η² to represent this, we use R², as we saw in correlation (yet more evidence that all of these are the same analysis).

From the example above, we get R² = .26. We are explaining 26% of the variance in happiness based on health, which is a large effect size (R² uses the same effect size cutoffs as η²).
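The figure quoted here can be traced back to the ANOVA table. Assuming the usual definition of R² as the share of the total sum of squares captured by the model (the subscripts M and T stand for Model and Total), the arithmetic is roughly:

```latex
\[
R^2 = \frac{SS_M}{SS_T} = \frac{44.62}{173.99} \approx .26
\]
```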

Accuracy in Prediction

We found a large, statistically significant relation between our variables, which is what we hoped for. However, if we want to use our estimated line of best fit for future prediction, we will also want to know how precise or accurate our predicted values are. What we want to know is the average distance from our predictions to our actual observed values, or the average size of the residual (Y − Ŷ). The average size of the residual is known by a specific name: the standard error of the estimate s(Y− Ŷ). The formula is almost identical to our standard deviation formula, and it follows the same logic. For our example, s(Y− Ŷ) = 2.84. So on average, our predictions are just under 3 points away from our actual values. There are no specific cutoffs or guidelines for how big our standard error of the estimate can or should be; it is highly dependent on both our sample size and the scale of our original Y variable, so expert judgment should be used. In this case, the estimate is not that far off and can be considered reasonably precise.
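The formula referred to here is not reproduced above, so the following is a sketch under the standard definition: the sum of squared residuals divided by the error degrees of freedom, then square-rooted. With the error sum of squares of 129.37 and N = 18 from the ANOVA table, it reproduces the value reported for this example:

```latex
\[
s_{(Y - \hat{Y})}
= \sqrt{\frac{\sum (Y - \hat{Y})^2}{N - 2}}
= \sqrt{\frac{129.37}{16}}
\approx 2.84
\]
```

The only changes from the standard deviation formula are that deviations are taken from the line rather than from the mean, and that the denominator is N − 2 rather than N − 1.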

Quick recap of regression (without the math)

Two variables of regression

1. Predictor (X)

2. Criterion (Y)

With correlation it did not matter which variable was the predictor variable or the criterion variable. But with prediction we have to decide which variable is being predicted from and which variable is being predicted. The variable being predicted from is called the predictor variable. The variable being predicted is called the criterion variable. In equations, the predictor variable is usually labeled X, and the criterion is labeled Y.

The Linear Prediction Rule: Ideally we want to make a prediction rule that is both simple and depends on every case for each prediction. In a linear prediction rule the formal name for the baseline number is the regression constant or just constant. It has the name constant because it is a fixed value that we always add in to the prediction.

The number we multiplied by the person’s score on the predictor variable, b, is called the regression coefficient because a “coefficient” is a number we multiply by something.

Let’s revisit example 2, predicting college GPA from SAT scores. For our SAT and GPA example, the rule might be “to predict a person’s graduating GPA, start with .3 and add the result of multiplying .004 by the person’s SAT score”. So, the baseline number (a) would be .3 and the regression coefficient (b) would be .004. If a person had an SAT of 600 we would predict the person would graduate with a GPA of 2.7. This idea is known as the linear prediction rule. Lows go with lows and highs with highs, or lows with highs and highs with lows.
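The SAT rule can be written as a one-line function. This is only a sketch: the name `predict_gpa` is hypothetical, and the constant and coefficient are the illustrative values from the rule quoted above, not estimates from real data.

```python
def predict_gpa(sat_score, constant=0.3, coefficient=0.004):
    """Linear prediction rule: regression constant + regression coefficient * SAT."""
    return constant + coefficient * sat_score


# A person with an SAT score of 600
print(f"{predict_gpa(600):.1f}")
```

Changing `constant` shifts every prediction by the same amount, while changing `coefficient` alters how much each additional SAT point is worth on the GPA scale.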

Criterion Variable (Ŷ)

The variable we are predicting in a regression equation is called the criterion variable. Its predicted value is labeled Ŷ. The mark above the Y (the “hat”) indicates that this is a predicted value that depends on the value of X.

Slope of the Regression Line (b)

The steepness of the angle of the regression line, called its slope, is the amount that the line moves up for every unit it is moved across. In our SAT example the line moves up .004 on the GPA scale for every additional point on the SAT. In fact, the slope of the line is exactly b, the regression coefficient.

Intercept of the Regression Line (a)

The point at which the regression line crosses or intersects the vertical axis is called the intercept.

The intercept is the predicted score on the criterion variable when the score on the predictor variable is 0. It turns out that the intercept is the same as the regression constant.
The reason this works is the regression constant is the number we always add in – a kind of baseline number, the number we start with.
It is reasonable that the best baseline number would be the number we predict from a predictor score of 0.

In the SAT example, the line crosses the vertical axis at .3. That is, when a person has an SAT score of zero, they are predicted to have a college GPA of .3.

Linear regression standardized coefficient (β)

Standardized regression coefficient (β): β = b(s_X/s_Y)

This formula has the effect of changing the regular (unstandardized) regression coefficient (b), to a standardized regression coefficient (β) that shows the relationship between the predictor and criterion variables in terms of standard deviation units.
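As a worked illustration, assuming the common form of the conversion, in which b is multiplied by the ratio of the standard deviation of X to the standard deviation of Y, the health and happiness example from earlier in the chapter gives roughly:

```latex
\[
\beta = b \cdot \frac{s_X}{s_Y} = 0.77 \times \frac{2.12}{3.20} \approx 0.51
\]
```

That is, a one standard deviation increase in health goes with about half a standard deviation increase in predicted happiness. In simple linear regression this value equals the correlation r, which squares to the R² of about .26 reported for the example.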

Multiple Regression and Other Extensions

Simple linear regression as presented here is only a stepping stone towards an entire field of research and application. Regression is an incredibly flexible and powerful tool, and the extensions and variations on it are far beyond the scope of this chapter (indeed, even entire books struggle to accommodate all possible applications of the simple principles laid out here). The next step in regression is to study multiple regression, which uses multiple X variables as predictors for a single Y variable at the same time. The math of multiple regression is very complex but the logic is the same: we are trying to use variables that are statistically significantly related to our outcome to explain the variance we observe in that outcome. Other forms of regression include curvilinear models that can explain curves in the data rather than the straight lines used here, as well as moderation models that change the relation between two variables based on levels of a third. The possibilities are truly endless and offer a lifetime of discovery.

Learning Objectives

Having read this chapter, a student should be able to:

Explain the concept of a linear equation, including slope and intercept
Explain how regression is related to correlation and ANOVA
Understand the concept of least-square solution
Understand the concept of multiple regression

Exercises – Ch. 17

1. How are ANOVA and linear regression similar? How are they different?
2. What is a residual?
3. How are correlation and regression similar? How are they different?
4. What are the two parameters of the line of best fit, and what do they represent?
5. What is our criterion for finding the line of best fit?
6. Fill out the rest of the ANOVA tables below for simple linear regressions: a.

Source	SS	df	MS	F
Model	34.21	1	34.21
Error
Total	66.12	54

7. In chapter 15, we found a statistically significant correlation between overall performance in class and how much time someone studied. Use the summary statistics calculated in that problem (provided here) to compute a line of best fit predicting success from study times: ̅X = 1.61, s_X = 1.12, ̅Y = 2.95, s_Y = 0.99, r = 0.65.

8. Using the line of best fit equation created in problem 7, predict the scores for how successful people will be based on how much they study:

a. X = 1.20

b. X = 3.33

c. X = 0.71

d. X = 4.00

9. You have become suspicious that the draft rankings of your fantasy football league have no predictive value for how teams place at the end of the season. You go back to historical league data and find rankings of teams after the draft and at the end of the season (below) to test for a statistically significant predictive relation. Assume SSM = 2.65 and SSE = 337.35

Draft Projection	Final Rankings
1	14
2	6
3	8
4	13
5	2
6	15
7	4
8	10
9	11
10	16
11	9
12	7
13	14
14	12
15	1
16	5

10. You have summary data for two variables: how extroverted someone is (X) and how often someone volunteers (Y). Using these values, calculate the line of best fit predicting volunteering from extroversion, then test for a statistically significant relation using the hypothesis testing procedure: ̅X = 12.58, s_X = 4.65, ̅Y = 7.44, s_Y = 2.12, r = 0.34, N = 67, SSM = 19.79, SSE = 215.77.

Answers to Odd-Numbered Exercises – Ch. 17

1. ANOVA and simple linear regression both take the total observed variance and partition it into pieces that we can explain and cannot explain and use the ratio of those pieces to test for significant relations. They are different in that ANOVA uses a categorical variable as a predictor whereas linear regression uses a continuous variable.

3. Correlation and regression both involve taking two continuous variables and finding a linear relation between them. Correlations find a standardized value describing the direction and magnitude of the relation whereas regression finds the line of best fit and uses it to partition and explain variance.

5. Least Squares Error Solution; the line that minimizes the total amount of residual error in the dataset.

7. b = r*(s_y/s_x) = 0.65*(0.99/1.12) = 0.57; a = ̅Y – b̅X = 2.95 – (0.57*1.61) = 2.03; Ŷ = 2.03 + 0.57X

9. Step 1: H₀: β = 0 “There is no predictive relation between draft rankings and final rankings in fantasy football,” H_A: β ≠ 0, “There is a predictive relation between draft rankings and final rankings in fantasy football.”

Step 2: Our model will have 1 (based on the number of predictors) and 14 (based on how many observations we have) degrees of freedom, giving us a critical value of F* = 4.60.

Step 3: Using the sum of products table, we find : ̅X = 8.50, ̅Y = 8.50, SS_X = 339.86, SP = 29.99, giving us a line of best fit of: b = 29.99/339.86 = 0.09; a = 8.50 – 0.09*8.50 = 7.74; Ŷ = 7.74 + 0.09X.

Our given SS values and our df from step 2 allow us to fill in the ANOVA table:

Source	SS	df	MS	F
Model	2.65	1	2.65	0.11
Error	337.35	14	24.10
Total	340.00	15

Step 4: Our obtained value was smaller than our critical value, so we fail to reject the null hypothesis. There is no evidence to suggest that draft rankings have any predictive value for final fantasy football rankings, F(1,14) = 0.11, p > .05.

License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License