Solving the regression equation

We have seen from the previous page that the observed values of y will generally not lie exactly on the regression line. The general regression equation can be written as yl = a + b x, where yl is the value of y predicted by the line for a given x. In order to predict a variable, we need to find the mean, variance, deviation and standard deviation of the values of x and y. The mean, variance, deviation and standard deviation are explained in data analysis.

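The table below lists, for each previous contract, the deviation of y from its mean, the square of that deviation, the value yl on the regression line, and the deviation of y from the regression line. The last row gives the average of each column.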


Estimated cost of previous contracts

| Number of employees, x | Total cost of contract, y (in $1000s) | Deviation from mean, y - mean of y | Square of deviation, (y - mean of y)^2 | Regression line, yl | Deviation from regression line, y - yl |
|---|---|---|---|---|---|
| 10 | 13 | 5.5 | 30.25 | 12.97 | 0.03 |
| 9 | 11 | 3.5 | 12.25 | 11.752 | -0.752 |
| 8 | 12 | 4.5 | 20.25 | 10.534 | 1.466 |
| 7 | 10 | 2.5 | 6.25 | 9.316 | 0.684 |
| 6 | 8 | 0.5 | 0.25 | 8.098 | -0.098 |
| 5 | 6 | -1.5 | 2.25 | 6.88 | -0.88 |
| 4 | 5 | -2.5 | 6.25 | 5.662 | -0.662 |
| 3 | 3 | -4.5 | 20.25 | 4.444 | -1.44 |
| 2 | 4 | -3.5 | 12.25 | 3.226 | 0.774 |
| 1 | 3 | -4.5 | 20.25 | 2.008 | 0.992 |
| Average = 5.5 | Average = 7.5 | Average = 0 | Average = 13.05 | Average = 7.5 | Average = 0 |

Do you see that the sum of all the deviations of a variable from its mean is 0?
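The zero sum can be checked numerically. The following is a minimal Python sketch using the y values from the table; the variable names are illustrative, and it assumes the population form of the variance (dividing by the number of values, 10), since that is how the table averages the squared deviations.

```python
from statistics import mean, pstdev

# y values (total contract cost in $1000s) copied from the table
y = [13, 11, 12, 10, 8, 6, 5, 3, 4, 3]

y_mean = mean(y)                            # mean of y
deviations = [yi - y_mean for yi in y]      # y - mean of y
squared = [d ** 2 for d in deviations]      # (y - mean of y)^2

print("mean of y:", y_mean)
print("sum of deviations:", sum(deviations))
# Assumption: variance taken as the average squared deviation (divide by n)
print("average squared deviation:", mean(squared))
print("standard deviation:", pstdev(y))
```

The deviations cancel because the mean is defined as the sum of the values divided by their count, so the positive and negative deviations always balance.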

What are the standard deviation and variance of y in the above example?



The constants a and b in the regression equation are called the regression coefficients. Their values can be found from the following two equations:
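A standard pair of equations for this purpose is the least-squares normal equations. Assuming these are the ones meant here, with n the number of data points and each sum taken over all data points, they can be written as:

```latex
\begin{aligned}
\sum y  &= n\,a + b \sum x \\
\sum xy &= a \sum x + b \sum x^{2}
\end{aligned}
\qquad\Longrightarrow\qquad
b = \frac{n \sum xy - \sum x \sum y}{n \sum x^{2} - \left(\sum x\right)^{2}},
\qquad
a = \bar{y} - b\,\bar{x}
```

Here \bar{x} and \bar{y} denote the means of x and y; the expressions on the right follow from solving the two equations on the left simultaneously.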



When the values of a and b are found, the regression equation can be written using these values. The regression line is the line of best fit for the data available to us. In other words, the total error, measured as the sum of the squares of the vertical distances of the points from the line, is smaller for this line than for any other straight line.
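As an illustration of the "best fit" property, here is a minimal Python sketch that fits the line to the table's data using the standard closed-form least-squares solution and then checks that shifting either coefficient increases the sum of squared errors. The function names fit_line and sum_squared_error are hypothetical helpers introduced for this example only.

```python
# Data from the table: x = number of employees, y = contract cost in $1000s
x = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
y = [13, 11, 12, 10, 8, 6, 5, 3, 4, 3]

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_mean) ** 2 for xi in xs)
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b

def sum_squared_error(a, b, xs, ys):
    """Sum of squared vertical distances from the points to y = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(xs, ys))

a, b = fit_line(x, y)
best = sum_squared_error(a, b, x, y)
print(f"a = {a:.3f}, b = {b:.3f}, sum of squared errors = {best:.3f}")

# Nudging either coefficient away from the fitted value increases the error
for da, db in [(0.1, 0), (-0.1, 0), (0, 0.05), (0, -0.05)]:
    assert sum_squared_error(a + da, b + db, x, y) > best
```

The four nudges are only a spot check, not a proof; small differences between the fitted values and the yl column of the table would come from rounding of the coefficients.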

What are the values of the regression coefficients a and b in this example?

What is the linear regression equation in the above example?