We saw on the previous page that the observed values of y do not generally fall exactly on the regression line. The general regression equation can be written as yl = a + b x, where yl is the value of y that the line predicts for a given x. To predict a variable, we need the mean, variance, deviation and standard deviation of the values of x and y. These quantities have been explained in data analysis.
The table below shows the deviations of y from its mean and from the regression line, together with the average of each column.
| Number of employees, x | Total cost of contract, y (in $1000s) | Deviation from mean, y - mean of y | Square of deviation, (y - mean of y)² | Regression line, yl | Deviation from regression line, y - yl |
|---|---|---|---|---|---|
| 10 | 13 | 5.5 | 30.25 | 12.97 | 0.03 |
| 9 | 11 | 3.5 | 12.25 | 11.752 | -0.752 |
| 8 | 12 | 4.5 | 20.25 | 10.534 | 1.466 |
| 7 | 10 | 2.5 | 6.25 | 9.316 | 0.684 |
| 6 | 8 | 0.5 | 0.25 | 8.098 | -0.098 |
| 5 | 6 | -1.5 | 2.25 | 6.88 | -0.88 |
| 4 | 5 | -2.5 | 6.25 | 5.662 | -0.662 |
| 3 | 3 | -4.5 | 20.25 | 4.444 | -1.444 |
| 2 | 4 | -3.5 | 12.25 | 3.226 | 0.774 |
| 1 | 3 | -4.5 | 20.25 | 2.008 | 0.992 |
| Average = 5.5 | Average = 7.5 (mean) | Average = 0 | Average = 13.05 (variance) | Average = 7.5 | Average = 0 |
Do you see that the sum of all the deviations of a variable from its mean is 0?
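The first four columns of the table can be checked with a short Python sketch. The variable names are illustrative, and the variance is assumed to be the population variance (dividing by n rather than n - 1), which is what the table's average of 13.05 corresponds to.

```python
import math

# Data from the table: number of employees (x) and contract cost in $1000s (y).
x = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
y = [13, 11, 12, 10, 8, 6, 5, 3, 4, 3]
n = len(y)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Deviation of each y from the mean of y, and the square of that deviation.
deviations = [yi - y_mean for yi in y]
squared_deviations = [d ** 2 for d in deviations]

# The deviations from the mean always sum to zero.
sum_of_deviations = sum(deviations)

# Population variance: the average of the squared deviations (assumption: divide by n).
variance = sum(squared_deviations) / n
std_dev = math.sqrt(variance)

print(f"mean of x = {x_mean}, mean of y = {y_mean}")
print(f"sum of deviations = {sum_of_deviations}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.3f}")
```

The positive and negative deviations cancel exactly, which is why the average of the third column is 0.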
The constants a and b in the regression equation are called the regression coefficients. Their values can be found from the following two equations:
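Assuming the usual least-squares fit over n data points, the two equations take the form of the normal equations shown here. This is the standard textbook form, and the notation is an assumption.

$$\sum y = n\,a + b \sum x$$

$$\sum xy = a \sum x + b \sum x^2$$

For the data in the table, n = 10, Σx = 55, Σy = 75, Σxy = 513 and Σx² = 385, so the equations become 75 = 10a + 55b and 513 = 55a + 385b. Solving them gives b = 1005/825 ≈ 1.218 and a = 0.8.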


When the values of a and b are found, the regression equation can be written using these values. The regression line is the line of best fit for the data available to us. In other words, if the error of each point is its vertical distance from the line, the sum of the squared errors is smaller for this line than for any other straight line.
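As a minimal sketch, the coefficients for the table's data can be computed with the standard least-squares formulas, b = Σ(x - mean of x)(y - mean of y) / Σ(x - mean of x)² and a = mean of y - b × mean of x. The variable names are illustrative. The regression column in the table is consistent with coefficients rounded to about a = 0.79 and b = 1.218, so the values below agree with it only to roughly two decimal places.

```python
# Data from the table: number of employees (x) and contract cost in $1000s (y).
x = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
y = [13, 11, 12, 10, 8, 6, 5, 3, 4, 3]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Sums of products of deviations (least-squares building blocks).
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)

b = s_xy / s_xx            # slope
a = y_mean - b * x_mean    # intercept

# Predicted values on the regression line and deviations from it.
predicted = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print(f"a = {a:.3f}, b = {b:.3f}")   # a = 0.800, b = 1.218
for xi, yi, pi, ri in zip(x, y, predicted, residuals):
    print(f"x = {xi:2d}  y = {yi:2d}  yl = {pi:7.3f}  y - yl = {ri:7.3f}")
```

With unrounded coefficients, the deviations from the regression line sum to zero up to floating-point error, which matches the average of 0 in the last column of the table.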