The steps involved in conducting multiple regression analysis are similar to those for bivariate regression analysis. The discussion focuses on partial regression coefficients, strength of association, significance testing, and examination of residuals.
Partial Regression Coefficients
To understand the meaning of a partial regression coefficient, let us consider a case in which there are two independent variables, so that:
$$\hat{Y} = a + b_1 X_1 + b_2 X_2$$
First, note that the relative magnitude of the partial regression coefficient of an independent variable is, in general, different from that of its bivariate regression coefficient. In other words, the partial regression coefficient, $b_1$, will be different from the regression coefficient, $b$, obtained by regressing $Y$ on only $X_1$. This happens because $X_1$ and $X_2$ are usually correlated. In bivariate regression, $X_2$ was not considered, and any variation in $Y$ that was shared by $X_1$ and $X_2$ was attributed to $X_1$. However, in the case of multiple independent variables, this is no longer true.
Conceptually, the relationship between the bivariate regression coefficient and the partial regression coefficient can be illustrated as follows. Suppose one were to remove the effect of $X_2$ from $X_1$. This could be done by running a regression of $X_1$ on $X_2$. In other words, one would estimate the equation $\hat{X}_1 = a + bX_2$ and calculate the residual $X_r = (X_1 - \hat{X}_1)$. The partial regression coefficient, $b_1$, is then equal to the bivariate regression coefficient, $b_r$, obtained from the equation $\hat{Y} = a + b_r X_r$. In other words, the partial regression coefficient, $b_1$, is equal to the regression coefficient, $b_r$, between $Y$ and the residuals of $X_1$ from which the effect of $X_2$ has been removed. The partial coefficient, $b_2$, can also be interpreted along similar lines.
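The residual-based interpretation can be checked numerically. The following is a minimal sketch using NumPy on synthetic data (the data, the seed, and the helper name `ols` are illustrative assumptions, not the Table 17.1 data): it compares the bivariate coefficient, the partial coefficient, and the coefficient obtained by regressing Y on the residual of X1.

```python
import numpy as np

def ols(y, *predictors):
    """Least-squares fit of y on an intercept plus the given predictors.
    Returns the coefficient vector [a, b1, b2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic, illustrative data: x1 and x2 are deliberately correlated.
rng = np.random.default_rng(0)
x2 = rng.normal(size=200)
x1 = 0.6 * x2 + rng.normal(size=200)
y = 1.0 + 0.5 * x1 + 0.3 * x2 + rng.normal(scale=0.5, size=200)

b_bivariate = ols(y, x1)[1]      # Y regressed on X1 only
b1_partial = ols(y, x1, x2)[1]   # partial coefficient of X1, with X2 in the model

# Remove the effect of X2 from X1, then regress Y on the residual X_r.
a, b = ols(x1, x2)
x_r = x1 - (a + b * x2)
b_r = ols(y, x_r)[1]

print(f"bivariate b = {b_bivariate:.4f}")
print(f"partial b1  = {b1_partial:.4f}")
print(f"b_r         = {b_r:.4f}")
```

In this sketch the last two values agree up to floating-point error, whereas the bivariate coefficient differs because part of the variation shared by X1 and X2 is attributed to X1.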
The beta coefficients are the partial regression coefficients obtained when all the variables ($Y, X_1, X_2, \ldots, X_k$) have been standardized to a mean of 0 and a variance of 1 before estimating the regression equation. The relationship of the standardized to the nonstandardized coefficients remains the same as before:
$$B_1 = b_1\left(\frac{s_{x_1}}{s_y}\right), \quad \ldots, \quad B_k = b_k\left(\frac{s_{x_k}}{s_y}\right)$$
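As a rough check of how standardized and nonstandardized coefficients relate, the sketch below fits the same model twice, once on raw variables and once on standardized variables. The data are synthetic and the helper names `ols` and `standardize` are assumptions made for illustration.

```python
import numpy as np

def ols(y, X):
    """Return [intercept, b1, ..., bk] from a least-squares fit of y on X."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def standardize(v):
    """Rescale to mean 0 and variance 1 (sample standard deviation, column-wise)."""
    return (v - v.mean(axis=0)) / v.std(axis=0, ddof=1)

# Synthetic, illustrative data only; the two predictors have different scales.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2)) * [4.0, 2.5]
y = 2.0 + 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=100)

b = ols(y, X)[1:]                                 # nonstandardized partial coefficients
betas = ols(standardize(y), standardize(X))[1:]   # coefficients after standardizing

# Rescaling check: beta_j = b_j * (s_xj / s_y)
rescaled = b * X.std(axis=0, ddof=1) / y.std(ddof=1)
print("betas from standardized fit:", betas)
print("b rescaled by s_x / s_y:    ", rescaled)
```

The intercept of the standardized fit is zero up to rounding, which is why beta coefficients are usually reported without one.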
The intercept and the partial regression coefficients are estimated by solving a system of simultaneous equations derived by differentiating the sum of squared errors with respect to each coefficient and equating the partial derivatives to 0.
Because these coefficients are automatically estimated by the various computer programs, we will not present the details. Yet it is worth noting that the equations cannot be solved if (1) the sample size, n, is smaller than or equal to the number of independent variables, k; or (2) one independent variable is perfectly correlated with another.
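The two failure conditions can be seen directly in the simultaneous (normal) equations. The sketch below is one possible illustration on synthetic data; the function name `solve_normal_equations` and the rank check are assumptions for this example, not how any particular statistics package is implemented.

```python
import numpy as np

def solve_normal_equations(X, y):
    """Solve (X'X) coef = X'y, where X already contains a column of ones.
    Raises ValueError when the columns of X are linearly dependent."""
    if np.linalg.matrix_rank(X) < X.shape[1]:
        raise ValueError("X'X is singular: the coefficients are not uniquely determined")
    return np.linalg.solve(X.T @ X, X.T @ y)

# Synthetic, illustrative data with n = 12 observations and k = 2 predictors.
rng = np.random.default_rng(2)
n = 12
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 0.3 * x2 + rng.normal(scale=0.2, size=n)

ones = np.ones(n)
coef = solve_normal_equations(np.column_stack([ones, x1, x2]), y)
print("a, b1, b2 =", coef)

# Condition (2): the second predictor is an exact linear function of the first.
collinear = np.column_stack([ones, x1, 3.0 * x1 + 1.0])
# Condition (1): only n = 2 observations for k = 2 predictors.
too_few = np.column_stack([ones, x1, x2])[:2]

for label, X_bad in [("perfectly correlated predictors", collinear),
                     ("n <= k", too_few)]:
    try:
        solve_normal_equations(X_bad, y[: len(X_bad)])
    except ValueError as err:
        print(f"{label}: {err}")
```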
Suppose that, in explaining the attitude toward the city, we now introduce a second variable, importance attached to the weather. The data for the 12 pretest respondents on attitude toward the city, duration of residence, and importance attached to the weather are given in Table 17.1. The results of multiple regression analysis are depicted in Table 17.3. The partial regression coefficient for duration ($X_1$) is now 0.48108, different from what it was in the bivariate case. The corresponding beta coefficient is 0.7636. The partial regression coefficient for importance attached to weather ($X_2$) is 0.28865, with a beta coefficient of 0.3138. The estimated regression equation is:
$$\hat{Y} = 0.33732 + 0.48108X_1 + 0.28865X_2$$
or
Attitude = 0.33732 + 0.48108 (Duration) + 0.28865 (Importance)
This equation can be used for a variety of purposes, including predicting attitudes toward the city, given a knowledge of the respondents' duration of residence in the city and the importance they attach to weather. Note that both duration and importance are significant and useful in this prediction.
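As a small illustration of using the estimated equation for prediction, the sketch below plugs in the reported coefficients, taking the intercept as 0.33732. The respondent values (a duration of 10 and an importance rating of 3) are hypothetical inputs chosen for the example, and the function name `predict_attitude` is an assumption.

```python
# Coefficients as reported for the estimated regression equation.
INTERCEPT = 0.33732
B_DURATION = 0.48108
B_IMPORTANCE = 0.28865

def predict_attitude(duration, importance):
    """Predicted attitude toward the city from the estimated equation."""
    return INTERCEPT + B_DURATION * duration + B_IMPORTANCE * importance

# Hypothetical respondent, for illustration only.
example = predict_attitude(duration=10, importance=3)
print(f"predicted attitude: {example:.3f}")
```

Predictions of this kind are most trustworthy for duration and importance values within the range observed in the sample.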
Strength of Association
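The strength of the relationship stipulated by the regression equation can be determined by using appropriate measures of association. The total variation is decomposed as in the bivariate case:
$$SS_y = SS_{reg} + SS_{res}$$
where
$$SS_y = \sum_{i=1}^{n}(Y_i - \bar{Y})^2, \quad SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2, \quad SS_{res} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$$
The strength of association is measured by the square of the multiple correlation coefficient, $R^2$, which is also called the coefficient of multiple determination:
$$R^2 = \frac{SS_{reg}}{SS_y}$$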
The multiple correlation coefficient, $R$, can also be viewed as the simple correlation coefficient, $r$, between $Y$ and $\hat{Y}$. Several points about the characteristics of $R^2$ are worth noting. The coefficient of multiple determination, $R^2$, cannot be less than the highest bivariate $r^2$ of any individual independent variable with the dependent variable. $R^2$ will be larger when the correlations between the independent variables are low. If the independent variables are statistically independent (uncorrelated), then $R^2$ will be the sum of the bivariate $r^2$ of each independent variable with the dependent variable. $R^2$ cannot decrease as more independent variables are added to the regression equation. Yet diminishing returns set in, so that after the first few variables, the additional independent variables do not make much of a contribution. For this reason, $R^2$ is adjusted for the number of independent variables and the sample size by using the following formula:
$$\text{Adjusted } R^2 = R^2 - \frac{k(1 - R^2)}{n - k - 1}$$
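Several of these properties can be observed in a short numerical sketch. The example below uses synthetic data in which only the first two of four predictors are related to Y; the data, the seed, and the helper name `fit_r2` are illustrative assumptions.

```python
import numpy as np

def fit_r2(y, X):
    """Return (R^2, adjusted R^2, fitted values) for y regressed on the columns of X."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    y_hat = design @ coef
    ss_res = np.sum((y - y_hat) ** 2)
    ss_y = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_y                    # equals SS_reg / SS_y when an intercept is fitted
    adj_r2 = r2 - k * (1.0 - r2) / (n - k - 1)  # adjustment for k predictors and sample size n
    return r2, adj_r2, y_hat

# Synthetic, illustrative data: the last two columns of X are pure noise.
rng = np.random.default_rng(3)
n = 30
X = rng.normal(size=(n, 4))
y = 1.0 + 0.8 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(size=n)

# Add the predictors one at a time and track both measures.
results = [fit_r2(y, X[:, :k]) for k in range(1, 5)]
for k, (r2, adj, _) in enumerate(results, start=1):
    print(f"k={k}: R2={r2:.4f}  adjusted R2={adj:.4f}")

# R can also be obtained as the simple correlation between Y and the fitted values.
r2_full, _, y_hat_full = results[-1]
r_y_yhat = np.corrcoef(y, y_hat_full)[0, 1]
print(f"corr(Y, Y_hat)^2 = {r_y_yhat ** 2:.4f}")
```

R2 never falls as predictors are added, while the adjusted value can fall when an added predictor contributes little.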
Although a high $R^2$ and significant partial regression coefficients are comforting, the efficacy of the regression model should be evaluated further by an examination of the residuals.
Examination of Residuals
A residual is the difference between the observed value of $Y_i$ and the value predicted by the regression equation, $\hat{Y}_i$. Residuals are used in the calculation of several statistics associated with regression. In addition, scattergrams of the residuals, in which the residuals are plotted against the predicted values, $\hat{Y}_i$, time, or predictor variables, provide useful insights in examining the appropriateness of the underlying assumptions and the regression model fitted.
The assumption of a normally distributed error term can be examined by constructing a histogram of the standardized residuals. A visual check reveals whether the distribution is approximately normal. It is also useful to examine the normal probability plot of standardized residuals. The normal probability plot shows the observed standardized residuals compared to the expected standardized residuals from a normal distribution. If the observed residuals are normally distributed, they will fall on the 45-degree line. Also, look at the table of residual statistics and identify any standardized predicted values or standardized residuals that lie beyond plus or minus one or two standard deviations. The percentages falling within these limits can be compared with what would be expected under the normal distribution (68 percent and 95 percent, respectively). A more formal assessment can be made by running the Kolmogorov-Smirnov (K-S) one-sample test.
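The percentage comparison and the K-S distance can be sketched in a few lines. The example below is a rough illustration on simulated residuals; the function name `normality_summary` is an assumption, residuals are standardized here by their own sample standard deviation (statistical packages may divide by the standard error of the estimate instead), and because the mean and standard deviation are estimated from the same data, standard K-S critical values apply only approximately.

```python
import math
import numpy as np

def normality_summary(residuals):
    """Return (share within +/-1 SD, share within +/-2 SD, K-S distance)
    for the standardized residuals, compared with a standard normal."""
    e = np.asarray(residuals, dtype=float)
    z = (e - e.mean()) / e.std(ddof=1)

    within_1 = np.mean(np.abs(z) <= 1.0)   # about 0.68 expected under normality
    within_2 = np.mean(np.abs(z) <= 2.0)   # about 0.95 expected under normality

    # K-S distance: largest gap between the empirical CDF and the normal CDF.
    z_sorted = np.sort(z)
    n = len(z_sorted)
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z_sorted])
    gap_above = np.arange(1, n + 1) / n - cdf
    gap_below = cdf - np.arange(0, n) / n
    ks_distance = max(gap_above.max(), gap_below.max())
    return within_1, within_2, ks_distance

# Simulated residuals, for illustration only.
rng = np.random.default_rng(4)
within_1, within_2, ks = normality_summary(rng.normal(size=500))
print(f"within 1 SD: {within_1:.1%}, within 2 SD: {within_2:.1%}, K-S distance: {ks:.3f}")
```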
The assumption of constant variance of the error term can be examined by plotting the standardized residuals against the standardized predicted values of the dependent variable, $\hat{Y}$. If the pattern is not random, the variance of the error term is not constant. Figure 17.7 shows a pattern whose variance is dependent upon the $\hat{Y}$ values.
A plot of residuals against time, or the sequence of observations, will throw some light on the assumption that the error terms are uncorrelated. A random pattern should be seen if this assumption is true. A plot like the one in Figure 17.8 indicates a linear relationship between residuals and time. A more formal procedure for examining the correlations between the error terms is the Durbin-Watson test.
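The Durbin-Watson statistic itself is simple to compute: it is the sum of squared differences between successive residuals divided by the sum of squared residuals. It ranges from 0 to 4, with values near 2 suggesting no first-order correlation between successive error terms. The sketch below uses two small made-up residual sequences, and the function name `durbin_watson` is an assumption for this example.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals ordered by time or sequence."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Made-up residual sequences, for illustration only.
trending = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])  # residuals rise steadily over time
alternating = np.array([1.0, -1.0, 1.0, -1.0])               # residuals flip sign each period

d_trending = durbin_watson(trending)
d_alternating = durbin_watson(alternating)
print(f"trending residuals:    d = {d_trending:.3f}")
print(f"alternating residuals: d = {d_alternating:.3f}")
```

A value well below 2, as for the trending sequence, points to positive correlation between successive residuals; a value well above 2 points to negative correlation.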
Plotting the residuals against the independent variables provides evidence of the appropriateness or inappropriateness of using a linear model. Again, the plot should result in a random pattern. The residuals should fall randomly, with relatively equal dispersion about 0. They should not display any tendency to be either positive or negative.
The plots and the residual table can be requested when the regression is run, for example, when using SPSS. You should conduct these analyses for multiple regression on the data of Table 17.1. From the histogram, it can be seen that five residuals are positive, whereas seven residuals are negative. By comparing the frequency distribution with the normal distribution that is plotted in the same output, you can see that the assumption of normality is probably not met but that the departure from normality might not be severe. Of course, one can do a more formal statistical test for normality if that is warranted.
All the standardized residuals are within plus or minus two standard deviations. Furthermore, many of the residuals are relatively small, which means that most of the model predictions are quite good.
The normal probability plot shows that the residuals are quite close to the 45-degree line shown in the graph. When you look at the plot of the standardized residuals against the standardized predicted values, no systematic pattern can be discerned in the spread of the residuals. Finally, the table of residual statistics indicates that all the standardized predicted values and all the standardized residuals are within plus or minus two standard deviations. Hence, we can conclude that multiple regression on the data of Table 17.1 does not appear to result in gross violations of the assumptions. This suggests that the relationship we are trying to predict is linear and that the error terms are more or less normally distributed.