November 9th, 2009, 07:35 AM
theodds

Quote:
Originally Posted by ontherocks
Ok great.
Now my next questions.

Q1. Which term in the model (for example in a multiple linear regression)
y = B_0 + B_1 x_1 + B_2 x_2 + ... +B_n x_n
gives the information about which variable is dominant and which is not?
I think the parameters (I mean the magnitude of the parameters) tell if the corresponding variable is dominant or not, am I right?
None of the terms in the model do. The magnitudes of the parameter estimates alone don't indicate which predictor is "dominant," because the X's may be measured on different scales: changing the units of x_1 (say, from meters to centimeters) changes B_1 by the same factor without changing the fit at all. If you standardized the predictors, you would have a better case for interpreting the betas this way. The first step, though, is usually figuring out which predictors are statistically significant.
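To make the scaling point concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the simulated data, the "true" coefficients 2 and 3, and the helper names `ols` and `standardize` are my own assumptions, not anything from the thread. It fits the same model with x_1 in meters and then in centimeters, and then with both predictors standardized.

```python
import random

def ols(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination
    and returns [b_0, b_1, ..., b_p]. Illustrative helper, not a library call.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[r][p] / A[r][r] for r in range(p)]

def standardize(col):
    """Center a column at 0 and scale it to sample standard deviation 1."""
    m = sum(col) / len(col)
    s = (sum((v - m) ** 2 for v in col) / (len(col) - 1)) ** 0.5
    return [(v - m) / s for v in col]

random.seed(0)
n = 200
x1_m = [random.gauss(0, 1) for _ in range(n)]   # x_1 recorded in meters
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * a + 3.0 * b + random.gauss(0, 0.5) for a, b in zip(x1_m, x2)]

b_m = ols([x1_m, x2], y)                        # x_1 in meters
x1_cm = [100.0 * v for v in x1_m]               # same variable, in centimeters
b_cm = ols([x1_cm, x2], y)
b_std = ols([standardize(x1_m), standardize(x2)], y)

print("x_1 in meters:      ", b_m)
print("x_1 in centimeters: ", b_cm)
print("standardized:       ", b_std)
```

The coefficient on x_1 shrinks by a factor of 100 when the units change, even though the variable and the fitted values are the same, while the coefficient on x_2 is untouched. Only after standardizing are the two slopes on a common scale (change in y per one standard deviation of the predictor).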

Quote:
Q2. Again a quote from wikipedia (Coefficient of determination - Wikipedia, the free encyclopedia)
"In many (but not all) instances where R^2 is used, the predictors are calculated by ordinary least-squares regression: that is, by minimizing SSerr. In this case R-squared increases as we increase the number of variables in the model (R^2 will not decrease)."
Could you explain why R^2 increases as the number of variables is increased and vice versa?
Before thinking about R^2, you should probably first learn about the concept of partitioning the overall variability of the response. This material always comes before discussing R^2 when learning regression. See ANOVA, particularly the section on partitioning sums of squares. Then, just know that in regression we partition the total variability as SS(Total) = SS(Regression) + SS(Error) instead, and we have the formula R^2 = \frac{SS(Regression)}{SS(Total)}.

The intuitive reason why R^2 never decreases is that, as we add more predictors, we can only ADD predictive power to the model. Least squares minimizes SS(Error), and the smaller model is just the larger model with the new coefficient fixed at 0, so the larger model can always do at least as well. The worst case scenario is that we've added a predictor that is unrelated, and in that case we should end up with effectively the same model anyway. The flip side is that having a lot of unnecessary predictors creates a host of problems relating to lack of parsimony and overfitting (among other things), so R^2 is a pretty awful criterion for determining how many predictors to have in your model.
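A small numerical check of both claims, again as a sketch under assumptions: the data are simulated, the extra predictor `noise` is unrelated to y by construction, and the helpers `ols` and `r_squared` are illustrative names of my own. It verifies the partition SS(Total) = SS(Regression) + SS(Error) and compares R^2 before and after adding the useless predictor.

```python
import random

def ols(columns, y):
    """Least-squares coefficients [b_0, ..., b_p] via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y], reduced by Gauss-Jordan elimination
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[r][p] / A[r][r] for r in range(p)]

def r_squared(columns, y):
    """Return (R^2, SS(Regression), SS(Error), SS(Total)) for an OLS fit."""
    b = ols(columns, y)
    n = len(y)
    fitted = [b[0] + sum(bj * col[i] for bj, col in zip(b[1:], columns))
              for i in range(n)]
    y_bar = sum(y) / n
    ss_total = sum((v - y_bar) ** 2 for v in y)
    ss_reg = sum((f - y_bar) ** 2 for f in fitted)
    ss_err = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return ss_reg / ss_total, ss_reg, ss_err, ss_total

random.seed(1)
n = 100
x1 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * a + random.gauss(0, 1) for a in x1]
noise = [random.gauss(0, 1) for _ in range(n)]  # generated independently of y

r2_small, ss_reg, ss_err, ss_tot = r_squared([x1], y)
r2_big, _, ss_err_big, _ = r_squared([x1, noise], y)

print("partition gap:", ss_reg + ss_err - ss_tot)  # zero up to rounding error
print("R^2 with x1 only:     ", r2_small)
print("R^2 with x1 and noise:", r2_big)
```

Even though `noise` has nothing to do with y, the second R^2 cannot come out lower than the first, since SS(Error) can only go down or stay the same. In practice it usually creeps up slightly, which is exactly why R^2 by itself rewards piling on predictors.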