problematic observations
Please attach all of your R code and the output you get. Justify your findings at every step.
(1) Run an OLS regression with price as the dependent variable, using the first 16
regressors. Include your summary output and the plot of standardized residuals against
leverage. Do you see any “problematic” observations? What would that mean? Manually
standardize the residuals and calculate how many standard deviations each one is from
the residual mean. Which observations have residuals that are more than 4 standard
deviations from the mean? Describe those observations and state whether you see
implausible values.
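
A minimal base R sketch of the steps in (1), assuming a data frame with a `price` column. The data here are simulated, and the names `dat`, `sqft`, and `age` are hypothetical stand-ins for the assignment's data set and its first 16 regressors.

```r
# Hypothetical simulated data; replace with the assignment's data set.
set.seed(1)
n <- 200
dat <- data.frame(sqft = runif(n, 500, 4000), age = runif(n, 0, 80))
dat$price <- 50000 + 120 * dat$sqft - 300 * dat$age + rnorm(n, sd = 20000)
dat$price[5] <- dat$price[5] * 10   # plant one implausible price for illustration

# OLS fit and summary output
fit <- lm(price ~ sqft + age, data = dat)
print(summary(fit))

# Standardized residuals against leverage (the fifth built-in lm diagnostic plot)
plot(fit, which = 5)

# Manual standardization: distance of each residual from the residual mean,
# measured in residual standard deviations
res <- resid(fit)
z <- (res - mean(res)) / sd(res)

# Observations more than 4 standard deviations from the mean
flagged <- which(abs(z) > 4)
print(dat[flagged, ])
```

Printing the flagged rows lets you compare each value against what is plausible for that variable before deciding whether to drop the observation.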
(2) After removing any observations for which you DID find implausible values in (1), run
the regression again. Use a Q-Q plot and a Jarque-Bera test to determine whether the
residuals appear to be normally distributed. Use a RESET test to determine whether a
linear functional form appears to be appropriate.
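
A sketch of the checks in (2), computed by hand in base R so that no extra packages are needed. The data are simulated and the names `dat` and `x` are hypothetical. Packaged versions of these tests exist, for example `jarque.bera.test()` in the tseries package and `resettest()` in the lmtest package, and their output may be preferable for the write-up.

```r
# Hypothetical simulated data with a quadratic true relationship (by construction)
set.seed(2)
n <- 300
dat <- data.frame(x = runif(n, 1, 10))
dat$price <- 10 + 2 * dat$x^2 + rnorm(n)

fit <- lm(price ~ x, data = dat)   # deliberately misspecified linear fit
res <- resid(fit)

# Q-Q plot of the residuals against the normal distribution
qqnorm(res)
qqline(res)

# Jarque-Bera statistic: JB = n/6 * (S^2 + (K - 3)^2 / 4),
# asymptotically chi-squared with 2 degrees of freedom under normality
m2 <- mean((res - mean(res))^2)
S  <- mean((res - mean(res))^3) / m2^1.5   # sample skewness
K  <- mean((res - mean(res))^4) / m2^2     # sample kurtosis
JB <- n / 6 * (S^2 + (K - 3)^2 / 4)
jb_p <- pchisq(JB, df = 2, lower.tail = FALSE)

# RESET: add squared and cubed fitted values, then F-test them jointly
dat$yhat <- fitted(fit)
fit_aug <- lm(price ~ x + I(yhat^2) + I(yhat^3), data = dat)
reset <- anova(fit, fit_aug)
reset_p <- reset[2, "Pr(>F)"]

print(c(JB = JB, jb_p = jb_p, reset_p = reset_p))
```

A small RESET p-value rejects the null that the added powers of the fitted values have zero coefficients, which is evidence against the linear functional form.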
(3) Which variables do you think might have nonlinear relationships with price? Why?
Which variables might require interaction terms? Why?
(4) Use a series of F tests to determine the best polynomial structure to use and whether
or not to include the interaction terms that you suggested in (3). Run an OLS regression
with your new, nonlinear model and include your summary output. Run the Jarque-Bera and
RESET tests again. Do the results change? Do you see evidence of misspecification?
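
A sketch of the nested-model F tests in (4), using `anova()` on pairs of `lm` fits. The data are simulated, and the names `dat`, `sqft`, and `age` and the choice of a squared term and one interaction are hypothetical; the actual candidates should come from your answer to (3).

```r
# Hypothetical simulated data: price is quadratic in sqft, with no true interaction
set.seed(3)
n <- 300
dat <- data.frame(sqft = runif(n, 5, 40), age = runif(n, 0, 80))
dat$price <- 50 + 3 * dat$sqft + 0.1 * dat$sqft^2 - 0.5 * dat$age + rnorm(n, sd = 5)

# Nested candidate models: linear, plus a squared term, plus an interaction
fit_lin  <- lm(price ~ sqft + age, data = dat)
fit_quad <- lm(price ~ sqft + I(sqft^2) + age, data = dat)
fit_int  <- lm(price ~ sqft + I(sqft^2) + age + sqft:age, data = dat)

# F test of the squared term: linear model against quadratic model
test_quad <- anova(fit_lin, fit_quad)
p_quad <- test_quad[2, "Pr(>F)"]

# F test of the interaction term: quadratic model against quadratic plus interaction
test_int <- anova(fit_quad, fit_int)
p_int <- test_int[2, "Pr(>F)"]

print(test_quad)
print(test_int)
```

Each comparison tests whether the added terms are jointly zero. Keep a term when its test rejects, then rerun the Jarque-Bera and RESET checks on the chosen model.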