# Statistics

Use the attached data to answer the following questions in Excel.

(1) Eliminate duplexes and properties with prices over \$850,000 from the data. Eliminate non-numeric variables and redundant variables from the data.
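The assignment asks for Excel, but the filtering logic can be sketched in Python as a way to check the result. The column names `type` and `price` and the sample rows are made up for illustration; the attached data may use different names. Redundant variables still have to be identified by inspection.

```python
# Hypothetical rows: the real column names and values in the attached data may differ.
rows = [
    {"type": "House", "price": 450000, "sqft": 2100, "city": "A"},
    {"type": "Duplex", "price": 300000, "sqft": 1800, "city": "B"},
    {"type": "House", "price": 900000, "sqft": 4000, "city": "A"},
    {"type": "Condo", "price": 850000, "sqft": 1500, "city": "C"},
]


def clean(rows, max_price=850_000):
    """Drop duplexes and properties priced over max_price, then keep numeric columns only."""
    kept = [
        r for r in rows
        if r["type"].lower() != "duplex" and r["price"] <= max_price
    ]
    return [
        {k: v for k, v in r.items()
         if isinstance(v, (int, float)) and not isinstance(v, bool)}
        for r in kept
    ]


cleaned = clean(rows)
```

A price of exactly \$850,000 is kept here, since the prompt excludes only prices over that amount.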

(2) Which variable correlates most strongly with price?
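One way to answer this is to compute the Pearson correlation coefficient between price and each remaining variable and pick the one with the largest absolute value (Excel's CORREL function gives the same coefficient). The sketch below assumes the cleaned data is held as a dictionary of columns; the column names and numbers are placeholders, not values from the attached data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def strongest(columns, target="price"):
    """Return (name, r) for the column with the largest |r| against the target."""
    y = columns[target]
    scores = {name: pearson(xs, y) for name, xs in columns.items() if name != target}
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores[best]


# Placeholder columns for illustration only.
columns = {
    "price": [200, 300, 400, 500],
    "sqft": [1000, 1500, 2000, 2500],
    "beds": [2, 3, 3, 4],
}
name, r = strongest(columns)
```

Comparing absolute values matters because a strongly negative correlation is as informative as a strongly positive one.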

(3) Find the regression line Y = β0 + β1x, using the variable chosen in the previous problem as x. [The lm function in R or the Analysis ToolPak add-in for Excel will do.]
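For reference, the least squares estimates for a single predictor have a closed form: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept follows from the means. A minimal Python sketch with made-up numbers (the function name `fit_line` is only illustrative):

```python
def fit_line(xs, ys):
    """Return (b0, b1) minimizing the sum of squared residuals of y = b0 + b1*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b0 = my - b1 * mx
    return b0, b1


# Made-up points lying exactly on y = 1 + 2x.
b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

The intercept and slope reported by R's lm or by the Excel Regression tool should agree with these formulas on the same data.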

For the remaining problems, consider the following variables associated with each property.

x1 = number of bedrooms

x2 = number of bathrooms

x3 = number of stories

x4 = square footage

x5 = whether the house has a pool

(4) Construct the multivariable least squares model with predictors x1, x2, x3, x4, x5. [First, convert x5 to binary.]
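The model can be sketched in plain Python by solving the normal equations X'Xb = X'y. This is an illustration, not the Excel procedure: it assumes the pool column holds text such as "Yes"/"No", the helper names are hypothetical, and the example uses only two stand-in predictors with made-up values to stay short. The same function accepts rows with all five predictors.

```python
def to_binary(value):
    """Map a yes/no style answer to 1/0 (assumes text like 'Yes' or 'No')."""
    return 1 if str(value).strip().lower() in {"yes", "y", "true", "1"} else 0


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_multiple(xs, y):
    """Least squares coefficients [b0, b1, ..., bk] from the normal equations."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up example: y was built as 10 + 2*a + 5*pool, so the fit should recover
# approximately those coefficients.
pool = ["No", "Yes", "No", "Yes", "Yes"]
a_vals = [1, 2, 3, 4, 5]
xs = [(a, to_binary(p)) for a, p in zip(a_vals, pool)]
y = [12, 19, 16, 23, 25]
coef = fit_multiple(xs, y)
```

Solving the normal equations directly is fine for a handful of predictors; statistical software typically uses more numerically stable factorizations.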

(5) Use a hypothesis test to determine whether the model is useful for predicting home values at significance level α. State the p-value and interpret it.
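The usual test for overall model usefulness is the global F-test. As a reminder of its form, with k = 5 predictors, n observations, SSR the regression sum of squares, and SSE the residual sum of squares (this notation is an assumption; the course may use different symbols):

$$
H_0:\ \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0
\qquad \text{vs.} \qquad
H_a:\ \text{at least one } \beta_j \neq 0
$$

$$
F_{\text{obs}} = \frac{\mathrm{SSR}/k}{\mathrm{SSE}/(n-k-1)},
\qquad
p = P\left(F_{k,\,n-k-1} \ge F_{\text{obs}}\right)
$$

The model is judged useful when p < α. In the Analysis ToolPak Regression output, this p-value appears in the ANOVA table as "Significance F".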

(6) Are any variables not useful predictors of home price at significance level α = 0.05? State the p-values of any rejected variables. What does this mean practically?
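Each predictor is normally assessed with its own t-test, holding the other predictors in the model. In the same assumed notation (k = 5 predictors, n observations), for coefficient j:

$$
H_0:\ \beta_j = 0
\qquad \text{vs.} \qquad
H_a:\ \beta_j \neq 0
$$

$$
t_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)},
\qquad
p_j = 2\,P\left(T_{n-k-1} \ge |t_j|\right)
$$

A variable with p_j ≥ 0.05 is not shown to add predictive value beyond the other variables already in the model. The Excel Regression output lists these values in the "P-value" column of the coefficient table.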
