Data Analysis with R Coursera Week 4 Quiz Answers
Practice Quiz
1. What are the key reasons to develop a model for your data analysis? Select three answers.
- Understand how the data were generated.
- Determine the relationships between variables.
- Identify any special structures that may exist in the data.
- Determine the accuracy of your data.
2. There are four assumptions associated with a linear regression model. What is the definition of the assumption homoscedasticity?
- Observations are independent of each other.
- For any fixed value of X, Y is normally distributed.
- The variance of the residuals is the same for any value of X.
- The relationship between X and the mean of Y is linear.
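For context, a minimal sketch of how you might eyeball the homoscedasticity assumption by plotting residuals against fitted values (this uses R's built-in mtcars data, not anything from the quiz):

```r
# Fit a simple model on R's built-in mtcars data (illustrative only)
fit <- lm(mpg ~ wt, data = mtcars)

# Residuals vs. fitted values: a roughly constant vertical spread
# across the x-axis suggests homoscedasticity
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```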
3. What step must you take before you can obtain a prediction based on a fitted simple linear regression model?
- Do nothing. Once you have a fitted simple linear regression model, you have all you need to make predictions.
- Use or create a data frame containing known target variables.
- Use or create a data frame containing never-before-seen data.
- Use or create a data frame containing known predictor variables.
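As an illustration (built-in mtcars data, not the quiz's dataset), predict() expects a data frame whose columns match the predictor variables used to fit the model:

```r
# Fit on built-in data, then predict at new predictor values
fit <- lm(mpg ~ wt, data = mtcars)

# The new data frame must contain the predictor column (wt),
# not the target (mpg)
new_points <- data.frame(wt = c(2.5, 3.0, 3.5))
predict(fit, newdata = new_points)
```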
4. Assume you have a dataset called “new_dataset”, two predictor variables called X and Y, and a target variable called Z, and you want to fit a multiple linear regression model. Which command should you use?
- linear_model <- lm(X + Y ~ Z, data = new_dataset)
- linear_model <- lm(Z ~ X ~ Y, data = new_dataset)
- linear_model <- lm(Z ~ X + Y, data = new_dataset)
- linear_model <- lm(X + Y + Z, data = new_dataset)
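A minimal sketch of the multiple-regression formula syntax from question 4, using simulated data under the hypothetical names from the question (X, Y, Z, new_dataset):

```r
# Simulated data so the example is self-contained
set.seed(1)
new_dataset <- data.frame(X = rnorm(100), Y = rnorm(100))
new_dataset$Z <- 1 + 2 * new_dataset$X - 0.5 * new_dataset$Y + rnorm(100)

# Target on the left of ~, predictors joined with + on the right
linear_model <- lm(Z ~ X + Y, data = new_dataset)
summary(linear_model)
```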
5. Which plot types help you validate assumptions about linearity? Select two answers.
- Residual plot
- Q-Q plot
- Scale-location plot
- Regression plot
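For reference, base R's plot() method on an lm object produces the standard diagnostic plots; a sketch on built-in data:

```r
fit <- lm(mpg ~ wt, data = mtcars)

plot(fit, which = 1)  # Residuals vs. fitted values (residual plot)
plot(fit, which = 3)  # Scale-location plot
```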
6. True or False: When using the poly() function to fit a polynomial regression model, you must specify “raw = FALSE” so you can get the expected coefficients.
- True.
- False.
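To see the difference the question is getting at, compare the coefficients from orthogonal polynomials (the poly() default, raw = FALSE) with raw polynomials (raw = TRUE); a sketch on built-in data:

```r
# Orthogonal polynomials (default): coefficients are not the
# familiar b0 + b1*x + b2*x^2 terms
coef(lm(mpg ~ poly(wt, 2), data = mtcars))

# Raw polynomials: coefficients correspond directly to x and x^2
coef(lm(mpg ~ poly(wt, 2, raw = TRUE), data = mtcars))
```

Both parameterizations produce identical fitted values; only the interpretation of the coefficients changes.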
7. Which performance metric for regression is the mean of the squared residuals (errors)?
- Root mean squared error (RMSE)
- Mean squared error (MSE)
- Mean absolute error (MAE)
- R-squared (R2)
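As a quick reference, a sketch computing these metrics directly from the residuals of a fitted model (built-in data):

```r
fit <- lm(mpg ~ wt, data = mtcars)
res <- resid(fit)

mse  <- mean(res^2)     # Mean squared error
rmse <- sqrt(mse)       # Root mean squared error
mae  <- mean(abs(res))  # Mean absolute error
c(MSE = mse, RMSE = rmse, MAE = mae)
```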
8. When comparing the MSE of different models, do you want the highest or lowest value of MSE?
- Highest value of MSE
- Lowest value of MSE
Graded Quiz
9. In model development, you can build more accurate models when you have which of the following?
- More dependent variables.
- Relevant data.
- Fewer independent variables.
- Larger quantities of data.
10. Assume you have a dataset called “new_dataset”, a predictor variable called X, and a target variable called Y, and you want to fit a simple linear regression model. Which command should you use?
- linear_model <- lm(X ~ Y, data = new_dataset)
- linear_model <- lm(Y ~ X, data = new_dataset)
- linear_model <- predict(Y ~ Z, data = new_dataset)
- linear_model <- predict(X ~ Y, data = new_dataset)
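A minimal sketch of the simple-linear-regression syntax from question 10, again with simulated data under the hypothetical names X, Y, and new_dataset:

```r
set.seed(2)
new_dataset <- data.frame(X = rnorm(100))
new_dataset$Y <- 3 + 1.5 * new_dataset$X + rnorm(100)

# Target Y on the left of ~, single predictor X on the right
linear_model <- lm(Y ~ X, data = new_dataset)
coef(linear_model)
```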
11. When using the predict() function in R, what is the default confidence level?
- 100%
- 85%
- 95%
- 90%
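For reference, predict() on an lm object only returns an interval when you request one, and its level argument defaults to 0.95; a sketch on built-in data:

```r
fit <- lm(mpg ~ wt, data = mtcars)
new_points <- data.frame(wt = c(2.5, 3.5))

# Default level = 0.95, i.e. a 95% interval
predict(fit, newdata = new_points, interval = "confidence")

# Explicitly request a 90% interval instead
predict(fit, newdata = new_points, interval = "confidence", level = 0.90)
```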
12. Which plot type helps you validate assumptions about normality?
- Q-Q plot
- Regression plots
- Scale-location plot
- Residual plot
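A sketch of drawing a Q-Q plot of the residuals to check the normality assumption (built-in data; either the lm plot method or qqnorm() works):

```r
fit <- lm(mpg ~ wt, data = mtcars)

# Option 1: the lm plot method's Normal Q-Q panel
plot(fit, which = 2)

# Option 2: build the Q-Q plot directly from the residuals
qqnorm(resid(fit))
qqline(resid(fit))
```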
13. A third order polynomial regression model is described as which of the following?
- Cubic, meaning that the predictor variable in the model is cubed.
- Squared, meaning that the predictor variable in the model is squared.
- Quadratic, meaning that the predictor variable in the model is squared.
- Simple linear regression.
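Two equivalent ways to write a third-order (cubic) polynomial model in R, sketched on built-in data:

```r
# Using poly() with degree 3 (raw = TRUE for directly interpretable terms)
cubic_fit <- lm(mpg ~ poly(wt, 3, raw = TRUE), data = mtcars)

# Equivalent formula spelling out x, x^2, and x^3 with I()
cubic_fit_alt <- lm(mpg ~ wt + I(wt^2) + I(wt^3), data = mtcars)
```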
14. How should you interpret an R-squared result of 0.89?
- There is a strong negative correlation between the variables.
- 89% of the response variable variation is explained by a linear model.
- The X variable causes the Y variable to positively change 89% of the time.
- 89% of the response variable variation is explained by a polynomial model.
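R-squared is reported by summary() and can be pulled out directly; a short sketch:

```r
fit <- lm(mpg ~ wt, data = mtcars)

# Proportion of the response variance explained by the model
summary(fit)$r.squared
```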
15. When comparing linear regression models, when will the mean squared error (MSE) be smaller?
- This depends on your data. The model that fits the data better has the smaller MSE.
- When using a multiple linear regression (MLR) model.
- When using a polynomial regression model.
- When using a simple linear regression (SLR) model.
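A sketch of how you might compare MSE across candidate models fitted to the same data (built-in mtcars, not the quiz's data); whichever model fits the data better shows the smaller MSE:

```r
slr <- lm(mpg ~ wt, data = mtcars)       # simple linear regression
mlr <- lm(mpg ~ wt + hp, data = mtcars)  # multiple linear regression

c(SLR_MSE = mean(resid(slr)^2),
  MLR_MSE = mean(resid(mlr)^2))
```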