Regression Analysis: Simplify Complex Data Relationships (Coursera): Weekly Challenge 3 Answers
Test your knowledge: Understand multiple linear regression
1. Fill in the blank: _____ is a technique that estimates the linear relationship between one continuous dependent variable and two or more independent variables.
- Singular curved regression
- Multiple curved regression
- Singular linear regression
- Multiple linear regression
2. What are ways to ethically communicate multiple regression results as clearly as possible? Select all that apply.
- Ordinary least squares
- One hot encoding
- Interaction terms
- Confidence band
Test your knowledge: Model assumptions revisited
3. Which of the following statements is true? Select all that apply.
- One hot encoding is a data transformation technique.
- One hot encoding is a categorical transformation technique.
- One hot encoding allows data professionals to turn several categorical variables into one binary variable.
- One hot encoding allows data professionals to turn one categorical variable into several binary variables.
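As a rough illustration of the one hot encoding idea in question 3, the sketch below turns one categorical variable into several binary variables using only the Python standard library. The function name `one_hot` and the sample `colors` data are hypothetical, chosen for illustration; in practice a library helper such as pandas `get_dummies` is commonly used instead.

```python
def one_hot(values):
    """Turn one categorical variable into several binary (0/1) variables."""
    categories = sorted(set(values))  # one new column per distinct category
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical sample data: one categorical variable with three levels.
colors = ["red", "green", "blue", "green"]
encoded = one_hot(colors)
for category, column in encoded.items():
    print(category, column)
```

Each observation gets a 1 in exactly one of the new columns and a 0 in the others.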
4. What is the definition of the no multicollinearity assumption?
- No two independent variables can be highly correlated with each other.
- No observation in the dataset can be independent.
- No variation of the residuals can be constant or similar across the model.
- No predictor variable can be linearly related to the outcome variable.
5. In what ways might a data professional handle data with multicollinearity? Select all that apply.
- Create new variables using existing data.
- Square the variables that have high multicollinearity.
- Turn one categorical variable into several binary variables.
- Drop one or more variables that have high multicollinearity.
Test your knowledge: Model interpretation
6. Fill in the blank: An interaction term represents how the relationship between two independent variables is associated with the changes in the _____ of the dependent variable.
- category
- multicollinearity
- rate of change
- mean
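To make question 6 concrete, here is a minimal sketch of a model with an interaction term, y = b0 + b1*x1 + b2*x2 + b3*x1*x2. The coefficient values and the helper names `predict` and `slope_of_x1` are made up for illustration. The point is that the change in y per unit of x1 is b1 + b3*x2, so it depends on the value of x2.

```python
# Hypothetical coefficients, not estimated from any real data.
B0, B1, B2, B3 = 1.0, 2.0, 0.5, 3.0


def predict(x1, x2):
    """Prediction from a linear model with an x1*x2 interaction term."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2


def slope_of_x1(x2):
    """Change in the prediction per one-unit increase in x1, at a given x2."""
    return predict(1.0, x2) - predict(0.0, x2)


print(slope_of_x1(0.0))  # equals B1 when x2 is 0
print(slope_of_x1(1.0))  # equals B1 + B3 when x2 is 1
```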
7. Which of the following relevant statistics can be found by using the statsmodels OLS function? Select all that apply.
- P-values
- Variance inflation factors
- Coefficients
- Standard errors
Test your knowledge: Variable selection and model evaluation
8. Fill in the blank: Adjusted R squared is a variation of the R squared regression evaluation metric that _____ unnecessary explanatory variables.
- adds
- rewards
- eliminates
- penalizes
9. Which of the following statements accurately describe the differences between adjusted R squared and R squared? Select all that apply.
- Adjusted R squared is easily interpretable.
- R squared is more easily interpretable.
- R squared is used to compare models of varying complexity.
- Adjusted R squared is used to compare models of varying complexity.
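The relationship between the two metrics in questions 8 and 9 can be sketched with the usual formula, adjusted R squared = 1 - (1 - R squared) * (n - 1) / (n - p - 1), where n is the number of observations and p is the number of independent variables. The function name `adjusted_r_squared` and the sample numbers are illustrative assumptions.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R squared for a model with an intercept."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Same R squared, different model sizes (hypothetical numbers).
small_model = adjusted_r_squared(0.80, n_obs=50, n_predictors=3)
large_model = adjusted_r_squared(0.80, n_obs=50, n_predictors=10)
print(small_model, large_model)  # the larger model is penalized more
```

With a low R squared and many predictors the formula can return a negative value, so the 0 to 1 range is typical rather than guaranteed.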
10. What variable selection process begins with the full model that has all possible independent variables?
- Backward elimination
- Forward selection
- F-test
- Extra-sum-of-squares
11. Which of the following are regularized regression techniques? Select all that apply.
- Lasso regression
- Ridge regression
- Elastic-net regression
- F-test regression
Weekly challenge 3
12. Multiple linear regression estimates the linear relationship between one continuous dependent variable and how many independent variables?
- One
- Two or more
- Zero
- Two
13. Fill in the blank: One hot encoding is a data transformation technique that turns one categorical variable into several _____ variables.
- dependent
- overfit
- binary
- independent
14. Fill in the blank: The no multicollinearity assumption states that no two _____ variables can be highly correlated with each other.
- independent
- dependent
- continuous
- categorical
15. A data professional creates a model that allows for flexibility and complexity, so it learns from existing data. What quality does this model have?
- Variance
- Bias
- Elimination
- Selection
16. What regularization technique completely removes variables that are less important to predicting the y variable of interest?
- Lasso regression
- Ridge regression
- Elastic net regression
- Independent regression
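Why lasso can remove a variable completely while ridge only shrinks it can be sketched in the simplified case of an orthonormal design, where each coefficient is penalized independently. Under that assumption, and with the penalties scaled as noted in the docstrings, lasso applies soft thresholding to the ordinary least squares coefficient and ridge divides it by a constant. The function names and the penalty value 0.5 are illustrative.

```python
import math


def lasso_shrink(ols_coef, lam):
    """Soft thresholding: minimizes 0.5*(b - ols_coef)**2 + lam*abs(b)."""
    return math.copysign(max(abs(ols_coef) - lam, 0.0), ols_coef)


def ridge_shrink(ols_coef, lam):
    """Proportional shrinkage: minimizes 0.5*(b - ols_coef)**2 + 0.5*lam*b**2."""
    return ols_coef / (1.0 + lam)


for coef in (2.0, 0.3):
    print(coef, lasso_shrink(coef, 0.5), ridge_shrink(coef, 0.5))
```

The small coefficient is set exactly to zero by the lasso rule but stays nonzero under the ridge rule. Elastic net combines both penalties.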
17. What technique estimates the linear relationship between one continuous dependent variable and two or more independent variables?
- Interaction terms
- Simple linear regression
- Multiple linear regression
- One hot encoding
18. A data professional confirms that no two independent variables are highly correlated with each other. Which assumption are they testing for?
- No multicollinearity assumption
- No linearity assumption
- No normality assumption
- No homoscedasticity assumption
19. Which of the following statements accurately describe forward selection and backward elimination? Select all that apply.
- Backward elimination begins with the full model with all possible independent variables.
- Forward selection begins with the full model with all possible dependent variables.
- Forward selection begins with the full model with all possible independent variables.
- Forward selection begins with the null model and zero independent variables.
20. A data professional uses a regression technique that estimates the linear relationship between one continuous dependent variable and two or more independent variables. What technique are they using?
- Coefficient regression
- Multiple linear regression
- Simple linear regression
- Interaction regression
21. Which of the following is true regarding variance inflation factors? Select all that apply.
- The larger the variance inflation factor, the more multicollinearity in the model.
- The minimum value is 0.
- The minimum value is 1.
- The larger the variance inflation factor, the less multicollinearity in the model.
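The variance inflation factor for a predictor j is 1 / (1 - R_j squared), where R_j squared comes from regressing that predictor on the other predictors. With exactly two predictors, R_j squared is the squared Pearson correlation between them, which the sketch below relies on. The helper names and the sample data are hypothetical.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(xs, ys):
    """VIF shared by both predictors in a two-predictor model."""
    return 1.0 / (1.0 - pearson_r(xs, ys) ** 2)


uncorrelated = vif_two_predictors([1, 2, 3, 4], [1, -1, -1, 1])
correlated = vif_two_predictors([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.9])
print(uncorrelated, correlated)
```

Because R_j squared cannot be below 0, the factor cannot be below 1, and it grows as the predictors become more strongly correlated.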
22. A data professional reviews model predictions. During the review, they notice a model that oversimplifies the relationship and underfits the observed data, which generates inaccurate estimates. What quality does this model have?
- Variance
- Elimination
- Bias
- Selection
23. What term represents the relationship for how two variables’ values affect each other?
- Feature selection term
- Linearity term
- Underfitting term
- Interaction term
24. Which of the following statements accurately describe adjusted R squared? Select all that apply.
- It is a regression evaluation metric.
- It penalizes unnecessary explanatory variables.
- It is greater than 1.
- It can vary from 0 to 1.
25. What regularization technique is recommended when working with large datasets and when there is uncertainty as to whether variables should drop out of the model?
- Backward regression
- Elastic net regression
- Ridge regression
- Lasso regression