Regression Analysis: Simplify Complex Data Relationships (Coursera): Weekly challenge 3 answers

Test your knowledge: Understand multiple linear regression

1. Fill in the blank: _____ is a technique that estimates the linear relationship between one continuous dependent variable and two or more independent variables.

  • Singular curved regression
  • Multiple curved regression
  • Singular linear regression
  • Multiple linear regression

2. What are ways to ethically communicate multiple regression results as clearly as possible? Select all that apply.

  • Ordinary least squares
  • One hot encoding
  • Interaction terms
  • Confidence band

Test your knowledge: Model assumptions revisited

3. Which of the following statements is true? Select all that apply.

  • One hot encoding is a data transformation technique.
  • One hot encoding is a categorical transformation technique.
  • One hot encoding allows data professionals to turn several categorical variables into one binary variable.
  • One hot encoding allows data professionals to turn one categorical variable into several binary variables.
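
As an illustration of the idea in question 3, here is a minimal sketch of one hot encoding in plain Python. The function name `one_hot_encode`, the `is_` column prefix, and the sample `colors` column are hypothetical and chosen only for this example; in practice a library routine would normally be used instead.

```python
def one_hot_encode(values):
    """Turn one categorical column into several binary (0/1) columns."""
    categories = sorted(set(values))  # one new column per distinct category
    return {
        f"is_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Hypothetical categorical column with three distinct categories.
colors = ["red", "green", "blue", "green"]
encoded = one_hot_encode(colors)

for name, column in encoded.items():
    print(name, column)
```

Each row has a 1 in exactly one of the new columns. When such columns are used in a regression with an intercept, one of them is usually dropped so that the remaining columns are not perfectly collinear.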

4. What is the definition of the no multicollinearity assumption?

  • No two independent variables can be highly correlated with each other.
  • No observation in the dataset can be independent.
  • No variation of the residuals can be constant or similar across the model.
  • No predictor variable can be linearly related to the outcome variable.

5. In what ways might a data professional handle data with multicollinearity? Select all that apply.

  • Create new variables using existing data.
  • Square the variables that have high multicollinearity.
  • Turn one categorical variable into several binary variables.
  • Drop one or more variables that have high multicollinearity.

Test your knowledge: Model interpretation

6. Fill in the blank: An interaction term represents how the relationship between two independent variables is associated with the changes in the _____ of the dependent variable.

  • category
  • multicollinearity
  • rate of change
  • mean
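
To make the interaction term in question 6 concrete, the sketch below uses a model of the form y = b0 + b1*x1 + b2*x2 + b3*(x1*x2). All coefficient values are made up for illustration. It shows that the change in the predicted outcome per unit of x1 depends on the value of x2.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Predicted y for a model with an interaction term b3 * x1 * x2.

    The coefficient values are hypothetical, not estimated from data.
    """
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_in_x1(x2):
    """Change in predicted y for a one-unit increase in x1, holding x2 fixed."""
    return predict(1.0, x2) - predict(0.0, x2)


slope_when_x2_is_0 = slope_in_x1(0.0)  # equals b1
slope_when_x2_is_1 = slope_in_x1(1.0)  # equals b1 + b3

print(slope_when_x2_is_0, slope_when_x2_is_1)
```

Without the interaction term (b3 = 0), both slopes would be equal to b1 regardless of x2.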

7. Which of the following relevant statistics can be found by using statsmodels' OLS function? Select all that apply.

  • P-values
  • Variance inflation factors
  • Coefficients
  • Standard errors

Test your knowledge: Variable selection and model evaluation

8. Fill in the blank: Adjusted R squared is a variation of the R squared regression evaluation metric that _____ unnecessary explanatory variables.

  • adds
  • rewards
  • eliminates
  • penalizes

9. Which of the following statements accurately describe the differences between adjusted R squared and R squared? Select all that apply.

  • Adjusted R squared is easily interpretable.
  • R squared is more easily interpretable.
  • R squared is used to compare models of varying complexity.
  • Adjusted R squared is used to compare models of varying complexity.
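
The comparison in questions 8 and 9 can be sketched with the usual formula, adjusted R squared = 1 - (1 - R squared) * (n - 1) / (n - p - 1), where n is the number of observations and p the number of independent variables. The R squared values, sample size, and variable counts below are hypothetical and not taken from any real model.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R squared from R squared, sample size, and predictor count."""
    penalty = (n_observations - 1) / (n_observations - n_predictors - 1)
    return 1.0 - (1.0 - r_squared) * penalty


# Hypothetical models fitted on the same 50 observations.
adj_small = adjusted_r_squared(r_squared=0.80, n_observations=50, n_predictors=3)
adj_large = adjusted_r_squared(r_squared=0.81, n_observations=50, n_predictors=10)

print(round(adj_small, 4), round(adj_large, 4))
```

In this made-up case the larger model has the slightly higher R squared but the lower adjusted R squared, because the seven extra variables add little explanatory power.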

10. What variable selection process begins with the full model that has all possible independent variables?

  • Backward elimination
  • Forward selection
  • F-test
  • Extra-sum-of-squares

11. Which of the following are regularized regression techniques? Select all that apply.

  • Lasso regression
  • Ridge regression
  • Elastic-net regression
  • F-test regression

Weekly challenge 3

12. Multiple linear regression estimates the linear relationship between one continuous dependent variable and how many independent variables?

  • One
  • Two or more
  • Zero
  • Two

13. Fill in the blank: One hot encoding is a data transformation technique that turns one categorical variable into several _____ variables.

  • dependent
  • overfit
  • binary
  • independent

14. Fill in the blank: The no multicollinearity assumption states that no two _____ variables can be highly correlated with each other.

  • independent
  • dependent
  • continuous
  • categorical

18. A data professional creates a model that allows for flexibility and complexity, so it learns from existing data. What quality does this model have?

  • Variance
  • Bias
  • Elimination
  • Selection

19. What regularization technique completely removes variables that are less important to predicting the y variable of interest?

  • Lasso regression
  • Ridge regression
  • Elastic net regression
  • Independent regression
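
The difference between lasso and ridge regression in question 19 can be sketched for the simplified case of an orthonormal design, where each penalized coefficient has a closed form in terms of the ordinary least squares coefficient. This is an assumption made for illustration, and the exact scaling of the penalty depends on how the objective is written. The coefficient values and the penalty are hypothetical.

```python
def lasso_shrink(coef, penalty):
    """Soft-thresholding: coefficients within the penalty become exactly 0."""
    if abs(coef) <= penalty:
        return 0.0
    return coef - penalty if coef > 0 else coef + penalty


def ridge_shrink(coef, penalty):
    """Proportional shrinkage: coefficients get smaller but stay nonzero."""
    return coef / (1.0 + penalty)


# Hypothetical least squares coefficients and penalty strength.
ols_coefs = [3.0, -0.4, 0.2]
penalty = 0.5

lasso_coefs = [lasso_shrink(c, penalty) for c in ols_coefs]
ridge_coefs = [ridge_shrink(c, penalty) for c in ols_coefs]

print(lasso_coefs)
print(ridge_coefs)
```

Under these assumptions, lasso sets the two small coefficients to zero, which removes those variables from the model, while ridge keeps all three with reduced magnitude.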

20. What technique estimates the linear relationship between one continuous dependent variable and two or more independent variables?

  • Interaction terms
  • Simple linear regression
  • Multiple linear regression
  • One hot encoding

21. A data professional confirms that no two independent variables are highly correlated with each other. Which assumption are they testing for?

  • No multicollinearity assumption
  • No linearity assumption
  • No normality assumption
  • No homoscedasticity assumption

22. Which of the following statements accurately describe forward selection and backward elimination? Select all that apply.

  • Backward elimination begins with the full model with all possible independent variables.
  • Forward selection begins with the full model with all possible dependent variables.
  • Forward selection begins with the full model with all possible independent variables.
  • Forward selection begins with the null model and zero independent variables.

23. A data professional uses a regression technique that estimates the linear relationship between one continuous dependent variable and two or more independent variables. What technique are they using?

  • Coefficient regression
  • Multiple linear regression
  • Simple linear regression
  • Interaction regression

24. Which of the following is true regarding variance inflation factors? Select all that apply.

  • The larger the variance inflation factor, the more multicollinearity in the model.
  • The minimum value is 0.
  • The minimum value is 1.
  • The larger the variance inflation factor, the less multicollinearity in the model.
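
The behaviour of the variance inflation factor (VIF) in question 24 can be sketched for the special case of exactly two independent variables, where VIF = 1 / (1 - r^2) and r is the Pearson correlation between them. The helper names and the data below are hypothetical. With more than two predictors, r^2 is replaced by the R squared from regressing one predictor on all the others.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictors: one uncorrelated with x1, one nearly a copy of it.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_uncorrelated = [1.0, -2.0, 0.0, 2.0, -1.0]
x2_correlated = [1.1, 1.9, 3.2, 3.9, 5.1]

vif_low = vif_two_predictors(x1, x2_uncorrelated)
vif_high = vif_two_predictors(x1, x2_correlated)

print(vif_low, vif_high)
```

The uncorrelated pair gives the minimum possible VIF of 1, and the nearly identical pair gives a much larger value.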

25. A data professional reviews model predictions. During the review, they notice a model that oversimplifies the relationship and underfits the observed data, which generates inaccurate estimates. What quality does this model have?

  • Variance
  • Elimination
  • Bias
  • Selection

26. What term represents how two variables' values affect each other?

  • Feature selection term
  • Linearity term
  • Underfitting term
  • Interaction term

27. Which of the following statements accurately describe adjusted R squared? Select all that apply.

  • It is a regression evaluation metric.
  • It penalizes unnecessary explanatory variables.
  • It is greater than 1.
  • It can vary from 0 to 1.

28. What regularization technique is recommended when working with large datasets and when there is uncertainty as to whether variables should drop out of the model?

  • Backward regression
  • Elastic net regression
  • Ridge regression
  • Lasso regression
