The Nuts and Bolts of Machine Learning: Coursera Week 2 Quiz Answers
Test your knowledge: PACE in machine learning: The plan and analyze stages
1. Fill in the blank: Feature engineering enables data professionals to take _____ and extract features from it.
- delimited text
- a dynamic dashboard
- a code chunk
- raw data
2. What term describes the process of modifying existing features in a way that improves accuracy when training a model?
- Feature selection
- Feature extraction
- Feature improvement
- Feature transformation
3. A class imbalance occurs when a dataset has a predictor variable that contains an equal number of instances of all possible outcomes.
- True
- False
Test your knowledge: PACE in machine learning: The construct and execute stages
4. Fill in the blank: Posterior probability is the probability of an event occurring after considering _____ information.
- historical
- undefined
- conditional
- new
5. A data professional would use the function MinMaxScaler to normalize the columns in a model so that each value falls between zero and one.
- True
- False
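To make the scaling in question 5 concrete, here is a minimal plain-Python sketch of the arithmetic that min-max normalization performs on a single column. The helper name `min_max_scale` and the sample values are made up for illustration; this is not scikit-learn's implementation, only the formula (value - min) / (max - min).

```python
def min_max_scale(values):
    """Rescale a column so its minimum maps to 0 and its maximum maps to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread, so this sketch maps every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [18, 30, 42, 66]  # made-up sample column
scaled = min_max_scale(ages)
print(scaled)
```

Every scaled value lands between zero and one, with the smallest input at zero and the largest at one.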
6. A data professional has built a model, and now they are adjusting how features are engineered in order to improve performance. Which PACE stage does this scenario describe?
- Plan
- Execute
- Analyze
- Construct
Weekly challenge 2
7. Which of the following statements accurately describe the general categories of feature engineering? Select all that apply.
- Feature transformation involves modifying existing features in a way that improves accuracy when training a model.
- Feature extraction involves choosing the features in the data that contribute the most to predicting the response variable.
- The three general categories of feature engineering are selection, extraction, and transformation.
- Feature selection involves taking multiple features to create a new one that will improve the accuracy of the algorithm.
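The three categories named in question 7 can be illustrated with a small sketch. The customer records, field names, and the `engineer` helper below are all hypothetical; the point is only to show one plausible instance each of extraction, transformation, and selection.

```python
import math

# Made-up customer records for illustration.
customers = [
    {"id": 101, "total_spend": 1200.0, "orders": 12},
    {"id": 102, "total_spend": 90.0, "orders": 3},
]


def engineer(row):
    # Extraction: combine two existing features into a new one.
    avg_order_value = row["total_spend"] / row["orders"]
    # Transformation: modify an existing feature (log compresses large values).
    log_spend = math.log(row["total_spend"])
    # Selection: keep only the features judged useful; "id" is dropped.
    return {
        "avg_order_value": avg_order_value,
        "log_spend": log_spend,
        "orders": row["orders"],
    }


features = [engineer(row) for row in customers]
print(features)
```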
8. Which of the following datasets contains a class imbalance that will likely create a problem during analysis?
- A dataset whose majority class comprises 70% of the data and minority class comprises 30% of the data
- A dataset whose majority class comprises 90% of the data and minority class comprises 10% of the data
- A dataset whose classes are split equally, each comprising 50% of the data
- A dataset whose majority class comprises 60% of the data and minority class comprises 40% of the data
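A quick way to see splits like the ones in question 8 is to compute each class's share of the labels. This is a sketch under assumed data: the label names and the 90/10 split are invented, and `class_shares` is a hypothetical helper.

```python
from collections import Counter


def class_shares(labels):
    """Return each class's fraction of the total number of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Made-up outcome column: 90 customers stayed, 10 churned.
labels = ["stayed"] * 90 + ["churned"] * 10
shares = class_shares(labels)
print(shares)
```

Whether a given split is a problem depends on the project; the sketch only reports the proportions.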
9. Fill in the blank: Customer churn is a business term that describes how many customers stop _____ and at what rate this occurs.
- doing business with a company
- writing positive reviews about a company
- returning items to a company
- contacting a company’s customer relations department
10. Bayes’s theorem enables data professionals to calculate posterior probability for a data project. What does posterior probability describe?
- The likelihood of an event occurring after taking into consideration only the most suitable observations and information
- The likelihood of an event occurring after taking into consideration all new, relevant observations and information
- The likelihood of an event occurring based upon the observations and information that were available at the start of the data project
- The likelihood of an event occurring based upon only observations and information that align with current hypotheses
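As a worked illustration of posterior probability, the sketch below applies Bayes's theorem to a binary hypothesis. All numbers are invented, and the function and parameter names (`posterior`, `prior`, `likelihood`, `false_positive_rate`) are assumptions made for this example.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes's theorem for a binary hypothesis H and evidence E."""
    # Total probability of seeing the evidence, whether or not H is true.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Made-up numbers: 20% of customers churn (prior), 60% of churners filed
# a complaint, and 10% of non-churners filed a complaint.
p_churn_given_complaint = posterior(
    prior=0.2, likelihood=0.6, false_positive_rate=0.1
)
print(round(p_churn_given_complaint, 3))
```

With these assumed inputs, learning that a customer complained moves the churn estimate from the 20% prior to roughly 60%.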
11. Fill in the blank: When normalizing the columns in a dataset using MinMaxScaler, the columns’ maximum value scales to one, and the minimum value scales to _____. Everything else falls somewhere in between.
- 0.1
- 0.5
- -1
- 0
12. A data professional is assessing the business need in order to determine what type of model is best suited to a project. Which PACE stage does this scenario describe?
- Execute
- Construct
- Analyze
- Plan
13. In the model-development process, which type of feature does not contain any useful information for predicting the target variable?
- Relevant
- Predictive
- Irrelevant
- Conducive
14. Fill in the blank: Log normalization is useful when working with a model that cannot manage continuous variables with _____ distributions.
- normal
- skewed
- probability
- binomial
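The effect of log normalization on a long-tailed column can be sketched as follows. The income values are made up, `spread_ratio` is a hypothetical helper used only as a rough gauge of how far the largest value sits from the median, and the natural log assumes all values are positive.

```python
import math

# Made-up, right-skewed column: one value is far larger than the rest.
incomes = [20_000, 35_000, 50_000, 80_000, 2_000_000]
logged = [math.log(x) for x in incomes]  # requires strictly positive values


def spread_ratio(xs):
    """Largest value divided by the median: a rough gauge of right skew."""
    ordered = sorted(xs)
    median = ordered[len(ordered) // 2]
    return ordered[-1] / median


print(spread_ratio(incomes))
print(spread_ratio(logged))
```

After the transform, the extreme value sits much closer to the rest of the column.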
15. What occurs when a dataset has a predictor variable that contains more instances of one outcome than another?
- Incompatibility
- Class imbalance
- Redundancy
- Inconsistent data
16. Fill in the blank: Customer churn is the business term that describes how many customers stop _____ and at what rate this occurs.
- using a product or service
- sharing feedback with a company
- researching a company’s offerings
- reviewing items online
17. Naive Bayes is a supervised classification technique that assumes independence among predictors. What is the meaning of this concept?
- The value of a predictor variable on a given class is not affected by the values of other predictors.
- The value of a predictor variable on a given class is equal to the values of other predictors.
- The value of a predictor variable on a given class is measured by the values of other predictors.
- The value of a predictor variable on a given class is dependent upon the values of other predictors.
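The independence assumption in question 17 means the per-feature likelihoods for a class can simply be multiplied together. The sketch below shows that calculation with invented probabilities; the class names, feature names, and the `naive_bayes_posteriors` helper are all hypothetical.

```python
def naive_bayes_posteriors(priors, likelihoods, observed):
    """Score each class as prior * product of per-feature likelihoods, then normalize."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for feature, value in observed.items():
            # Independence assumption: each feature contributes its own factor,
            # regardless of the values of the other features.
            score *= likelihoods[cls][feature][value]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


# Made-up probabilities for illustration.
priors = {"churn": 0.2, "stay": 0.8}
likelihoods = {
    "churn": {"complained": {True: 0.6}, "monthly_plan": {True: 0.5}},
    "stay": {"complained": {True: 0.1}, "monthly_plan": {True: 0.25}},
}
observed = {"complained": True, "monthly_plan": True}

result = naive_bayes_posteriors(priors, likelihoods, observed)
print(result)
```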
18. Which of the following statements accurately describe feature engineering? Select all that apply.
- Feature engineering does not involve using a data professional’s statistical knowledge.
- In feature engineering, feature extraction involves taking multiple features to create a new one that will improve the accuracy of the algorithm.
- In feature engineering, feature selection involves choosing the features in the data that contribute the most to predicting the response variable.
- Feature engineering may involve transforming the properties of raw data.
19. What does Bayes’s theorem enable data professionals to calculate?
- Margin of error
- Data accuracy
- Posterior probability
- Causation
20. Fill in the blank: When using MinMaxScaler to _____ the columns in a dataset, a data professional must fit the scaler to the training data and then transform both the training data and the test data using that same scaler.
- filter
- customize
- sort
- normalize
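The fit-on-training, transform-both workflow described in question 20 can be sketched with a small stand-in class. `SimpleMinMaxScaler` is a hypothetical plain-Python imitation written for this example, not scikit-learn's `MinMaxScaler`, and the data values are made up.

```python
class SimpleMinMaxScaler:
    """Hypothetical stand-in that learns min and max from one dataset only."""

    def fit(self, values):
        # Learn the scaling parameters from the data passed in (the training set).
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        # Reuse the learned parameters; fall back to 1 to avoid dividing by zero.
        span = (self.hi - self.lo) or 1
        return [(v - self.lo) / span for v in values]


train = [10, 20, 30, 50]  # made-up training column
test = [20, 60]           # made-up test column

scaler = SimpleMinMaxScaler().fit(train)   # fit on training data only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)       # same scaler, no refitting
print(train_scaled, test_scaled)
```

Because the parameters come from the training data alone, a test value larger than the training maximum scales to more than one; that is expected and avoids leaking test information into the scaler.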
21. A data professional is evaluating a model’s performance and considering how it can be improved. Which PACE stage does this scenario describe?
- Plan
- Construct
- Analyze
- Execute
22. In the model-development process, which type of feature is not useful by itself for predicting the target variable, but becomes predictive in conjunction with other features?
- Predictive
- Interactive
- Redundant
- Irrelevant
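A feature that is uninformative alone but predictive in combination, as described in question 22, can be illustrated with a toy XOR-style dataset. The rows and the `accuracy` helper are invented for this sketch.

```python
# Made-up rows of (a, b, target), where target is 1 only when a and b differ.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]


def accuracy(rows, predict):
    """Fraction of rows where predict(a, b) matches the target."""
    hits = sum(1 for a, b, target in rows if predict(a, b) == target)
    return hits / len(rows)


acc_a_only = accuracy(rows, lambda a, b: a)         # uses feature a alone
acc_b_only = accuracy(rows, lambda a, b: b)         # uses feature b alone
acc_combined = accuracy(rows, lambda a, b: int(a != b))  # uses both together
print(acc_a_only, acc_b_only, acc_combined)
```

On this toy data each feature alone does no better than a coin flip, while the two together predict the target exactly.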