11. In a random forest, what type of data is used to train the ensemble of decision-tree base learners?
- Duplicated
- Unstructured
- Bootstrapped
- Sampled
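For reference, bootstrapping means drawing observations with replacement until the sample is as large as the original data, so some rows repeat and others are left out. The sketch below uses only the standard library; `bootstrap_sample` and the toy `rows` list are hypothetical names chosen for illustration, not part of any library.

```python
import random

def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement."""
    return [rng.choice(rows) for _ in rows]

rng = random.Random(0)   # fixed seed so the run is repeatable
rows = list(range(10))   # stand-in for ten training observations

# One bootstrap sample per base learner in the ensemble.
samples = [bootstrap_sample(rows, rng) for _ in range(3)]
for sample in samples:
    print(sorted(sample))
```

Each printed sample has ten entries, typically with some values repeated and some missing.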
12. Fill in the blank: When using a decision tree model, a data professional can use _____ to control the threshold below which nodes become leaves.
- min_samples_leaf
- max_features
- max_depth
- min_samples_split
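The two sample-count hyperparameters listed above are easy to confuse. The sketch below assumes the scikit-learn meanings: `min_samples_split` is the minimum number of samples a node needs before a split is attempted, and `min_samples_leaf` is the minimum number of samples each resulting child must keep. The helper `can_split` is hypothetical and greatly simplifies real tree building.

```python
def can_split(n_node, n_left, n_right, min_samples_split=2, min_samples_leaf=1):
    """Return True if a candidate split satisfies both size limits."""
    if n_node < min_samples_split:
        return False  # too few samples to attempt a split: the node stays a leaf
    # Each child must keep at least min_samples_leaf samples.
    return n_left >= min_samples_leaf and n_right >= min_samples_leaf

print(can_split(5, 3, 2, min_samples_split=6))    # False: node is below the split threshold
print(can_split(10, 9, 1, min_samples_leaf=2))    # False: right child would be too small
print(can_split(10, 5, 5, min_samples_split=6, min_samples_leaf=2))  # True
```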
Test your knowledge: Boosting
13. Fill in the blank: The supervised learning technique boosting builds an ensemble of weak learners _____, then aggregates their predictions.
- in parallel
- repeatedly
- randomly
- sequentially
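To make the idea of building weak learners one after another concrete, here is a minimal sketch of boosting for squared-error regression with one-split "stumps". Each round fits a stump to the residuals left by the rounds before it. All function names, the toy data, and the settings `n_rounds=20` and `learning_rate=0.5` are assumptions for illustration; production libraries add many refinements.

```python
def fit_stump(xs, targets):
    """Fit a one-split regressor: returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[1:]:  # skip the smallest x so both sides are non-empty
        left = [v for x, v in zip(xs, targets) if x < t]
        right = [v for x, v in zip(xs, targets) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]

def stump_predict(stump, x):
    t, lm, rm = stump
    return lm if x < t else rm

def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Build stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)  # depends on all earlier rounds
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return preds, stumps

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [x * x for x in xs]
baseline = [sum(ys) / len(ys)] * len(xs)  # predict the mean everywhere
boosted, stumps = boost(xs, ys)
print(mse(ys, baseline), mse(ys, boosted))
```

The second printed error is smaller than the first, because every round corrects part of what the previous rounds got wrong.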
14. When using a gradient boosting machine (GBM) modeling technique, which term describes a model’s ability to predict new values that fall outside of the range of values in the training data?
- Learning rate
- Cross-validation
- Grid search
- Extrapolation
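Tree-based models predict a constant value per leaf, which is why predictions outside the training range are a known weak spot. The sketch below grows a single one-feature regression tree (a simplification: a GBM sums many such trees, so the same limit applies) and queries it far outside the training data. `build_tree`, `predict`, and the toy data are hypothetical.

```python
def sse(points):
    """Sum of squared errors of the y values around their mean."""
    mean = sum(y for _, y in points) / len(points)
    return sum((y - mean) ** 2 for _, y in points)

def build_tree(points):
    """Grow a 1-D regression tree until every leaf holds a single x value."""
    xs = sorted({x for x, _ in points})
    if len(xs) < 2:
        return sum(y for _, y in points) / len(points)  # leaf: mean target

    def cost(t):
        return (sse([p for p in points if p[0] < t])
                + sse([p for p in points if p[0] >= t]))

    t = min(xs[1:], key=cost)  # threshold with the lowest total squared error
    left = [p for p in points if p[0] < t]
    right = [p for p in points if p[0] >= t]
    return (t, build_tree(left), build_tree(right))

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x < t else right
    return tree

train = [(x, 2.0 * x) for x in range(1, 11)]  # y = 2x on x = 1..10
tree = build_tree(train)
print(predict(tree, 10), predict(tree, 1000))  # same value for both inputs
print(predict(tree, 1), predict(tree, -5))     # same value for both inputs
```

Although the underlying relationship is y = 2x, the tree returns the value of its outermost leaf for any input beyond the training range.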
15. When using the hyperparameter min_child_weight, a tree will not split a node if it results in any child node with less weight than what is specified. What happens to the node instead?
- It becomes a root.
- It becomes a leaf.
- It gets deleted.
- It duplicates itself to become another node.
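As background on what "weight" means here: XGBoost documents `min_child_weight` as the minimum sum of instance weight (hessian) needed in a child. The sketch below assumes a logistic loss, where each sample's hessian is p(1 - p) for its current predicted probability p. The helper names are hypothetical and this is only the acceptance check, not a full tree builder.

```python
def hessian_logistic(p):
    """Second derivative of log loss with respect to the raw score."""
    return p * (1.0 - p)

def split_allowed(left_probs, right_probs, min_child_weight=1.0):
    """Accept a split only if both children carry enough total hessian."""
    left_weight = sum(hessian_logistic(p) for p in left_probs)
    right_weight = sum(hessian_logistic(p) for p in right_probs)
    return left_weight >= min_child_weight and right_weight >= min_child_weight

confident = [0.99, 0.98, 0.01]            # hessians near zero, total about 0.04
uncertain = [0.5, 0.5, 0.5, 0.5, 0.5]     # 0.25 each, total 1.25

print(split_allowed(uncertain, uncertain))  # True: both children weigh 1.25
print(split_allowed(confident, uncertain))  # False: one child is too light
```

With a squared-error loss every hessian equals 1, so the threshold then behaves like a minimum sample count per child.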
Weekly challenge 4
16. Fill in the blank: In tree-based learning, a decision tree’s _____ represent observations about an item.
- roots
- splits
- leaves
- branches
17. Which of the following statements accurately describe decision trees? Select all that apply.
- Decision trees represent solutions to a given problem based on the possible outcomes of related choices.
- Decision trees are susceptible to overfitting.
- Decision trees are equally effective at predicting both existing and new data.
- Decision trees require no assumptions regarding the distribution of underlying data.
18. Which section of a decision tree is where the final prediction is made?
- Decision node
- Root node
- Leaf node
- Split
Shuffle Q/A 2
19. In a random forest model, which hyperparameter specifies the number of attributes that each tree selects randomly from the training data to determine its splits?
- Max depth
- Learning rate
- Number of estimators
- Max features
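Random attribute selection can be sketched in a few lines. In scikit-learn, `max_features` limits how many features are considered when looking for the best split; the helper `candidate_features` and the feature names below are hypothetical and only illustrate the random draw, not the split search itself.

```python
import random

def candidate_features(feature_names, max_features, rng):
    """Pick max_features distinct features at random for one split search."""
    return rng.sample(feature_names, max_features)

rng = random.Random(42)  # fixed seed so the run is repeatable
features = ["age", "tenure", "balance", "num_products", "salary", "is_active"]

# A fresh random subset is drawn each time a split is evaluated.
for split in range(3):
    print(split, candidate_features(features, max_features=2, rng=rng))
```

Restricting each split to a random subset makes the trees less similar to one another, which is the point of the randomization.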
20. What process uses different portions of the data to test and train a model across several iterations?
- Grid search
- Cross-validation
- Model validation
- Proportional verification
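Rotating which portion of the data is held out can be sketched as a k-fold split. The generator below is a minimal standard-library illustration; `k_fold_indices` is a hypothetical name, and real projects would normally shuffle the data first and use a library splitter.

```python
def k_fold_indices(n_samples, k):
    """Yield (training, validation) index lists, one pair per fold."""
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

for train_idx, val_idx in k_fold_indices(10, 5):
    print("validate on", val_idx, "train on", sorted(train_idx))
```

Across the five iterations every observation is used for validation exactly once and for training four times.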