Machine Learning with Python (IBM Coursera) Quiz Answers: Week 3

Practice Quiz: Classification

1. Which one is TRUE about the kNN algorithm?

  • kNN calculates similarity by measuring how close the two data points’ response values are.
  • The most similar point in kNN is the one with the smallest distance averaged across all normalized features.
  • kNN is a classification algorithm that takes a bunch of unlabelled points and uses them to learn how to label other points.
  • kNN algorithm can be used to estimate values for a continuous target.
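The last option touches on a point worth seeing concretely: kNN can also estimate a continuous target by averaging the nearest neighbours' values instead of taking a vote. Below is a minimal pure-Python sketch (not from the course; the function name and toy data are illustrative):

```python
import math

def knn_predict_continuous(train, new_point, k=3):
    """Estimate a continuous target as the average of the k nearest
    neighbours' response values (plain Euclidean distance)."""
    by_distance = sorted(train, key=lambda p: math.dist(p[0], new_point))
    nearest = by_distance[:k]
    return sum(value for _, value in nearest) / k

# Toy data: the target roughly follows y = 2x
train = [((1.0,), 2.0), ((2.0,), 4.0), ((3.0,), 6.0), ((10.0,), 20.0)]
print(knn_predict_continuous(train, (2.5,), k=3))  # averages 2.0, 4.0, 6.0 -> 4.0
```

In a real project you would typically normalize the features and use something like scikit-learn's `KNeighborsRegressor`, but the idea is the same.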

2. If the information gain of the tree by using attribute A is 0.3, what can we infer?

  • The entropy of a tree before split minus weighted entropy after split by attribute A is 0.3.
  • Entropy in the decision tree increases by 0.3 if we make this split.
  • Compared to attribute B with 0.65 information gain, attribute A should be selected first for splitting.
  • By making this split, we increase the randomness in each child node by 0.3.
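The first option describes exactly how information gain is computed: entropy before the split minus the weighted entropy of the child nodes. A small pure-Python sketch (the data here is made up for illustration and happens to land near 0.3):

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def information_gain(parent, children):
    """Entropy before the split minus weighted entropy after the split."""
    n = len(parent)
    weighted = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted

parent = ["yes"] * 5 + ["no"] * 5            # entropy = 1.0 bit
children = [["yes"] * 4 + ["no"],            # each child is purer,
            ["yes"] + ["no"] * 4]            # so entropy drops
print(information_gain(parent, children))    # ~0.278
```

A higher information gain means the split reduced entropy more, which is why an attribute with gain 0.65 would be preferred over one with gain 0.3.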

3. When we have a value of K for KNN that’s too small, what will the model most likely look like?

  • The model will have high out-of-sample accuracy.
  • The model will be highly complex and capture too much noise.
  • The model will have high accuracy on the test set.
  • The model will be overly simple and will not capture enough noise.
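The effect of a too-small K is easy to demonstrate: with K=1 the prediction chases the single nearest point, so one mislabelled "noise" point flips nearby predictions, while a larger K votes it down. A minimal sketch (toy data and function name are illustrative, not from the course):

```python
import math
from collections import Counter

def knn_classify(train, x, k):
    """Classify x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# One mislabelled "noise" point (1.5, "B") sits inside the "A" region
train = [((0.0,), "A"), ((1.0,), "A"), ((2.0,), "A"),
         ((1.5,), "B"),                               # noise
         ((8.0,), "B"), ((9.0,), "B"), ((10.0,), "B")]

print(knn_classify(train, (1.4,), k=1))  # "B": k=1 follows the noisy point
print(knn_classify(train, (1.4,), k=5))  # "A": a larger k outvotes the noise
```

This is the overfitting the question points at: a tiny K makes the decision boundary highly complex and sensitive to individual noisy points.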

Graded Quiz: Classification

4. What can we infer about our kNN model when the value of K is too big?

  • The model will be too complex and not interpretable.
  • The training accuracy will be high, while the out-of-sample accuracy will be low.
  • The model is overly generalized and underfitted to the data.
  • The model will capture a lot of noise as a result of overfitting.

5. When splitting data into branches for a decision tree, what kind of feature is favored and chosen first?

  • The feature that splits the data equally into groups.
  • The feature with the greatest number of categories.
  • The feature that increases entropy in the tree nodes.
  • The feature that increases purity in the tree nodes.

6. What is the relationship between entropy and information gain?

  • High entropy and high information gain is desired.
  • When information gain decreases, entropy decreases.
  • High entropy and low information gain is desired.
  • When information gain increases, entropy decreases.

7. Predicting whether a customer responds to a particular advertising campaign or not is an example of what?

  • Classification problem
  • Machine learning
  • Regression
  • None of the above

8. For a new observation, how do we predict its response value (categorical) using a KNN model with k=5?

  • Take the majority vote among 5 points that are the most similar to each other.
  • Form 5 clusters and assign the new observation to the most similar cluster, taking the mean value as prediction.
  • Take majority vote among 5 points whose features are closest to the new observation.
  • Take the average among 5 points whose features are closest to the new observation.
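For a categorical target the prediction mechanics are: find the 5 training points whose features are closest to the new observation, then take the majority vote of their labels. A short sketch under those assumptions (toy data is illustrative):

```python
import math
from collections import Counter

def predict_class(train, new_obs, k=5):
    """Majority vote among the k training points whose features are
    closest (by Euclidean distance) to the new observation."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], new_obs))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0.5,), "no"), ((1.0,), "yes"), ((1.2,), "yes"),
         ((1.4,), "no"), ((1.6,), "yes"), ((5.0,), "no")]
print(predict_class(train, (1.3,), k=5))  # 3 "yes" vs 2 "no" -> "yes"
```

Averaging the neighbours' values (the last option) is what kNN does for a *continuous* target, not a categorical one.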