The Nuts and Bolts of Machine Learning: Coursera Week 3 Quiz Answers

Test your knowledge: Explore unsupervised learning and K-means

1. Fill in the blank: K-means is an unsupervised partitioning algorithm used to organize _____ data into clusters.

  • unlabeled
  • presorted
  • subcategorized
  • hierarchical

2. In k-means, what term describes the point at which each cluster is defined?

  • Commonality
  • Centroid
  • Core
  • Coordinate

3. A data professional is iterating on certain tasks that will enable them to create a k-means model. They continue doing this until the algorithm converges. Which step of the model-building process does this scenario represent?

  • Step four
  • Step three
  • Step two
  • Step one
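The iterate-until-convergence loop described in this question can be sketched in plain NumPy. This is a minimal illustration rather than the course's own code: the function name `kmeans_sketch`, the random choice of starting centroids, and the toy data are all assumptions made for the example.

```python
import numpy as np

def kmeans_sketch(X, k, max_iter=100, seed=0):
    """Minimal k-means loop (hypothetical helper); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Place k initial centroids by picking k distinct observations at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Assign each observation to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Recalculate each centroid as the mean of its assigned observations
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move: the algorithm has converged.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Two well-separated toy groups (made-up data for illustration).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = kmeans_sketch(X, k=2)
print(labels)
```

The assign and recalculate operations repeat inside the loop, and the loop exits only when the centroids stop changing.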

Test your knowledge: Evaluate a K-means model

4. In a k-means model, which evaluation metric represents the sum of the squared distances between each observation and its closest centroid?

  • Silhouette score
  • SMAPE
  • F1-score
  • Inertia
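As a rough check on the definition in this question, the sum of squared distances to the closest centroid can be computed by hand and compared with the `inertia_` attribute of a fitted scikit-learn `KMeans` model. The toy data and variable names below are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data: two small, well-separated groups.
X = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
              [8.0, 8.0], [8.0, 9.0], [9.0, 8.0]])
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Squared distance from every observation to every centroid.
sq_dist = ((X[:, None, :] - model.cluster_centers_[None, :, :]) ** 2).sum(axis=2)
# Keep only the distance to the closest centroid, then sum over observations.
manual_inertia = sq_dist.min(axis=1).sum()

print(manual_inertia, model.inertia_)
```

The two printed values should agree up to floating-point error.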

5. Fill in the blank: A data professional may use the _____ method to choose an optimal value for k. This is a tool for identifying the point at which the decrease in inertia starts to level off.

  • partitioning
  • elbow
  • clustering
  • unsupervised learning
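To make "the decrease in inertia starts to level off" concrete, one option is to look at how much inertia drops each time k increases by one. The inertia values below are made-up numbers chosen only to illustrate the pattern; they do not come from any real model.

```python
# Hypothetical inertia values for k = 1..5 (illustrative only, not real results).
k_values = [1, 2, 3, 4, 5]
inertias = [1000.0, 400.0, 150.0, 130.0, 120.0]

# Drop in inertia gained by moving from each k to the next.
drops = [inertias[i] - inertias[i + 1] for i in range(len(inertias) - 1)]

for k, drop in zip(k_values[1:], drops):
    print(f"k={k}: inertia fell by {drop}")
```

In this invented series the drops shrink sharply after k = 3, which is the kind of bend the method looks for. Real data may show no such clear bend.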

6. A data professional is using scikit-learn to create a k-means model. Which attribute will enable them to get the cluster assignments?

  • Inertia
  • Labels
  • Fit
  • Silhouette score
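For reference, a fitted scikit-learn `KMeans` estimator stores one cluster index per observation in its `labels_` attribute. A short sketch with made-up data:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data: two small, well-separated groups.
X = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
              [8.0, 8.0], [8.0, 9.0], [9.0, 8.0]])
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# One integer cluster index per row of X, in the same order as X.
assignments = model.labels_
print(assignments)
```

Which group receives index 0 and which receives index 1 is arbitrary, so only the grouping itself is meaningful.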

Weekly challenge 3

7. Which of the following statements correctly describe key aspects of k-means? Select all that apply.

  • K-means is an unsupervised partitioning algorithm.
  • The value of k is a standard that never changes.
  • K-means clusters are defined by a central point, called a centroid.
  • To avoid poor clustering, data professionals run a k-means model with different starting positions for the centroids.
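Running the model from several starting positions can be sketched as follows. The loop refits with a different random seed each time and keeps the lowest inertia; scikit-learn's `n_init` parameter performs the same kind of restart internally. The synthetic data and the choice of five restarts are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data: three Gaussian groups (made up for this example).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(30, 2)) for c in (0.0, 5.0, 10.0)])

# Fit once per seed, each time from a different random set of starting centroids.
inertias = []
for seed in range(5):
    m = KMeans(n_clusters=3, init="random", n_init=1, random_state=seed).fit(X)
    inertias.append(m.inertia_)
best_inertia = min(inertias)

# n_init=5 asks scikit-learn to do the restarts itself and keep the best run.
multi = KMeans(n_clusters=3, init="random", n_init=5, random_state=0).fit(X)
print(inertias, best_inertia, multi.inertia_)
```

If some seeds print a noticeably higher inertia than others, those runs settled on a poorer clustering, which is the reason for restarting.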

8. A data professional is recalculating the centroid of each cluster. Which step of the model-creation process are they working in?

  • Step four
  • Step one
  • Step three
  • Step two

9. Fill in the blank: In order to evaluate the intracluster space in a k-means model, a data professional uses the inertia metric. This is the _____ of the squared distances between each observation and its nearest centroid.

  • difference
  • sum
  • average
  • ratio

10. When creating a k-means model, what does it mean when an observation has a silhouette coefficient close to negative one?

  • The observation may be in the wrong cluster.
  • The observation is suitably within its own cluster and well separated from other clusters.
  • The observation is on the boundary between clusters.
  • The observation is in the correct cluster.
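The silhouette coefficient ranges from -1 to 1 and can be inspected per observation with scikit-learn's `silhouette_samples`. The sketch below uses made-up data in which one point is deliberately given the label of the far-away group, to show what a strongly negative value looks like.

```python
import numpy as np
from sklearn.metrics import silhouette_samples

# Made-up data: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
# The third point, (1, 0), is deliberately labeled as part of the far group.
labels = np.array([0, 0, 1, 1, 1, 1])

# One silhouette coefficient per observation.
scores = silhouette_samples(X, labels)
print(scores.round(2))
```

The mislabeled point sits far from its assigned cluster and close to the other one, so its coefficient comes out strongly negative, while the correctly placed points in the far group stay positive.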

11. Which Python function fits a k-means model for multiple values of k by calculating the inertia for each value, appending it to a list, and returning that list?

  • silhouette score
  • cluster_image
  • k-means inertia
  • labels
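A function of the kind this question describes might look like the sketch below. The name `kmeans_inertia`, its parameter names, and the `n_init` and `random_state` settings are assumptions made for illustration; the course's own implementation is not shown in this document.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_inertia(num_clusters, x_vals):
    """Fit one k-means model per value of k and return the list of inertias.

    Hypothetical helper: name and signature are assumed for this example.
    """
    inertia = []
    for k in num_clusters:
        kms = KMeans(n_clusters=k, n_init=10, random_state=42)
        kms.fit(x_vals)
        # Record the inertia of the fitted model for this k.
        inertia.append(kms.inertia_)
    return inertia

# Synthetic data: three Gaussian groups (made up for this example).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(30, 2)) for c in (0.0, 5.0, 10.0)])

k_range = [2, 3, 4, 5, 6]
inertias = kmeans_inertia(k_range, X)
print(inertias)
```

Plotting the returned list against `k_range` as a line plot gives the curve that the elbow method inspects.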

12. Which of the following statements accurately describe the elbow method? Select all that apply.

  • When using the elbow method, data professionals aim to find the smoothest part of the curve.
  • The elbow method uses a line plot to visually compare the inertias of different models. 
  • There is not always an obvious elbow.
  • The sharpest bend in the curve is usually the model that will provide the most meaningful clustering of data.

13. A data professional is assigning each data point to its nearest centroid. Which step of the model-creation process are they working in?

  • Step one
  • Step three
  • Step four
  • Step two

14. Fill in the blank: In order to evaluate the _____ space in a k-means model, a data professional uses the inertia metric. This is the sum of the squared distances between each observation and its nearest centroid.

  • converged
  • midpoint
  • intracluster
  • intercluster

15. When creating a k-means model, what does it mean when an observation has a silhouette coefficient of zero?

  • The observation is in an appropriate cluster.
  • The observation may be in the wrong cluster.
  • The observation is suitably within its own cluster and well separated from other clusters.
  • The observation is on the boundary between clusters.

16. Which of the following statements correctly describe key aspects of k-means? Select all that apply.

  • The k-means clustering process has four steps that repeat until the model converges.
  • K-means organizes unlabeled data into clusters.
  • The position of the k-means centroid is the center of the cluster, also known as the mathematical mean.
  • K-means is a supervised partitioning algorithm.

17. Which of the following statements accurately describe the elbow method? Select all that apply.

  • There is always an obvious elbow.
  • The elbow method uses a line plot to visually compare the inertias of different models. 
  • When using the elbow method, data professionals find the sharpest bend in the curve.
  • The elbow method helps data professionals decide which clustering gives the most meaningful model.

18. A data professional is choosing the number of centroids to use in a k-means model and placing them in the data space. Which step of the model-creation process are they working in?

  • Step one
  • Step three
  • Step four
  • Step two

19. Fill in the blank: In order to evaluate the intracluster space in a k-means model, a data professional uses the _____ metric. This is the sum of the squared distances between each observation and its nearest centroid.

  • convergence
  • inertia
  • spread
  • silhouette score

20. Which metric would a data professional use to better understand the intracluster distance between data points and their centroids?

  • k-means inertia
  • silhouette score
  • cluster_image
  • labels
