11. What do you call data that does not fit within the context of the use case?

  • Relevant data
  • Irrelevant data
  • Missing data
  • Duplicate data

12. What does a typical data wrangling workflow include?

  • Validating the quality of the transformed data
  • Recognizing patterns
  • Predicting probabilities
  • Using mathematical techniques to identify correlations in data

13. OpenRefine is an open-source tool that allows you to:

  • Automatically detect schemas, data types, and anomalies
  • Enforce applicable data governance policies automatically
  • Transform data into a variety of formats such as TSV, CSV, XLS, XML, and JSON
  • Use add-ins such as Microsoft Power Query to identify issues and clean data

14. What is one of the steps in a typical data cleaning workflow?

  • Clustering data
  • Building classification models
  • Inspecting data to detect issues and errors
  • Establishing relationships between data events

15. When you’re combining rows of data from multiple source tables into a single table, what kind of data transformation are you performing?

  • Joins
  • Normalization
  • Denormalization
  • Unions

16. When you detect a value in your data set that is vastly different from other observations in the same data set, what would you report that as?

  • Missing value
  • Irrelevant data
  • Syntax error
  • Outlier

