Test your knowledge: Input validation

11. Data professionals use input validation to ensure data is complete, error-free, and of high quality.

  • True
  • False

12. Fill in the blank: If a dataset lacks sufficient information to answer a business question, the process of _____ makes it possible to augment that data by adding values from other datasets.

  • summing
  • sampling
  • joining
  • blending

13. In which phase of the PACE workflow would a data professional perform the majority of the data-validation process?

  • Execute
  • Analyze
  • Plan
  • Construct

Weekly challenge 3

14. Which of the following terms are used to describe missing data? Select all that apply.

  • Blank
  • NaN
  • N/A
  • Zero

15. Which of the following strategies might a data professional consider when handling missing data? Select all that apply.

  • Use their best judgment to add in values themselves.
  • Change the missing values to zeros.
  • Create a NaN category.
  • Delete the missing values.
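For reference, here is a minimal sketch of how several of these strategies look in pandas. The data frame, its column names, and the "Missing" label are all invented for illustration. The sketch only shows the mechanics and does not indicate which strategy suits a given dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one categorical column and one numeric column,
# each with a single missing value.
df = pd.DataFrame({
    "region": ["north", None, "south", "north"],
    "sales": [120.0, 95.0, np.nan, 80.0],
})

# Delete every row that contains at least one missing value.
dropped = df.dropna()

# Replace missing numeric values with zero.
zero_filled = df.fillna({"sales": 0})

# Give missing categories their own explicit label.
labeled = df.fillna({"region": "Missing"})

print(dropped)
print(zero_filled)
print(labeled)
```

Each call returns a new data frame and leaves df unchanged, so the strategies can be compared side by side before committing to one.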

16. A data professional writes the following code:

df_joined = df.merge(df_zip,
                     how='left',
                     on=['date','center_point_geom'])

df_joined.head()

Which section of the code indicates the data frame to be merged with the dataset df?

  • center_point_geom
  • df_joined.head()
  • how='left'
  • df_zip
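To experiment with a call like the one in this question, the following sketch builds two small data frames and merges them. The column names date and center_point_geom come from the question. The sample values and the number_of_strikes and zip_code columns are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample data; only the key column names come from the question.
df = pd.DataFrame({
    "date": ["2018-08-01", "2018-08-02", "2018-08-03"],
    "center_point_geom": ["POINT(-101.5 24.7)", "POINT(-82.5 22.9)", "POINT(-84.1 22.4)"],
    "number_of_strikes": [16, 16, 41],
})
df_zip = pd.DataFrame({
    "date": ["2018-08-01", "2018-08-02"],
    "center_point_geom": ["POINT(-101.5 24.7)", "POINT(-82.5 22.9)"],
    "zip_code": ["00001", "00002"],
})

# Left merge: every row of df is kept; columns from df_zip are added
# where both key columns match, and left as missing where they do not.
df_joined = df.merge(df_zip, how="left", on=["date", "center_point_geom"])

print(df_joined.head())
```

In this sketch the third row of df has no match in df_zip, so its zip_code is missing after the merge.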

17. What tasks could the pandas function pd.isnull() be used for? Select all that apply.

  • To delete all of the values from a data frame
  • To identify when a value is missing from a data frame
  • To pull all of the missing values from a data frame
  • To change all values to nulls in a data frame
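As a quick way to see what pd.isnull() returns, the sketch below applies it to a small data frame. The data frame and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "city": ["Austin", None, "Boise"],
    "temp_c": [31.0, 28.5, np.nan],
})

# Boolean data frame of the same shape: True where a value is missing.
mask = pd.isnull(df)

# Count of missing values in each column.
missing_per_column = mask.sum()

# Rows that contain at least one missing value.
rows_with_missing = df[mask.any(axis=1)]

print(mask)
print(missing_per_column)
print(rows_with_missing)
```

Note that pd.isnull() only reports on the data; df itself is not modified.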

18. What type of outliers are values that are completely different from the overall data group and have no association with any other outliers?

  • Collective outliers
  • Global outliers
  • Dissimilar outliers
  • Contextual outliers

Shuffle Q/A 2

19. Fill in the blank: A data professional may work with categorical data by using _____, which is a data-transformation technique where each category is assigned a unique number instead of a qualitative value.

  • data blending
  • partitioning
  • label encoding
  • aliasing
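The technique described in this question, assigning each category a unique number, can be sketched in pandas as follows. The size column and its values are invented for illustration, and using the category dtype is one possible approach among several.

```python
import pandas as pd

# Hypothetical categorical column.
df = pd.DataFrame({"size": ["small", "large", "medium", "small"]})

# Convert to the pandas category dtype, then take the integer code
# assigned to each category. Codes follow the sorted category order,
# so here: large -> 0, medium -> 1, small -> 2.
df["size_code"] = df["size"].astype("category").cat.codes

print(df)
```

Because the codes follow alphabetical order rather than any real-world ordering, the numbers are identifiers only and should not be read as small < medium < large.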

20. What type of data visualization shows the concentration of values between two data points by illustrating their magnitude with two colors?

  • Density map
  • Heat map
  • Scatter plot
  • Treemap
