Data Analysis with R Coursera Week 2 Quiz Answers

Practice Quiz

1. The process of converting or mapping data from the initial raw form to another format to prepare it for further analysis goes by several names. What is this process commonly called? Select three answers.

Answers

Data pre-processing

Data wrangling

Data formatting
Data cleaning

2. What is the result of the following statement?

sub_airline %>% map(~sum(is.na(.)))

Answers

Counts the missing values in all columns in the dataset.
Counts all instances of zero in all columns in the dataset.
Counts all instances of NA in all columns in the dataset.
Counts the missing values and returns the result only for columns in the dataset that have missing values.
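To see what the statement in question 2 returns, here is a minimal sketch on a made-up data frame. The name sub_airline comes from the question, but the columns and values below are hypothetical, and the purrr package is assumed to be installed.

```r
library(purrr)  # provides map() and re-exports %>%

# Hypothetical stand-in for sub_airline; the real dataset is not shown in the quiz.
sub_airline <- data.frame(
  ArrDelay = c(5, NA, 12, NA),
  DepDelay = c(0, 3, NA, 7),
  Carrier  = c("AA", "DL", "AA", "UA")
)

# For every column, count how many entries are NA.
na_counts <- sub_airline %>% map(~ sum(is.na(.)))
print(na_counts)
```

The result is a list with one count per column, including columns that have no missing values. The 0 in DepDelay is not counted, because zero is not NA.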

3. Which functions do you use together to correct data types in all columns of your dataset? Select two answers.

Answers

mutate_if()

sapply()
mutate()
mutate_all()
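As an illustration of correcting types across a whole data frame, the sketch below shows one possible pairing of these functions. The data frame raw and its columns are hypothetical, dplyr is assumed to be installed, and the choice of type.convert and as.factor as the conversion functions is an assumption, not something the quiz specifies.

```r
library(dplyr)

# Hypothetical data frame in which every column was read in as text.
raw <- data.frame(
  flights = c("10", "25", "7"),
  carrier = c("AA", "DL", "UA"),
  stringsAsFactors = FALSE
)

fixed <- raw %>%
  mutate_all(type.convert, as.is = TRUE) %>%  # let R guess a type for every column
  mutate_if(is.character, as.factor)          # then turn remaining text into factors

str(fixed)
```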

4. Which data normalization technique divides each value by the maximum value for that variable, resulting in new values that range between 0 and 1?

Answers

Z-score
Min-max
Simple feature scaling
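The three techniques named in question 4 can be compared side by side in base R. This is a minimal sketch on a hypothetical vector x, and the variable names are chosen only for illustration.

```r
# Hypothetical non-negative sample values.
x <- c(10, 20, 40)

# Simple feature scaling: divide each value by the maximum.
simple_scaled <- x / max(x)

# Min-max: subtract the minimum, then divide by the range.
min_max <- (x - min(x)) / (max(x) - min(x))

# Z-score: subtract the mean, then divide by the standard deviation.
z_score <- (x - mean(x)) / sd(x)

print(simple_scaled)
print(min_max)
print(z_score)
```

Note that dividing by the maximum keeps results between 0 and 1 only when the values are non-negative, as they are here.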

5. With data binning, observations are often organized into defined intervals called quartiles. Which quartile is the median of the dataset?

Answers

4th quartile
3rd quartile
1st quartile
2nd quartile
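The relationship between quartiles and the median can be checked directly in base R. The values in x below are hypothetical.

```r
# Hypothetical sample values.
x <- c(3, 7, 8, 5, 12, 14, 21, 13, 18)

# quantile() reports the cut points at 0%, 25%, 50%, 75% and 100% by default.
q <- quantile(x)
print(q)

# Compare the 50% cut point with the median of the same values.
second_cut_point <- unname(q["50%"])
print(second_cut_point == median(x))
```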

Graded Quiz

6. You want to access the “Date” column of a data frame called sales_data so you can perform an operation on it. What is the correct way to refer to this column?

Answers

sales_data.Date
sales_data#Date
sales_data$Date
sales_data%Date
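A short sketch of pulling a column out of a data frame and operating on it. The name sales_data and the Date column come from the question; the rows and the Amount column are hypothetical.

```r
# Hypothetical contents for sales_data.
sales_data <- data.frame(
  Date   = as.Date(c("2024-01-05", "2024-02-10")),
  Amount = c(100, 250)
)

# The $ operator extracts a single column as a vector.
dates <- sales_data$Date

# Example operation on the column: extract the two-digit month of each sale.
sale_months <- format(dates, "%m")
print(sale_months)
```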

7. Which function replaces missing values in a dataset?

Answers

drop_columns()
is.na()
drop_na()
replace_na()
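The sketch below contrasts detecting, dropping, and replacing missing values. The data frame flights and the replacement value 0 are hypothetical, and tidyr is assumed to be installed.

```r
library(tidyr)

# Hypothetical data with one missing delay.
flights <- data.frame(
  carrier = c("AA", "DL", "UA"),
  delay   = c(5, NA, 12)
)

# is.na() only reports where the missing values are.
missing_flags <- is.na(flights$delay)

# drop_na() removes rows that contain NA in the named column.
dropped <- drop_na(flights, delay)

# replace_na() keeps every row and fills NA with the supplied value.
filled <- replace_na(flights, list(delay = 0))

print(missing_flags)
print(dropped)
print(filled)
```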

8. You have a variable called “Status” that contains a status code in the format “error_type-severity_level”, for example “10-07”, and you want to reformat the column so that the “error_type” and “severity_level” are in different columns. What is the correct function to do this?

Answers

dataframe %>% mutate_if(Status, sep = "-",
into = c("error_type", "severity_level"))
dataframe %>% mutate_all(Status, sep = "-",
into = c("error_type", "severity_level"))
dataframe %>% separate(Status, sep = "-",
into = c("error_type", "severity_level"))
dataframe %>% sapply(Status, sep = "-",
into = c("error_type", "severity_level"))
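A minimal sketch of splitting the Status column on the hyphen with tidyr. The column name and the example value "10-07" come from the question; the second row is hypothetical, and tidyr is assumed to be installed.

```r
library(tidyr)  # provides separate() and re-exports %>%

# Hypothetical data frame using the status format from the question.
dataframe <- data.frame(Status = c("10-07", "22-01"))

# Split Status at the hyphen into two new columns; Status itself is dropped.
split_status <- dataframe %>%
  separate(Status, sep = "-",
           into = c("error_type", "severity_level"))

print(split_status)
```

Both new columns are character vectors by default, so a leading zero such as the one in "07" is preserved.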

9. What are two benefits of data normalization?

Answers

Minimizes the effects of outliers, which can otherwise have an outsized influence on the result.

Helps you better understand data distribution.
Brings data into a common standard of expression that allows you to make meaningful comparisons.
Enables a fair comparison between the different features and makes sure they have the same impact.

10. To visualize its distribution, binned data is often plotted in which of the following types of chart?

Answers

Histogram
Scatter plot
Bar chart
Line chart
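Binning and counting can be sketched in base R as follows. The delay values, the bin edges, and the labels are all hypothetical.

```r
# Hypothetical delay values in minutes.
delays <- c(2, 5, 7, 12, 15, 21, 22, 30, 41, 58)

# Three equal-width bins covering 0 to 60.
bin_edges <- seq(0, 60, by = 20)

# cut() assigns each observation to a labelled bin.
bins <- cut(delays, breaks = bin_edges,
            labels = c("low", "medium", "high"),
            include.lowest = TRUE)
bin_counts <- table(bins)
print(bin_counts)

# hist() computes the same per-bin counts; plot = FALSE skips the drawing step.
h <- hist(delays, breaks = bin_edges, plot = FALSE)
print(h$counts)
```

Calling plot(h) afterwards draws the chart from the same counts.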

11. Which of the following can you accomplish using the spread() function? Select two answers.

Answers

Reformat a categorical variable so that its contents are spread across two or more columns.
Convert categorical variables to dummy variables and assign the value of another variable to each category.

Reduce three variables to one.
Convert categorical variables to dummy variables.
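To show what spread() does to a categorical column, here is a minimal sketch with tidyr. The data frames and column names are hypothetical, and tidyr is assumed to be installed. Current tidyr documentation marks spread() as superseded by pivot_wider(), but it is still available.

```r
library(tidyr)  # provides spread() and re-exports %>%

# Hypothetical data: one categorical column (carrier) and one value column (delay).
flights <- data.frame(
  id      = 1:3,
  carrier = c("AA", "DL", "AA"),
  delay   = c(5, 8, 12)
)

# Each carrier becomes its own column, filled with the matching delay.
wide <- flights %>% spread(key = carrier, value = delay)
print(wide)

# With an indicator column of 1s and fill = 0, the new columns act as 0/1 indicators.
flagged <- data.frame(
  id      = 1:3,
  carrier = c("AA", "DL", "AA"),
  present = 1
)
dummies <- flagged %>% spread(key = carrier, value = present, fill = 0)
print(dummies)
```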
