Week 1: Organizing data to begin analysis
1. What is the goal of the analysis phase of the data analysis process?
- To describe data structures
- To generate new data
- To identify trends and relationships in data
- To make generalizations about data
2. During which of the four phases of analysis do you compare your data to external sources?
- Format and adjust data
- Transform data
- Get input from others
- Organize data
3. Which of the following actions might occur when transforming data? Select all that apply.
- Identify a pattern in your data
- Make calculations based on your data
- Recognize relationships in your data
- Eliminate irrelevant info from your data
4. Typically, a data analyst uses filters when they want to expand the amount of data they are working with.
- True
- False
5. A data analyst is sorting data in a spreadsheet. They select a specific collection of cells in order to limit the sorting to just specified cells. Which spreadsheet tool are they using?
- Sort Sheet
- Sort Range
- Limit Sort
- Limit Range
6. A data analyst sorts a spreadsheet range between cells D5 and M5. They sort in descending order by the third column, Column F. What is the syntax they are using?
- =SORT(D5:M5, C, TRUE)
- =SORT(D5:M5, 3, FALSE)
- =SORT(D5:M5, C, FALSE)
- =SORT(D5:M5, 3, TRUE)
7. You are querying a database that contains data about music. Each musical genre is given an ID number. You are only interested in data related to the genre with ID number 7. The genre IDs are listed in the genre_id column.
You write the SQL query below. Add a WHERE clause that will return only data about the genre with ID number 7.
Who is the composer listed in row 4 of your query result?
- Caetano Veloso
- Marisa Monte
- Lulu Santos
- Gilberto Gil
8. You are working with a database that contains invoice data about online music purchases. You are only interested in invoices sent to customers located in the city of Delhi. You want to sort the invoices by order total in ascending order. The order totals are listed in the total column.
You write the SQL query below. Add an ORDER BY clause that will sort the invoices by order total in ascending order.
What total appears in row 4 of your query result?
- 1.98
- 5.94
- 8.91
- 3.96
9. Fill in the blank: The _____ phase of the data analysis process includes organizing data, formatting and adjusting data, getting input from others, and transforming data by observing relationships between data points and making calculations.
- process
- prepare
- analyze
- act
10. During which of the four phases of analysis do you gather the relevant datasets into a usable structure for a project?
- Format and adjust data
- Get input from others
- Transform data
- Organize data
11. Fill in the blank: Sorting ranks data based on a specific _____ that you select.
- calculation
- observation
- metric
- model
12. A data analyst is sorting data in a spreadsheet. Which tool are they using if all of the data is sorted by the ranking of a specific sorted column and data across rows is kept together?
- Sort Sheet
- Sort Together
- Sort Rank
- Sort Document
13. A data analyst sorts a spreadsheet range between cells A1 and E50. They sort in descending order by the fourth column, Column D. What is the syntax they are using?
- =SORT(A1:E50, 4, FALSE)
- =SORT(A1:E50, 4, TRUE)
- =SORT(A1:E50, D, TRUE)
- =SORT(A1:E50, D, FALSE)
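The SORT questions in this set share one pattern, =SORT(range, sort_column, is_ascending), where sort_column is a 1-based position counted from the left edge of the range and is_ascending is TRUE or FALSE. As a rough illustration only, this Python sketch mimics that behavior on a small made-up range. The function name sort_range and the sample rows are assumptions for illustration, not part of any spreadsheet API.

```python
from typing import Any, List


def sort_range(rows: List[List[Any]], sort_column: int, is_ascending: bool) -> List[List[Any]]:
    """Mimic =SORT(range, sort_column, is_ascending) on a list of rows.

    sort_column is 1-based and counted from the left edge of the range,
    so it is a number rather than a column letter.
    """
    if not 1 <= sort_column <= len(rows[0]):
        raise ValueError("sort_column must be a position inside the range")
    # sorted() returns a new list; each row stays intact while rows are reordered.
    return sorted(rows, key=lambda row: row[sort_column - 1], reverse=not is_ascending)


# Hypothetical five-column range (think A1:E3); the fourth column holds scores.
data = [
    ["Ana", "North", 2021, 70, "x"],
    ["Ben", "South", 2020, 95, "y"],
    ["Cy", "East", 2022, 82, "z"],
]

descending = sort_range(data, 4, False)  # comparable to =SORT(A1:E3, 4, FALSE)
ascending = sort_range(data, 4, True)    # comparable to =SORT(A1:E3, 4, TRUE)
print([row[0] for row in descending])
print([row[0] for row in ascending])
```

Passing a column letter instead of a number has no equivalent in this sketch, which mirrors why the letter-based options in these questions do not describe valid syntax.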
14. You are querying a database that contains data about music. You are only interested in data related to the jazz musician Miles Davis. The names of the musicians are listed in the composer column.
You write the following SQL query, but it is incorrect. What is wrong with the query?
SELECT *
FROM Track
WHERE composer = Miles Davis
- Line 3 should be rewritten as WHERE composer is Miles Davis.
- Composer in line 3 should be capitalized.
- SELECT, FROM, and WHERE should not be capitalized.
- Miles Davis should be in double quotation marks.
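Question 14 turns on how SQL tells text values apart from other tokens in a query. The sketch here uses Python's built-in sqlite3 module with a made-up Track table (the track names and rows are invented for illustration) to show that the unquoted query fails to parse while a quoted value works. It uses single quotation marks, which standard SQL expects for text; the answer option refers to double quotation marks, which some SQL environments also accept.

```python
import sqlite3

# In-memory database with a hypothetical Track table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Track (name TEXT, composer TEXT)")
conn.executemany(
    "INSERT INTO Track VALUES (?, ?)",
    [
        ("Track A", "Miles Davis"),
        ("Track B", "Another Composer"),
        ("Track C", "Miles Davis"),
    ],
)

# Unquoted text: the parser cannot make sense of the bare words.
try:
    conn.execute("SELECT * FROM Track WHERE composer = Miles Davis")
    unquoted_ok = True
except sqlite3.OperationalError:
    unquoted_ok = False

# Quoted text: the value is treated as a string and compared to the column.
rows = conn.execute(
    "SELECT * FROM Track WHERE composer = 'Miles Davis'"
).fetchall()
print(unquoted_ok, len(rows))
```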
15. You are working with a database that contains invoice data about online music purchases. You are only interested in invoices sent to customers located in the city of Paris. You want to sort the invoices by order total in ascending order. The order totals are listed in the total column.
You write the SQL query below. However, this query is incorrect. What is wrong with it?
SELECT *
FROM invoice
WHERE billing_city = "Paris"
ORDER total
- SELECT, FROM, WHERE, and ORDER are capitalized.
- Line 4 is missing the text column = between ORDER and total.
- In line 3, "Paris" has quotation marks.
- Line 4 is missing the word BY between ORDER and total.
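For the invoice questions, it can help to see filtering and sorting run together. This is a minimal sketch using Python's built-in sqlite3 module and a hypothetical invoice table; the invoice_id column and all row values are assumptions made up for the example, not data from the course database. It runs the query with ORDER BY and then checks what happens when BY is left out.

```python
import sqlite3

# In-memory database with a hypothetical invoice table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (invoice_id INTEGER, billing_city TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [(1, "Paris", 8.91), (2, "Delhi", 1.98), (3, "Paris", 1.98), (4, "Paris", 5.94)],
)

# WHERE filters to one city; ORDER BY sorts ascending unless told otherwise.
query = """
SELECT *
FROM invoice
WHERE billing_city = 'Paris'
ORDER BY total
"""
totals = [row[2] for row in conn.execute(query)]
print(totals)

# Dropping BY leaves a bare ORDER keyword, which the parser rejects.
try:
    conn.execute(query.replace("ORDER BY", "ORDER"))
    order_without_by_ok = True
except sqlite3.OperationalError:
    order_without_by_ok = False
print(order_without_by_ok)
```

Adding DESC after the column name would reverse the order; ascending is the default when no direction is given.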
16. After collecting the relevant datasets for their analysis, a data analyst compares this data to external sources. In which of the four phases of analysis does this occur?
- Organize data
- Format and adjust data
- Transform data
- Get input from others
17. A data analyst working on a dataset is investigating possible relationships in the data. What phase of analysis is the analyst in?
- Format and adjust data
- Get input from others
- Transform data
- Organize data
18. A data analyst sorts a spreadsheet range between cells K9 and L20. They sort in ascending order by the first column, Column K. What is the syntax they are using?
- =SORT(K9:L20, K, TRUE)
- =SORT(K9:L20, K, FALSE)
- =SORT(K9:L20, 1, TRUE)
- =SORT(K9:L20, 1, FALSE)
19. You are querying a database that contains data about music. Each album is given an ID number. You are only interested in data related to the album with ID number 3. The album IDs are listed in the album_id column.
You write the following SQL query, but it is incorrect. What is wrong with the query?
SELECT *
FROM Track
WHERE album = 3
- In line 3, album should be album_id.
- SELECT, FROM, and WHERE should be capitalized.
- In line 3, album is not capitalized.
- Line 3 contains an equal sign.
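A WHERE clause can only refer to columns that exist in the table being queried. As a small illustration, this sketch builds a hypothetical Track table with Python's built-in sqlite3 module (the track_id column and the rows are invented for the example) and compares a query that names a nonexistent column with one that names album_id.

```python
import sqlite3

# In-memory database with a hypothetical Track table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Track (track_id INTEGER, name TEXT, album_id INTEGER)")
conn.executemany(
    "INSERT INTO Track VALUES (?, ?, ?)",
    [(1, "Track A", 3), (2, "Track B", 5), (3, "Track C", 3)],
)

# The table has no column called album, so the database reports an error.
try:
    conn.execute("SELECT * FROM Track WHERE album = 3")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Using the actual column name returns the matching rows.
rows = conn.execute("SELECT * FROM Track WHERE album_id = 3").fetchall()
print(error_message)
print(rows)
```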
20. In the data analysis process, which of the following refers to a phase of analysis? Select all that apply.
- Format data using sorts and filters
- Get input from others
- Organize data into understandable sections
- Visualize the data
21. A data analyst is collecting all the datasets that are relevant to their project. Which of the four phases of analysis is the data analyst in?
- Get input from others
- Organize data
- Format and adjust data
- Transform data
22. A data analyst investigating a dataset is interested in showing only data that matches given criteria. What is this known as?
- Sorting
- Modeling
- Measuring
- Filtering
23. You are working with a database that contains invoice data about online music purchases. You are only interested in invoices sent to customers located in the city of Delhi. You want to sort the invoices by order total in ascending order. The order totals are listed in the total column.
You write the SQL query below. However, this query is incorrect. What is wrong with it?
SELECT *
FROM invoice
WHERE billing_city = "Delhi"
ORDER BY order_total
- SELECT, FROM, WHERE, and ORDER BY are capitalized.
- In line 4, order_total should be total.
- In line 3, "Delhi" has quotation marks.
- Line 4 contains the word BY.
24. A data analyst chooses to rank the data based on a specific metric. What is the term for this action?
- Sorting
- Filtering
- Modeling
- Measuring
25. A data analyst investigates the data they’ve collected to look for patterns and relationships between the data. They also perform calculations based on the data. In which of the four phases of analysis does this occur?
- Format and adjust data
- Transform data
- Get input from others
- Organize data
26. A data analyst working on a very large dataset decides to narrow the scope of the data that they are working with in order to make the analysis more manageable. What can they use to narrow the amount of data?
- Modeling
- Sorting
- Filtering
- Measuring
27. A data analyst uses a function to sort a spreadsheet range between cells H1 and K65. They sort in ascending order by the first column, Column H. What is the syntax they are using?
- =SORT(H1:K65, 1, FALSE)
- =SORT(H1:K65, A, TRUE)
- =SORT(H1:K65, A, FALSE)
- =SORT(H1:K65, 1, TRUE)
28. You are querying a database that contains data about music. Each musical genre is given an ID number. You are only interested in data related to the genre with ID number 2. The genre IDs are listed in the genre_id column.
You write the following SQL query, but it is incorrect. What is wrong with the query?
SELECT *
FROM Track
WHERE composer = 2
- Line 3 contains an equal sign.
- Composer should be genre_id in line 3.
- Composer is not capitalized in line 3.
- SELECT, FROM, and WHERE are capitalized.
29. You are performing a calculation during your analysis of a dataset. Which phase of analysis are you in?
- Get input from others
- Format and adjust data
- Organize data
- Transform data
30. A data analyst is sorting spreadsheet data. They use the spreadsheet tool Sort Sheet. What does this tool do?
- It sorts all of the data in a spreadsheet by a specific sorted column.
- It sorts all of the data in a spreadsheet by the ranking of a specific sorted row.
- It allows the analyst to sort by a specific sorted row.
- It allows the analyst to sort a specific selection of cells only.
31. Which of the following tasks would a data analyst perform during the analyze phase of the data analysis process? Select all that apply.
- Getting input from others
- Organizing data into understandable sections
- Visualizing the data with charts
- Preparing a report for the stakeholders
32. You write the SQL query below. However, this query is incorrect. What is wrong with it?
SELECT *
FROM invoice
WHERE billing_city = "Chicago"
ORDER total
- Line 4 is missing column = between ORDER and total.
- SELECT, FROM, WHERE, and ORDER are capitalized.
- Line 4 is missing the BY between ORDER and total.
- In line 3, "Chicago" has quotation marks.
33. A data analyst is analyzing sales data to identify trends and relationships. What phase of the data analysis process does this describe?
- Analyze
- Act
- Process
- Prepare
34. A data analyst sorts a spreadsheet range between cells A15 and G71. They sort in ascending order by the second column, Column B. What is the syntax they are using?
- =SORT(A15:G71, 2, FALSE)
- =SORT(A15:G71, 2, TRUE)
- =SORT(A15:G71, B, FALSE)
- =SORT(A15:G71, B, TRUE)
35. A data analyst is using the spreadsheet tool Sort Range. What purpose does this tool serve?
- It allows the analyst to sort the data in a spreadsheet by a specific sorted column.
- It allows the analyst to sort a specific selection of cells only.
- It sorts all of the data in a spreadsheet by a specific sorted row.
- It sorts all of the data in a spreadsheet by the ranking of a specific sorted row.
36. A data analyst sorts a spreadsheet range between cells F19 and G82. They sort in ascending order by the second column, Column G. What is the syntax they are using?
- =SORT(F19:G82, B, FALSE)
- =SORT(F19:G82, 2, TRUE)
- =SORT(F19:G82, B, TRUE)
- =SORT(F19:G82, 2, FALSE)