SQL for Data Science with R: Coursera Answers, Week 5
Practice Quiz
1. You are preparing to analyze some sales data and have a large amount of information in an Excel spreadsheet. You have decided to convert the Excel spreadsheet to a relational database. What is your first step?
- Create the physical database objects.
- Create a logical and physical database design.
- Get the data into the database.
- Clean and split the data into load files.
2. What is a referential constraint?
- This occurs when there is a primary key/foreign key relationship between two tables.
- This occurs when there is a many-to-many relationship between tables.
- This occurs when one table can’t find a matching entry in another table.
- This occurs when there is a one-to-many relationship between tables.
3. Which RODBC function can you use to create a new table in a database from R?
- sqlQuery()
- sqlAddTable()
- CREATE TABLE
- You cannot do this from R. You need to use a database management tool to do this.
4. Why is the SQL LOAD command recommended over the IMPORT command for large amounts of data?
- The LOAD command is not recommended for large amounts of data. IMPORT is a much better choice.
- The LOAD command bypasses the database transactional logging mechanism, making it fast and efficient.
- The LOAD command performs SQL INSERT statements for a group of rows and commits them periodically, saving transaction process time.
- The LOAD command uses transactional logging, making it fast and efficient.
5. After you query a database, how do you load query results into a dataframe so you can perform data analysis? Select two answers.
- Use the sqlFetch() function.
- Use the sqlCommit() function.
- Use the sqlQuery() function.
- Use the sqlLoad() function.
Graded Quiz: Creating Database Objects and Querying Data from R
6. What are two reasons to map an existing data source, like pre-existing database tables, database dump files, or raw data, to a relational database design? Select two answers.
- Eliminate redundancy in the data.
- Address issues with data normalization.
- There’s no reason to do this. It just adds a lot of extra work with little benefit.
- Limit the number of concurrent users to just you so it’s secure.
7. You have two tables in your database design: Customers, which lists all your customers, and Orders, which lists all the sales transactions that your customers have made over the years. The two tables each have a field called Customer_ID. Which of the following correctly describes the relationship between the two tables?
- Both the Customer_ID field in the Customers table and the Customer_ID field in the Orders table are Foreign Keys.
- The Customer_ID field in the Customers table is a Primary Key and the Customer_ID field in the Orders table is a Foreign Key.
- The Customer_ID field in the Customers table is a Foreign Key and the Customer_ID field in the Orders table is a Primary Key.
- Both the Customer_ID field in the Customers table and the Customer_ID field in the Orders table are Primary Keys.
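The Customers and Orders tables from this question can be sketched in a small runnable example. This is only an illustration, assuming Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for the R and RODBC setup the quiz refers to; the Name, Order_ID, and Amount columns and the sample rows are hypothetical. It shows one customer owning several orders, and the database rejecting an order whose Customer_ID has no match in Customers.

```python
import sqlite3

# In-memory SQLite database (assumption: a stand-in for the course database).
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# One row per customer: Customer_ID uniquely identifies each row.
conn.execute("""
    CREATE TABLE Customers (
        Customer_ID INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL          -- hypothetical column
    )
""")

# Many rows per customer: Customer_ID here points back to Customers.
conn.execute("""
    CREATE TABLE Orders (
        Order_ID    INTEGER PRIMARY KEY,   -- hypothetical column
        Customer_ID INTEGER NOT NULL REFERENCES Customers (Customer_ID),
        Amount      REAL                   -- hypothetical column
    )
""")

conn.execute("INSERT INTO Customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO Orders VALUES (100, 1, 25.0)")
conn.execute("INSERT INTO Orders VALUES (101, 1, 40.0)")

# An order for a customer that does not exist violates the referential constraint.
try:
    conn.execute("INSERT INTO Orders VALUES (102, 99, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
print(orphan_rejected, order_count)
```

The same Customer_ID value may repeat in Orders but appears only once in Customers, which is what makes the relationship one-to-many.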
8. Which SQL DDL command can be used to add primary keys to an existing table in a database?
- CREATE TABLE
- SQL QUERY
- ALTER TABLE
- UPDATE TABLE
9. What is the recommended SQL command for loading small to medium amounts of data into a database?
- LOAD command
- IMPORT command
- LOAD or IMPORT (there is no difference)
10. What are two ways to limit database movement and increase performance when querying a database?
- Use stored procedures when possible.
- Use SQL functions provided by the database vendor whenever possible.
- Use the sqlQuery() or sqlFetch() commands.
- Load all the data directly into a dataframe to reduce the number of times you must revisit the database.
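The trade-off behind this question, how much data leaves the database, can be illustrated with a short sketch. It assumes Python's built-in sqlite3 module with an in-memory database rather than the R client the quiz refers to, and the Orders table and its 300 sample rows are hypothetical. Both approaches compute the same per-customer totals, but the second lets the database's own SUM function do the work and returns only the summary rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (Order_ID INTEGER PRIMARY KEY, Customer_ID INTEGER, Amount REAL)"
)
# Hypothetical sample data: 300 orders spread evenly over customers 1, 2, and 3.
sample_rows = [(i, i % 3 + 1, 10.0) for i in range(1, 301)]
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)", sample_rows)

# Approach A: fetch every row and aggregate on the client side.
all_rows = conn.execute("SELECT Customer_ID, Amount FROM Orders").fetchall()
client_totals = {}
for customer_id, amount in all_rows:
    client_totals[customer_id] = client_totals.get(customer_id, 0.0) + amount

# Approach B: aggregate inside the database and fetch only the result.
summary = conn.execute(
    "SELECT Customer_ID, SUM(Amount) FROM Orders "
    "GROUP BY Customer_ID ORDER BY Customer_ID"
).fetchall()
db_totals = dict(summary)

print(len(all_rows), "rows transferred by approach A")
print(len(summary), "rows transferred by approach B")
```

With a remote database, the difference in rows transferred becomes network traffic and client memory, which is why pushing the computation to the database tends to help as tables grow.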