Using Python to Access Web Data: Week 6 Assignment
JSON and the REST Architecture (Chapter 13)
REST, JSON, and APIs
1. Who is credited with getting the JSON movement started?
- Mitchell Baker
- Bjarne Stroustrup
- Douglas Crockford
- Pooja Sankar
2. Which of the following is true about an API?
- An API keeps servers running even when the power is off
- An API defines the header bits in the first 8 bits of all IP packets
- An API defines the pin-outs for the USB connectors
- An API is a contract that defines how to use a software library
3. Which of the following is a web services approach used by the Twitter API?
- REST
- XML-RPC
- CORBA
- SOAP
4. What kind of variable will you get in Python when the following JSON is parsed:
- A list with three items
- A dictionary with three key / value pairs
- A dictionary with one key / value pair
- Three tuples
- One Tuple
5. Which of the following is not true about the service-oriented approach?
- An application runs together all in one place
- Web services and APIs are used to transfer data between applications
- Standards are developed where many pairs of applications must work together
- An application makes use of the services provided by other applications
6. If the following JSON were parsed and put into the variable x,
what Python code would extract "Leah Culver" from the JSON?
- x["name"]
- x[0]["name"]
- x->name
- x["users"]["name"]
- x["users"][0]["name"]
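The indexing styles in the options above can be tried on a small stand-in document. This is only a sketch: the quiz's own JSON is not reproduced in these notes, so the structure below (a top-level "users" list of objects that each have a "name" key) and the second user are assumptions made for illustration.

```python
import json

# Hypothetical stand-in data; the quiz's actual JSON is not shown in these notes.
text = '''
{
  "users": [
    {"name": "Leah Culver", "status": "active"},
    {"name": "Another User", "status": "inactive"}
  ]
}
'''

x = json.loads(text)

# A JSON object is parsed into a dict and a JSON array into a list.
print(type(x).__name__)           # dict
print(type(x["users"]).__name__)  # list

# Walk down: dict key -> list index -> dict key.
first_name = x["users"][0]["name"]
print(first_name)
```

With this assumed shape, x["name"] would raise a KeyError and x["users"]["name"] would raise a TypeError, because a list cannot be indexed with a string.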
7. What library call do you make to append properly encoded parameters to the end of a URL like the following:
http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address=Ann+Arbor%2C+MI
- re.match()
- urllib.urlcat()
- re.encode()
- urllib.parse.urlencode()
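As a quick illustration of how a query string like the one in the question can be produced, here is a minimal sketch. The base URL and the encoded result come from the question; the parameter dictionary is an assumption about what was fed in.

```python
import urllib.parse

# Base URL taken from the question; the query string is built separately.
base_url = "http://maps.googleapis.com/maps/api/geocode/json?"

# Assumed input values: plain text with a space and a comma in the address.
params = {"sensor": "false", "address": "Ann Arbor, MI"}

# urlencode() turns the space into "+" and the comma into "%2C".
query = urllib.parse.urlencode(params)
full_url = base_url + query
print(full_url)
```

Building the query from a dictionary avoids hand-escaping spaces and punctuation, which is easy to get wrong.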
8. What happens when you exceed the Google geocoding API rate limit?
- You cannot use the API until you respond to an email that contains a survey question
- Your application starts to perform very slowly
- The API starts to perform very slowly
- You cannot use the API for 24 hours
9. What protocol does Twitter use to protect its API?
- PKI-HMAC
- SOAP
- SHA1-MD5
- OAuth
- WS*Security
- Java Web Tokens
10. What header does Twitter use to tell you how many more API requests you can make before you will be rate limited?
- content-type
- x-rate-limit-remaining
- x-max-requests
- x-request-count-down
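Reading a rate-limit header from a response might look like the sketch below. It assumes the header is named x-rate-limit-remaining (one of the options above) and uses a hand-built header object in place of a real HTTP response; the helper name remaining_requests and the value "14" are made up for illustration.

```python
from email.message import Message

def remaining_requests(headers):
    """Return the remaining request count as an int, or None if unavailable."""
    value = headers.get("x-rate-limit-remaining")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None

# Stand-in for response headers; no network call is made here.
fake_headers = Message()
fake_headers["x-rate-limit-remaining"] = "14"

print(remaining_requests(fake_headers))
print(remaining_requests(Message()))
```

The headers returned by urllib.request.urlopen() are an http.client.HTTPMessage, a subclass of email.message.Message, so the same .get() call should work on a live response.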
Extracting Data from JSON
In this assignment you will write a Python program somewhat similar to http://www.py4e.com/code3/json2.py. The program will prompt for a URL, read the JSON data from that URL using urllib, parse and extract the comment counts from the JSON data, and compute the sum of the numbers in the file. Enter the sum below.
We provide two files for this assignment. One is a sample file where we give you the sum for your testing and the other is the actual data you need to process for the assignment.
Sample data: http://py4e-data.dr-chuck.net/comments_42.json (Sum=2553)
Actual data: http://py4e-data.dr-chuck.net/comments_1913245.json (Sum ends with 29)
You do not need to save these files to your folder since your program will read the data directly from the URL.
Note: Each student will have a distinct data URL for the assignment, so only use your own data URL for analysis.
Data Format
The data consists of a number of names and comment counts in JSON as follows:
{
  comments: [
    {
      name: "Matthias"
      count: 97
    },
    {
      name: "Geomer"
      count: 97
    }
    ...
  ]
}
The closest sample code that shows how to parse JSON and extract a list is json2.py. You might also want to look at geoxml.py to see how to prompt for a URL and retrieve data from a URL.
Sample Execution
$ python3 solution.py
Enter location: http://py4e-data.dr-chuck.net/comments_42.json
Retrieving http://py4e-data.dr-chuck.net/comments_42.json
Retrieved 2733 characters
Count: 50
Sum: 2...
Turning in the Assignment
Enter the sum from the actual data and your Python code below:
Sum: (ends with 29)
Python code:
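import json
import urllib.error
import urllib.request

# Prompt for the data URL; fall back to the sample file on empty input.
url = input("Enter location: ")
if len(url) < 1:
    url = "http://py4e-data.dr-chuck.net/comments_42.json"
print("Retrieving", url)
try:
    # Read the response bytes, decode them to text, then parse the JSON.
    response = urllib.request.urlopen(url)
    data = response.read().decode()
    json_data = json.loads(data)
except (urllib.error.URLError, ValueError):
    # ValueError covers json.JSONDecodeError and malformed URLs.
    json_data = None
if not json_data:
    print("Failed to retrieve or parse data.")
else:
    # Each entry in "comments" has a "name" and a "count".
    comments = json_data.get("comments", [])
    total_count = sum(comment.get("count", 0) for comment in comments)
    print("Retrieved", len(data), "characters")
    print("Count:", len(comments))
    print("Sum:", total_count)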
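To check the summing logic without touching the network, the parsing step can be run against a small inline string. This is a minimal sketch: the two entries mirror the names and counts in the Data Format section, rewritten as valid JSON with quoted keys, and the function name summarize_comments is made up for this example.

```python
import json

# Two entries from the Data Format section, written as valid JSON.
sample = '''
{
  "comments": [
    {"name": "Matthias", "count": 97},
    {"name": "Geomer", "count": 97}
  ]
}
'''

def summarize_comments(text):
    """Return (number of comments, sum of their counts) for a JSON string."""
    info = json.loads(text)
    comments = info.get("comments", [])
    total = sum(int(item.get("count", 0)) for item in comments)
    return len(comments), total

count, total = summarize_comments(sample)
print("Count:", count)  # Count: 2
print("Sum:", total)    # Sum: 194
```

Separating the parsing from the download makes it easy to confirm the arithmetic first and then point the same function at the text read from the assignment URL.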
Using the GeoJSON API
Calling a JSON API
In this assignment you will write a Python program somewhat similar to http://www.py4e.com/code3/geojson.py. The program will prompt for a location, contact a web service, retrieve JSON from the web service, parse that data, and retrieve the first place_id from the JSON. A place ID is a textual identifier that uniquely identifies a place within Google Maps.
API End Points
To complete this assignment, you should use this API endpoint that has a static subset of the Google data:
http://py4e-data.dr-chuck.net/json?
This API uses the same parameter (address) as the Google API. This API also has no rate limit, so you can test as often as you like. If you visit the URL with no parameters, you get a "No address..." response.
To call the API, you need to include a key= parameter and provide the address that you are requesting as the address= parameter, properly URL encoded using the urllib.parse.urlencode() function as shown in http://www.py4e.com/code3/geojson.py
Make sure to check that your code is using the API endpoint as shown above. You will get different results from the geojson and json endpoints, so make sure you are using the same endpoint as the autograder.
Test Data / Sample Execution
You can test to see if your program is working with a location of "South Federal University", which will have a place_id of "ChIJNeHD4p-540AR2Q0_ZjwmKJ8".
$ python3 solution.py
Enter location: South Federal University
Retrieving http://...
Retrieved 6052 characters
Place id ChIJNeHD4p-540AR2Q0_ZjwmKJ8
Turn In
Please run your program to find the place_id for this location:
Universitas Gadjah Mada
Make sure to enter the name and case exactly as above and enter the place_id and your Python code below. Hint: The first seven characters of the place_id are "ChIJKZd ..."
Make sure to retrieve the data from the URL specified above and not the normal Google API. Your program should work with the Google API, but the place_id may not match for this assignment.
place_id:
Python code:
import json
import urllib.error
import urllib.parse
import urllib.request

location = input("Enter location: ")
if len(location) < 1:
    print("Location not provided.")
    quit()
service_url = "http://py4e-data.dr-chuck.net/json?"
# The endpoint expects a key= parameter and a URL-encoded address= parameter.
parameters = {
    "address": location,
    "key": 42
}
url = service_url + urllib.parse.urlencode(parameters)
print("Retrieving", url)
try:
    response = urllib.request.urlopen(url)
    data = response.read().decode()
    print("Retrieved", len(data), "characters")
    try:
        json_data = json.loads(data)
        # The first entry in "results" carries the place_id.
        place_id = json_data["results"][0]["place_id"]
        print("Place id", place_id)
    except json.JSONDecodeError:
        print("Failed to parse JSON")
    except (KeyError, IndexError):
        # IndexError covers an empty "results" list.
        print("Place ID not found in JSON")
except urllib.error.URLError:
    print("Failed to retrieve data")
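The URL construction and the place_id lookup can also be exercised offline. The sketch below is illustrative only: the responses are hand-built and heavily trimmed (the real payload is far larger), the "status" fields are assumptions about its shape, and the helper names build_url and first_place_id are made up for this example.

```python
import json
import urllib.parse

SERVICE_URL = "http://py4e-data.dr-chuck.net/json?"

def build_url(address, key=42):
    """Append URL-encoded address= and key= parameters to the endpoint."""
    return SERVICE_URL + urllib.parse.urlencode({"address": address, "key": key})

def first_place_id(text):
    """Return the first place_id in a JSON response string, or None."""
    try:
        info = json.loads(text)
        return info["results"][0]["place_id"]
    except (json.JSONDecodeError, KeyError, IndexError, TypeError):
        return None

# Hand-built stand-ins for service responses; no network call is made.
fake_response = json.dumps(
    {"results": [{"place_id": "ChIJNeHD4p-540AR2Q0_ZjwmKJ8"}], "status": "OK"}
)
empty_response = json.dumps({"results": [], "status": "ZERO_RESULTS"})

print(build_url("South Federal University"))
print(first_place_id(fake_response))
print(first_place_id(empty_response))
```

Returning None for an empty "results" list or malformed text keeps the caller's error handling in one place instead of spreading try/except blocks through the script.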