Skin vs. Food Problem
Data science starts with? … Data!
So let's get some images from the web!
Getting Photos from Instagram
Get an Instagram Client ID/SECRET
Go to Instagram's developer page, create an account, and register a new application.
Just give it an Application Name, a Description, and a Website (you can use your personal website), and reuse the website in the OAuth redirect_uri field. This should create a CLIENT ID and a CLIENT SECRET for you.
Install the Instagram Python Client
https://github.com/Instagram/python-instagram
To install, run:
sudo pip install python-instagram
Fire up an ipython notebook
So, we will import python-instagram and set up an API connection. Then we will grab the images from a restaurant, say “Slanted Door”.
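A minimal sketch of these steps, assuming the python-instagram client's `InstagramAPI` class and its `location_search` and `location_recent_media` methods. The helper names `make_api` and `first_page_image_urls` are hypothetical, and the credentials are placeholders.

```python
def make_api(client_id, client_secret):
    """Create a python-instagram client.

    The import is done lazily so the helper below can be tried with any
    object that offers the same two methods.
    """
    from instagram.client import InstagramAPI
    return InstagramAPI(client_id=client_id, client_secret=client_secret)


def first_page_image_urls(api, foursquare_v2_id):
    """Return (urls, next_url) for the first page of media at a venue."""
    # Look up the Instagram location that matches the Foursquare venue ID.
    locations = api.location_search(foursquare_v2_id=foursquare_v2_id)
    location_id = locations[0].id

    # Media search: a list of media objects plus a link to the next page.
    media, next_url = api.location_recent_media(location_id=location_id)

    # Keep only the URL of the standard_resolution image of each media object.
    urls = [m.images["standard_resolution"].url for m in media]
    return urls, next_url


# Example usage (needs valid credentials and network access):
# api = make_api("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
# urls, next_url = first_page_image_urls(api, "3fd66200f964a52018ed1ee3")
```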
Let’s see what is happening here!
- First, we import and initialize the Instagram API with the client_id and the client_secret obtained when registering your app with Instagram.
- Then, we do a location search with something called a foursquare_v2_id. We do this because it is easier to find the Foursquare ID associated with a restaurant and then derive the Instagram location ID. To find the Foursquare ID, go to foursquare.com and search for the place. The URL will have the ID. For example, the Foursquare URL for “Slanted Door” was https://foursquare.com/v/slanted-door/3fd66200f964a52018ed1ee3, which explains the ID I used.
- With that location ID, we then do a media search, which gives us a bunch of media objects plus a link to the next page as next.
- Then, we cycle through the media list and get the URLs for the standard_resolution images.
- Note that the next variable contains a new max_id, which we can use to get the next page of images.
- So, in the next call, we would pass that max_id along with the location ID.
Now, you can easily loop through the pages to get all the images for this location. Note that you need to throttle the requests to Instagram to roughly one request per second. Use the time.sleep function to limit the requests.
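The paging loop could look roughly like the sketch below. It assumes that next is a URL carrying max_id as a query parameter and that `location_recent_media` accepts a `max_id` keyword. The function names and the `delay` and `max_pages` parameters are hypothetical.

```python
import time
from urllib.parse import urlparse, parse_qs


def max_id_from_next(next_url):
    """Pull max_id out of the next-page URL; None when there is no next page."""
    if not next_url:
        return None
    values = parse_qs(urlparse(next_url).query).get("max_id")
    return values[0] if values else None


def all_image_urls(api, location_id, delay=1.0, max_pages=None):
    """Collect standard_resolution URLs page by page, pausing between requests."""
    urls = []
    max_id = None
    pages = 0
    while True:
        kwargs = {"location_id": location_id}
        if max_id is not None:
            kwargs["max_id"] = max_id  # ask for the page after the previous one
        media, next_url = api.location_recent_media(**kwargs)
        urls.extend(m.images["standard_resolution"].url for m in media)
        pages += 1

        max_id = max_id_from_next(next_url)
        if max_id is None or (max_pages is not None and pages >= max_pages):
            break
        time.sleep(delay)  # throttle to roughly one request per second
    return urls
```

Setting `max_pages` to a small number is a cheap way to try the loop on one location before running it on all of them.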
Let’s try to get the URLs for the photos for these 20 restaurants:
- Slanted Door
- House of Prime Rib
- Greens Restaurant
- Mustards Grill
- Bouchon
- Sutro’s at the Cliff House
- Bistro Jeanty
- Boulevard
- Waterbar
- The Dead Fish
- Ad Hoc
- Farallon
- Wayfare Tavern
- Kokkari Estiatorio
- Zuni Cafe
- Skates on the Bay
- The Stinking Rose
- Foreign Cinema
- Kuleto’s
- Redd
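To collect the Foursquare IDs for this list, one option is to paste each venue URL into a dictionary and take the last path segment, as described for “Slanted Door” above. This is only a sketch: `foursquare_id_from_url` and `venue_urls` are hypothetical names, and only the Slanted Door URL comes from this post. The other URLs have to be looked up on foursquare.com.

```python
from urllib.parse import urlparse


def foursquare_id_from_url(url):
    """Return the last path segment of a Foursquare venue URL."""
    return urlparse(url).path.rstrip("/").split("/")[-1]


venue_urls = {
    "Slanted Door": "https://foursquare.com/v/slanted-door/3fd66200f964a52018ed1ee3",
    # Add the remaining restaurants here after looking up their venue pages.
}

# Map each restaurant name to the ID used for the Instagram location search.
venue_ids = {name: foursquare_id_from_url(url) for name, url in venue_urls.items()}
```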
Feature Selection
Finding connected blobs
SciPy has a function called label (in scipy.ndimage) which can be very useful in detecting connected pixel blobs.
Check this out:
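A small sketch of blob detection, assuming the tool meant here is `scipy.ndimage.label`. The toy mask is made up for illustration; in practice it would come from thresholding an image.

```python
import numpy as np
from scipy import ndimage

# Toy binary mask: True marks pixels of interest.
mask = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
], dtype=bool)

# label() gives every connected blob its own integer (0 is background).
# By default only horizontal and vertical neighbours count as connected.
labeled, num_blobs = ndimage.label(mask)

# Pixel count per blob, skipping the background bin.
blob_sizes = np.bincount(labeled.ravel())[1:]

print(num_blobs)   # number of connected blobs found
print(blob_sizes)  # size of each blob in pixels
```

Passing a 3x3 array of ones as the `structure` argument would make diagonal neighbours count as connected too.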