How to get images of a chosen class from ImageNet?

The question is how to get images of the chosen class #50: 'American alligator, Alligator mississipiensis' from ImageNet.

  1. Go to image-net.org.

  2. Go to "Download".

  3. Follow the instructions for "Download Image URLs":

How to download the URLs of a synset from your Browser?

1. Type a query in the Search box and click "Search" button

The alligator is not shown. ImageNet is under maintenance. Only ILSVRC synsets are included in the search results. That is no problem; the similar animal "alligator lizard" will do, since this search is only about reaching the right branch of the WordNet treemap. I do not know whether the search would lead directly to the ImageNet images even without the maintenance.

2. Open a synset page

Scrolling down:

enter image description here

Scrolling down:

enter image description here

Searching for the American alligator, a near neighbour that is also a saurian diapsid reptile:

enter image description here

3. You will find the "Download URLs" button under the left-bottom corner of the image browsing window.

This gives you all of the URLs for the chosen class. A text file opens in the browser:

http://image-net.org/api/text/imagenet.synset.geturls?wnid=n01698640

This shows that all you need is the right WordNet ID, appended to the end of the URL.
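Since only the WordNet ID changes, the URL can be assembled in code. The following is a minimal sketch: the helper name `geturls_url` and the ID check (an `n` followed by eight digits, as in `n01698640`) are assumptions made for illustration and are not part of the ImageNet API.

```python
import re

BASE_URL = "http://image-net.org/api/text/imagenet.synset.geturls?wnid="


def geturls_url(wnid):
    """Return the URL-list address for one synset (hypothetical helper)."""
    # Assumption: noun synset IDs look like "n" plus eight digits.
    if not re.fullmatch(r"n\d{8}", wnid):
        raise ValueError(f"unexpected WordNet ID: {wnid!r}")
    return BASE_URL + wnid


print(geturls_url("n01698640"))
```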

Manual image download

The text file looks as follows:

  • http://farm1.static.flickr.com/136/326907154_d975d0c944.jpg
  • http://weeksbay.org/photo_gallery/reptiles/American20Alligator.jpg
  • ...
  • till image number 1261.

As an example, the first URL links to:

enter image description here

And the second is a dead link:

enter image description here

The third link is dead, but the fourth is working.

The images of these URLs are publicly available, but many links are dead, and the pictures are of lower resolution.
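Checking links by hand does not scale to 1261 URLs. A rough way to test a single link is sketched below, under some assumptions: the function name `is_alive` is made up for this example, the server is assumed to answer HEAD requests, and a link counts as alive only if it returns status 200 with an image content type. Some hosts answer dead links with an HTML placeholder page, which this check would reject, but it cannot catch a placeholder that is itself an image.

```python
import urllib.error
import urllib.request


def is_alive(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if a HEAD request for `url` looks like a live image."""
    try:
        # Building the request can already fail for malformed URLs.
        request = urllib.request.Request(
            url, method="HEAD", headers={"User-Agent": "Mozilla/5.0"}
        )
        with opener(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            return response.status == 200 and content_type.startswith("image/")
    except (urllib.error.URLError, ValueError, OSError):
        # Dead host, HTTP error, timeout or malformed URL.
        return False
```

The `opener` parameter exists only so that the function can be tried out without network access; in normal use it can be left at its default.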

Automated image download

From the ImageNet guide again:

How to download by HTTP protocol? To download a synset by HTTP request, you need to obtain the "WordNet ID" (wnid) of a synset first. When you use the explorer to browse a synset, you can find the WordNet ID below the image window. (Click Here and search "Synset WordNet ID" to find out the wnid of "Dog, domestic dog, Canis familiaris" synset). To learn more about the "WordNet ID", please refer to "Mapping between ImageNet and WordNet".

Given the wnid of a synset, the URLs of its images can be obtained at

http://www.image-net.org/api/text/imagenet.synset.geturls?wnid=[wnid]

You can also get the hyponym synsets given wnid, please refer to API documentation to learn more.

So what is in that API documentation?

It contains everything needed to get the WordNet IDs (so-called "synset IDs") and the words of all synsets. In other words, every class name and its WordNet ID is available for free.

Obtain the words of a synset

Given the wnid of a synset, the words of the synset can be obtained at

http://www.image-net.org/api/text/wordnet.synset.getwords?wnid=[wnid]

You can also Click Here to download the mapping between WordNet ID and words for all synsets, Click Here to download the mapping between WordNet ID and glosses for all synsets.
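Once the mapping between WordNet ID and words has been downloaded, looking up a class name is a matter of parsing the file. The sketch below assumes the file is plain text with one synset per line, the WordNet ID and the words separated by a tab; that layout is an assumption, so check the file you actually receive. The function names and the two-line sample are made up for this example.

```python
def parse_wnid_words(text):
    """Parse 'wnid<TAB>words' lines into a dict (assumed file layout)."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        wnid, _, words = line.partition("\t")
        mapping[wnid.strip()] = words.strip()
    return mapping


def find_wnids(mapping, query):
    """Return the WordNet IDs whose words contain `query` (case-insensitive)."""
    query = query.lower()
    return [wnid for wnid, words in mapping.items() if query in words.lower()]


# Stand-in for the downloaded mapping file.
sample = (
    "n01698640\tAmerican alligator, Alligator mississipiensis\n"
    "n02084071\tdog, domestic dog, Canis familiaris\n"
)
mapping = parse_wnid_words(sample)
print(find_wnids(mapping, "alligator"))
```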

Once you know the WordNet IDs and class names you want, you can work with them through nltk.corpus.wordnet from NLTK (the Natural Language Toolkit); see the WordNet interface.
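As a rough illustration of that route: an ImageNet-style ID such as `n01698640` is a part-of-speech letter followed by a WordNet offset, which NLTK can look up with `synset_from_pos_and_offset`. The helper names below are made up for this sketch. The lookup assumes that NLTK is installed, that its WordNet corpus has been downloaded (for example with `nltk.download("wordnet")`), and that the corpus version matches the one the IDs were taken from.

```python
def split_wnid(wnid):
    """Split 'n01698640' into the part-of-speech letter and integer offset."""
    return wnid[0], int(wnid[1:])


def synset_for_wnid(wnid):
    """Look up the NLTK synset for an ImageNet-style WordNet ID."""
    # Imported lazily so the other helpers work without NLTK installed.
    from nltk.corpus import wordnet as wn

    pos, offset = split_wnid(wnid)
    return wn.synset_from_pos_and_offset(pos, offset)


def wnid_for_synset(synset):
    """Build the ImageNet-style ID back from an NLTK synset."""
    return f"{synset.pos()}{synset.offset():08d}"


print(split_wnid("n01698640"))
```

With the corpus available, `synset_for_wnid("n01698640").lemma_names()` would list the words of the synset.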

In our case, we only need the images of class #50: 'American alligator, Alligator mississipiensis', and we already know its WordNet ID, so we can leave nltk.corpus.wordnet aside (see tutorials or Stack Exchange questions for more). We can automate the download of all alligator images by looping through the URLs and keeping those that are still alive. The same loop could be widened to all WordNet IDs, of course, but that would take far too long for the whole treemap. It is also not recommended, since the images may disappear if thousands of people download them every day.

I will not take the time here to write Python code that accepts the ImageNet class number "#50" as its argument, though that should be possible as well, using mapping tables from WordNet to ImageNet. The class name and WordNet ID should be enough.
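For readers who do want to start from the class number, the lookup could be sketched roughly as below. It assumes a class-index table in JSON that maps the class number (as a string) to a pair of WordNet ID and label; that layout, the function name `wnid_for_class` and the one-entry sample are assumptions for illustration, not a file provided by ImageNet.

```python
import json


def wnid_for_class(class_index, class_number):
    """Return (wnid, label) for an ImageNet class number (assumed table layout)."""
    wnid, label = class_index[str(class_number)]
    return wnid, label


# Stand-in for a real class-index file; only the entry from this post is shown.
SAMPLE_JSON = '{"50": ["n01698640", "American alligator, Alligator mississipiensis"]}'
class_index = json.loads(SAMPLE_JSON)

print(wnid_for_class(class_index, 50))
```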

For a single WordNet ID, the code could be as follows:

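import csv
import urllib.request

wnid = "n01698640"
url = "http://image-net.org/api/text/imagenet.synset.geturls?wnid=" + wnid
max_urls = 9  # only try the first few URLs while testing

# Send a browser-like User-Agent header with the request.
# From https://stackoverflow.com/a/45358832/6064933
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
with open(wnid + ".csv", "wb") as f:
    with urllib.request.urlopen(req) as r:
        f.write(r.read())

# Keep the list outside the loop so that it is not reset for every URL.
failed = []
with open(wnid + ".csv", "r", errors="replace") as f:
    for counter, line in enumerate(f, start=1):
        if counter > max_urls:
            break
        image_url = line.strip()  # drop the trailing newline before requesting
        print(image_url)
        try:
            with urllib.request.urlopen(image_url, timeout=10) as r2:
                data = r2.read()
            with open(f"{wnid}_{counter:05}.jpg", "wb") as f2:
                f2.write(data)
        except Exception:
            # Dead link, timeout or malformed URL: log it and move on.
            failed.append([f"{counter:05}", image_url])

# One row per failed download: image number, URL.
with open(wnid + "_failed.csv", "w", newline="") as f3:
    writer = csv.writer(f3)
    writer.writerows(failed)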

Result:

enter image description here

  1. If you need the images behind the dead links as well, in original quality, and your project is non-commercial, you can sign in; see "How do I get a copy of the images?" in the Download FAQ.
  • The URL above ends in wnid=n01698640, the WordNet ID that is mapped to ImageNet.
  • Alternatively, in the "Images of the Synset" tab, just click on "Wordnet IDs".

To get to:

enter image description here

or right-click -- save as:

enter image description here

You can use the WordNet id to get the original images.

For a commercial project, I would suggest contacting the ImageNet team.


Add-on

Taking up the idea from a comment: if you do not want many images, but just a single image that represents the class as well as possible, have a look at Visualizing GoogLeNet Classes and try that method with the ImageNet images instead. It uses the deepdream code as well.

Visualizing GoogLeNet Classes

1 July 2015

Ever wondered what a deep neural network thinks a Dalmatian should look like? Well, wonder no more.

Recently Google published a post describing how they managed to use deep neural networks to generate class visualizations and modify images through the so called “inceptionism” method. They later published the code to modify images via the inceptionism method yourself, however, they didn’t publish code to generate the class visualizations they show in the same post.

While I never figured out exactly how Google generated their class visualizations, after butchering the deepdream code and this ipython notebook from Kyle McDonald, I managed to coach GoogLeNet into drawing these:

enter image description here

... [with many other example images to follow]