Error installing NLTK supporting packages: nltk.download()

I have installed the nltk package. I am now trying to download the supporting packages using nltk.download(), and I am getting this error:

[Errno 11001] getaddrinfo

My machine / software details are:

OS: Windows 8.1
Python: 3.3.4
NLTK Package: 3.0

Below are the commands I ran in Python:

Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:13:51) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.

>>> import nltk
>>> nltk.download()
showing info http://nltk.github.com/nltk_data/
True
>>> nltk.download("all")
[nltk_data] Error loading all: <urlopen error [Errno 11001]
[nltk_data]     getaddrinfo failed>
False


It looks like it is going to http://nltk.github.com/nltk_data/, whereas it should ideally try to get the data from http://www.nltk.org/nltk_data/.

On another machine, when we type http://nltk.github.com/nltk_data/ in the browser, it redirects to http://www.nltk.org/nltk_data/. I do not understand why the redirection is not happening on my laptop.

I feel that this might be the issue.
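
One way to check whether name resolution is the problem, as a minimal sketch: errno 11001 from getaddrinfo is the Windows "host not found" error, so the snippet below simply asks the operating system to resolve each hostname. The helper name can_resolve is made up for illustration; the two hostnames are taken from the URLs above.

```python
import socket


def can_resolve(hostname, port=80):
    """Return True if the OS can resolve hostname, False otherwise."""
    try:
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        # Same kind of failure that urlopen reports as "getaddrinfo failed".
        return False
    return True


if __name__ == "__main__":
    for host in ("nltk.github.com", "www.nltk.org"):
        status = "resolves" if can_resolve(host) else "does NOT resolve"
        print(host, status)
```

If neither hostname resolves, the problem is more likely DNS or a proxy on the laptop than the redirect itself.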

Kindly help.

I have added a screenshot of the command prompt.


Regards, Bonson


Try the code below. It downloaded the packages as expected for me.

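import nltk
import ssl

# Workaround: make HTTPS requests use an unverified SSL context so that
# certificate errors do not block the download. Note that this disables
# certificate verification for the whole Python process.
try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    # Legacy Python that doesn't verify HTTPS certificates by default.
    pass
else:
    ssl._create_default_https_context = _create_unverified_https_context

nltk.download()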

It looks like the download link was failing before, and the SSL workaround fixed it.

Note: this was on a Mac.


I got this error because of a network restriction. Here is how I solved it:

I browsed to http://www.nltk.org/nltk_data/ and downloaded the required corpora from the corresponding links.

Then I placed the downloaded files under C:/ on Windows (or another relevant directory such as C:/ProgramData/Anaconda3), keeping the same folder structure as in https://github.com/nltk/nltk_data/tree/gh-pages/packages
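
The manual steps above can be sketched in Python. This is only an illustration under assumptions: the helper name install_nltk_zip and the example paths in the comments are hypothetical, and it assumes each downloaded archive contains a single top-level folder named after the package. The directories that NLTK actually searches on your machine can be listed with nltk.data.path.

```python
import zipfile
from pathlib import Path


def install_nltk_zip(zip_path, category, data_dir):
    """Unpack a manually downloaded package into data_dir/category/.

    category mirrors the sub-folder under packages/ on GitHub,
    for example "corpora" or "tokenizers".
    """
    target = Path(data_dir) / category
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumed layout: the zip holds one folder named after the package.
        archive.extractall(target)
    return target


# Example with hypothetical paths (adjust to your machine):
# install_nltk_zip("C:/Downloads/stopwords.zip", "corpora", "C:/nltk_data")
```

After unpacking, a call such as nltk.data.find("corpora/stopwords") should succeed if the data directory is one of those in nltk.data.path.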


Got the solution. In my case, when the NLTK downloader started, its server index was set to http://nltk.github.com/nltk_data/

This needs to be changed to http://nltk.org/nltk_data/

You can change this in the NLTK Downloader window via File -> Change Server Index.
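
The same change can probably be made without the GUI. A minimal sketch, assuming your NLTK version's nltk.downloader.Downloader accepts a server_index_url argument; the helper name download_from_index and the package id "punkt" are just examples, and the default URL is the one given above:

```python
def download_from_index(package_id, index_url="http://nltk.org/nltk_data/"):
    """Download one package using an explicit server index URL."""
    # Imported lazily so the helper can be defined even if NLTK is missing.
    from nltk.downloader import Downloader

    downloader = Downloader(server_index_url=index_url)
    # download() returns True on success and False on failure.
    return downloader.download(package_id)


# Example (needs network access):
# download_from_index("punkt")
```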

Regards, Bonson


Setting the HTTP and HTTPS proxies in environment variables resolved the issue for me:

set http_proxy=http://IPN:PWD@ipaddress:port
set https_proxy=https://IPN:PWD@ipaddress:port

Ask your network or admin team for the proxy IP address.
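
The same variables can also be set from inside Python before calling nltk.download(). A minimal sketch, assuming the downloader goes through urllib, which reads http_proxy and https_proxy from the environment; the helper name set_proxy_env and the proxy URL are placeholders, not real values:

```python
import os
import urllib.request


def set_proxy_env(http_proxy, https_proxy=None):
    """Set proxy variables for this process and return what urllib sees."""
    os.environ["http_proxy"] = http_proxy
    # Assumption: reuse the HTTP proxy for HTTPS unless one is given.
    os.environ["https_proxy"] = https_proxy or http_proxy
    return urllib.request.getproxies()


# Placeholder address; replace with the value from your network team.
proxies = set_proxy_env("http://user:password@proxy.example.invalid:8080")
print(proxies.get("https"))
```

This needs to run before the first download attempt in the same Python session. NLTK also provides nltk.set_proxy() for the same purpose.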