Corpora/stopwords not found when importing the nltk library
I am trying to import the nltk package in Python 2.7:
import nltk
stopwords = nltk.corpus.stopwords.words('english')
print(stopwords[:10])
Running this gives me the following error:
LookupError:
**********************************************************************
Resource 'corpora/stopwords' not found. Please use the NLTK
Downloader to obtain the resource: >>> nltk.download()
So I opened my Python terminal and did the following:
import nltk
nltk.download()
Which gives me:
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
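However, this does not seem to finish, and running my script again still gives me the same error. Any thoughts on where this goes wrong?
Calling nltk.download() with no arguments opens the interactive NLTK downloader for the whole data collection, so it waits for your input and does not finish on its own, and fetching every item from there can take a long time. You can instead download only the stopwords that you need: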
import nltk
nltk.download('stopwords')
Or from command line (thanks to Rafael Valero's answer):
python -m nltk.downloader stopwords
Reference:
- Installing NLTK Data - Command line installation
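If you would rather not run the download by hand, the two steps above can be combined so the corpus is fetched only when it is missing. This is a rough sketch rather than an official NLTK recipe: the helper names `load_with_download_fallback` and `load_stopwords` are made up for illustration, and it assumes nltk is installed and the machine can reach the download server.

```python
def load_with_download_fallback(load, download):
    """Call load(); if the resource is missing, call download() once and retry.

    Both arguments are zero-argument callables. This helper is hypothetical,
    not part of NLTK.
    """
    try:
        return load()
    except LookupError:
        # NLTK raises LookupError when a resource such as corpora/stopwords
        # has not been downloaded yet.
        download()
        return load()


def load_stopwords(language="english"):
    """Return the stopword list, downloading only the stopwords corpus if needed."""
    import nltk  # imported here so the helper above has no dependency on nltk

    return load_with_download_fallback(
        load=lambda: nltk.corpus.stopwords.words(language),
        download=lambda: nltk.download("stopwords"),
    )
```

With this in place, `print(load_stopwords()[:10])` can replace the original snippet. The first call triggers the small stopwords download if it is needed; later calls read the local copy.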
The same as mentioned here by Kurt Bourbaki, but from the command line:
python -m nltk.downloader stopwords