Hi,
I am trying to run the Stanford parser example, e.g.:

```python
from nltk.parse.stanford import *
dep_parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
[parse.tree() for parse in dep_parser.raw_parse("The quick brown fox jumps over the lazy dog.")]
```

Executing the last command results in an error:

```
OSError: Java command failed : [u'/usr/bin/java', u'-mx1000m', '-cp', ....
```

When I reproduce the same command on the command line, I get the error `Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory`.
Therefore, after adding slf4j-api.jar to the classpath _on the command line_, parsing is successful.
How can slf4j-api.jar be added to the NLTK classpath, so that parsing will be successful?
Thank you!
Happy holidays
@yuvval Just to be sure, are you using Stanford Parser version 2015-12-09? If so, this error occurs because the new Stanford tools use more dependencies than before. This is similar to #1237.
You would have to wait a while until #1237 is fixed and NLTK catches up with the Stanford tools.
A quick fix is to add every jar shipped with the Stanford distribution to the parser's classpath:

```python
from nltk.internals import find_jars_within_path
from nltk.parse.stanford import StanfordDependencyParser

st = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
stanford_dir = st._stanford_jar.rpartition('/')[0]
# or on Windows, comment the line above and uncomment the one below:
# stanford_dir = st._stanford_jar.rpartition("\\")[0]
stanford_jars = find_jars_within_path(stanford_dir)
st._stanford_jar = ':'.join(stanford_jars)

[parse.tree() for parse in st.raw_parse("The quick brown fox jumps over the lazy dog.")]
```
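An alternative that avoids touching the private `_stanford_jar` attribute is to put every Stanford jar (slf4j-api.jar included) on the `CLASSPATH` environment variable before NLTK launches the JVM. A minimal sketch, where the install directory is a hypothetical placeholder you would replace with your own path:

```python
import os

# Hypothetical install location -- substitute wherever the Stanford jars live.
stanford_dir = "/usr/local/stanford-parser-full-2015-12-09"

jars = []
if os.path.isdir(stanford_dir):
    # Collect every jar in the directory, slf4j-api.jar included.
    jars = [os.path.join(stanford_dir, name)
            for name in os.listdir(stanford_dir)
            if name.endswith(".jar")]

# Prepend the jars so NLTK's Java invocation can resolve the new dependencies.
existing = os.environ.get("CLASSPATH", "")
os.environ["CLASSPATH"] = os.pathsep.join(jars + ([existing] if existing else []))
```

The variable must be set before the parser object is constructed, since NLTK reads `CLASSPATH` when it builds the Java command.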
Thank you! It works with the 2015-04-20 version.
Did the classpath hack also work?
I didn't try - I just deleted the latest version and downloaded the 2015-04-20 version.
Hi! I tried to follow your hack, but for me there is no `StanfordDependencyParser`:

```python
print(nltk.__version__)
from nltk.tag import StanfordDependencyParser
```

```
3.1
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-7-67bb74c3494a> in <module>()
----> 1 from nltk.tag import StanfordDependencyParser

ImportError: cannot import name 'StanfordDependencyParser'
```

Any idea how to solve this? I would really like to use the latest Stanford version.
@methodds Pardon my typo, it's `from nltk.parse.stanford import StanfordDependencyParser`. Please see https://gist.github.com/alvations/e1df0ba227e542955a8a for detailed explanations.
Thank you for the link. Unfortunately, I can't get the environment variables to work on my Linux Mint OS.
My `.bashrc` looks like this:

```shell
export JAVA_HOME="/usr/lib/jvm/java-8-oracle/"
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH="/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/stanford-postagger.jar:$CLASSPATH"
export CLASSPATH="/home/cs/stanford_nlp/stanford-ner-2015-04-20/stanford-ner.jar:$CLASSPATH"
export STANFORD_MODELS="/home/cs/stanford_nlp/stanford-ner-2015-04-20/classifiers:$STANFORD_MODELS"
export STANFORD_MODELS="/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/models:$STANFORD_MODELS"
```

Echoing the variables looks right:

```shell
$ echo $CLASSPATH
/home/cs/stanford_nlp/stanford-ner-2015-04-20/stanford-ner.jar:/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/stanford-postagger.jar
$ echo $STANFORD_MODELS
/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/models:/home/cs/stanford_nlp/stanford-ner-2015-04-20/classifiers
```
However (even after rebooting), NLTK still does not find the tagger:

```python
from nltk.tag.stanford import StanfordPOSTagger
st = StanfordPOSTagger('english-bidirectional-distsim.tagger')
st.tag('What is the airspeed of an unladen swallow ?'.split())
```

```
NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH
environment variable.
```
Do `source .bashrc` and it will work. Meanwhile, take a look at http://apple.stackexchange.com/questions/12993/why-doesnt-bashrc-run-automatically to learn how `.bashrc` works.
Thank you for your tip, but I did `source .bashrc` beforehand and it did not work. I tried it again, and unfortunately it's still not working.
What is your Linux distribution and version? Can you do a `lsb_release -a`? Or are you working with a Mac?
Thank you for investigating. `lsb_release -a` returns:

```
No LSB modules are available.
Distributor ID:	LinuxMint
Description:	Linux Mint 17.3 Rosa
Release:	17.3
Codename:	rosa
```
Where did you run the `export` commands? Which directory? Go to the place where you want to run your Python script, start Python, and do this: `import os; print os.environ`.
Then go to your home directory, start Python, and do the same: `import os; print os.environ`.
Do you see that the two sets of environment variables differ?
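The same check can be narrowed to the one variable in question. This is just an illustrative sketch (plain standard library, nothing NLTK-specific):

```python
import os

# Is CLASSPATH visible to this Python process? If the shell that launched
# Python never sourced the exports, the lookup returns None.
classpath = os.environ.get("CLASSPATH")
if classpath is None:
    print("CLASSPATH is not set for this process")
else:
    print("CLASSPATH =", classpath)
```

Running this from both directories shows immediately whether the exports reached the process.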
I guess you wanted me to use `import os; print(os.environ)`, which did not reveal the environment variables that I exported in `.bashrc`. After that, I copy-pasted the content into `.profile` (in my home folder) and now it works perfectly. I have no idea why, though =D
Glad that `.profile` works; I think it's an OS distro issue. I would not recommend storing the environment variables statically. Personally, I re-export them every time I start my Python scripts, so that I can be sure there are no conflicts. Have fun with the NLTK API and Stanford tools!
Thank you :)
what is 'st' in the command 'stanford_dir = st._stanford_jar.rpartition('/')[0]'
I have the same question as hansen7
For the few who have been looking for what `st` is:

```python
st = StanfordNERTagger(os.environ.get('STANFORD_MODELS'))
```

Ref: https://gist.github.com/manashmndl/810db10809cbc1209b34c7d25efe95d5#file-stanfordnertagger-py