NLTK: OSError: Java command failed when using the Stanford parser example

Created on December 25, 2015 · 18 comments · Source: nltk/nltk

Hi,

I am trying to run the Stanford parser example, e.g.

from nltk.parse.stanford import StanfordDependencyParser
dep_parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
[parse.tree() for parse in dep_parser.raw_parse("The quick brown fox jumps over the lazy dog.")]

When I execute the last command, I get an error:

OSError: Java command failed : [u'/usr/bin/java', u'-mx1000m', '-cp', ....

When I reproduce the same command from the command line, I get the error Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory.

So if I add slf4j-api.jar to the classpath _from the command line_, the parsing succeeds.

How can I add slf4j-api.jar to the NLTK classpath so that the parsing succeeds?

Thanks!
Happy holidays

@yuvval Please check whether you are using Stanford Parser version 2015-12-09. If so, this error is caused by the new StanfordNLP release, which uses more dependencies than before. This is similar to #1237.

You will have to wait a while until #1237 is fixed and NLTK catches up with the Stanford tools.

The quick-fix solutions are either:

  1. Use the previous version 2015-04-20 from http://nlp.stanford.edu/software/stanford-parser-full-2015-04-20.zip, with which the NLTK API works (see http://stackoverflow.com)
  2. Hack the Stanford parser classpath:
from nltk.internals import find_jars_within_path
from nltk.parse.stanford import StanfordDependencyParser
dep_parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
# `st` is not defined in this snippet; it refers to a Stanford wrapper object
# created beforehand (see the question about it further down the thread).
stanford_dir = st._stanford_jar.rpartition('/')[0]
# On Windows, comment out the line above and uncomment the one below:
#stanford_dir = st._stanford_jar.rpartition("\\")[0]
# Collect every jar in that directory and point the wrapper at all of them.
stanford_jars = find_jars_within_path(stanford_dir)
st._stanford_jar = ':'.join(stanford_jars)
[parse.tree() for parse in dep_parser.raw_parse("The quick brown fox jumps over the lazy dog.")]
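
The hack above hard-codes '/' and ':' as separators, which is why it needs a separate Windows line. The same idea can be sketched with only the standard library so that both separators come from the platform. This is an illustration under assumptions, not NLTK code: the helper name `build_classpath` is hypothetical, and it only approximates what `find_jars_within_path` followed by the join appears to do.

```python
import os
import tempfile


def build_classpath(jar_path):
    """Return a classpath string covering every .jar found next to jar_path.

    Hypothetical helper for illustration only; it is not an NLTK function.
    """
    # os.path.dirname handles the platform's own path separator,
    # so no Windows-specific branch is needed.
    jar_dir = os.path.dirname(os.path.abspath(jar_path))
    jars = []
    # Walk the directory tree so dependency jars in subfolders are included.
    for root, _dirs, files in os.walk(jar_dir):
        for name in files:
            if name.endswith(".jar"):
                jars.append(os.path.join(root, name))
    # os.pathsep is ':' on Linux/macOS and ';' on Windows.
    return os.pathsep.join(sorted(jars))


if __name__ == "__main__":
    # Demo with empty placeholder files instead of a real Stanford download.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("stanford-parser.jar", "slf4j-api.jar", "README.txt"):
            open(os.path.join(tmp, name), "w").close()
        print(build_classpath(os.path.join(tmp, "stanford-parser.jar")))
```

The resulting string would then be assigned to the wrapper object in place of the ':'.join(...) line, assuming the wrapper accepts a joined classpath there as the hack implies.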

Thanks! It works with the 2015-04-20 version.

Did the classpath hack work as well?

I didn't try it. I just deleted the latest version and downloaded the 2015-04-20 version.

Hi! I tried to follow your hack, but I don't have `StanfordDependencyParser`.

import nltk
print(nltk.__version__)
from nltk.tag import StanfordDependencyParser

3.1
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-7-67bb74c3494a> in <module>()
----> 1 from nltk.tag import StanfordDependencyParser

ImportError: cannot import name 'StanfordDependencyParser'

Any ideas on how to fix this? I would like to use the latest Stanford version.

@methodds Pardon the typo. It should be from nltk.parse.stanford import StanfordDependencyParser. See https://gist.github.com/alvations/e1df0ba227e542955a8a for a detailed explanation.

Thanks for the link. Unfortunately, I can't get the environment variables to work on my Linux Mint OS.

My bashrc looks like this:

export JAVA_HOME="/usr/lib/jvm/java-8-oracle/"
export PATH=$JAVA_HOME/bin:$PATH

export CLASSPATH="/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/stanford-postagger.jar:$CLASSPATH"

export CLASSPATH="/home/cs/stanford_nlp/stanford-ner-2015-04-20/stanford-ner.jar:$CLASSPATH"

export STANFORD_MODELS="/home/cs/stanford_nlp/stanford-ner-2015-04-20/classifiers:$STANFORD_MODELS"

export STANFORD_MODELS="/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/models:$STANFORD_MODELS"

When I echo the variables, they look correct:

echo $CLASSPATH
/home/cs/stanford_nlp/stanford-ner-2015-04-20/stanford-ner.jar:/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/stanford-postagger.jar

echo $STANFORD_MODELS
/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/models:/home/cs/stanford_nlp/stanford-ner-2015-04-20/classifiers

But (even after a reboot) NLTK still can't find the tagger:

from nltk.tag.stanford import StanfordPOSTagger
st = StanfordPOSTagger('english-bidirectional-distsim.tagger')
st.tag('What is the airspeed of an unladen swallow ?'.split())

NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH
environment variable.

It should work once you run source .bashrc. In the meantime, take a look at how bashrc works at http://apple.stackexchange.com/questions/12993/why-doesnt-bashrc-run-automatically.

Thanks for the tip. However, I had already run source .bashrc beforehand and it did not work. I tried again, but unfortunately it still doesn't work.

What is your Linux distribution and version? Can you run lsb_release -a? Or are you working on a Mac?

Thanks for looking into it. `lsb_release -a` returns:

No LSB modules are available.
Distributor ID: LinuxMint
Description:    Linux Mint 17.3 Rosa
Release:    17.3
Codename:   rosa

  • Where did you run the export commands? In which directory?
  • Where are you running your Python script? In which directory?

Go to the location where you want to run the Python script and do the following: import os; print(os.environ)

Then go to your home directory, start python, and do the same: import os; print(os.environ)

Do you see any difference between the two sets of environment variables?

I think you wanted me to use import os; print(os.environ), which did not show the environment variables exported in .bashrc. After that, I copied the contents into .profile (which is in my home folder), and now it works perfectly. I don't know why =D
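
The os.environ check that surfaced the problem can be narrowed to the two variables NLTK complained about. Below is a minimal sketch using only the standard library. The function name `report_stanford_env` is made up for illustration; it reports what the current Python process actually sees and whether each listed path exists.

```python
import os


def report_stanford_env(names=("CLASSPATH", "STANFORD_MODELS"), environ=None):
    """Map each variable name to a list of (path, exists) pairs.

    Hypothetical diagnostic helper; it only reads the environment.
    """
    environ = os.environ if environ is None else environ
    report = {}
    for name in names:
        value = environ.get(name, "")
        # Skip empty entries, e.g. the trailing one that "...:$CLASSPATH"
        # leaves behind when CLASSPATH was previously unset.
        entries = [p for p in value.split(os.pathsep) if p]
        report[name] = [(p, os.path.exists(p)) for p in entries]
    return report


if __name__ == "__main__":
    for name, entries in report_stanford_env().items():
        if not entries:
            print(f"{name}: not set in this process")
        for path, exists in entries:
            print(f"{name}: {path} ({'found' if exists else 'MISSING'})")
```

Running it once from the shell and directory used to launch the failing script, and once from a setup that works, should show whether the exports ever reached the Python process.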

Glad that .profile works. It looks like an OS distribution issue. It is better not to store the environment variables statically. Personally, I rerun the exports every time I start a Python script, so I can make sure there are no conflicts. Have fun with the NLTK API and the Stanford tools!
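
One way to follow that advice is to set the variables inside the script itself, before any Stanford wrapper is constructed, so the result does not depend on which shell startup file was read. This is a minimal sketch, assuming the directory layout from the .bashrc earlier in the thread; the helper name `prepend_env_path` is hypothetical.

```python
import os


def prepend_env_path(name, path, environ=None):
    """Put `path` at the front of the path-list variable `name`, without duplicates.

    Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    current = [p for p in environ.get(name, "").split(os.pathsep) if p]
    if path in current:
        current.remove(path)
    environ[name] = os.pathsep.join([path] + current)
    return environ[name]


if __name__ == "__main__":
    # Paths copied from the .bashrc shown earlier; adjust to the local install.
    base = "/home/cs/stanford_nlp"
    prepend_env_path("CLASSPATH", base + "/stanford-postagger-full-2015-04-20/stanford-postagger.jar")
    prepend_env_path("STANFORD_MODELS", base + "/stanford-postagger-full-2015-04-20/models")
    print(os.environ["CLASSPATH"])
    print(os.environ["STANFORD_MODELS"])
```

These assignments presumably need to run before the tagger or parser object is created, since the "NLTK was unable to find stanford-postagger.jar" error above is raised at that point.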

Thank you :)

In the command 'stanford_dir = st._stanford_jar.rpartition('/')[0]', what is 'st'?

I have the same question as hansen7.

For the few people who are looking for what st is:
st = StanfordNERTagger(os.environ.get('STANFORD_MODELS'))
Reference: https://gist.github.com/manashmndl/810db10809cbc1209b34c7d25efe95d5#file-stanfordnertagger-py